The last article covered the closed-loop architecture. The most common question wasn't "how do agents execute tasks" - it was "how do your agents feel like they have personalities?" This one tears apart the entire character design system: from a JSON role card to evolving voices, shifting relationships, real-data RPG stats, and 3D avatars. All code and configs included.
---
Starting Point: What's an Agent Without a Role Card?
A Claude instance with a system prompt.
You tell it "you're a social media manager" and it writes tweets. But run six of these together and you'll notice:
They all sound the same
They don't know what they're not allowed to do
They have no relationships - who works well together, who clashes, is pure luck
They never change behavior from accumulated experience
A role card fixes all of this. It's not a one-liner like "you are X" - it's a complete job description + discipline manual + escalation protocol.
What a Role Card Actually Looks Like
Each agent in my system has a 6-layer role card: domain, inputs, outputs, hardBans, escalation, and metrics. Here's Xalt (Social Media Director) as a full example:
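This is a reconstructed sketch - the six layers match what this section describes, but the exact field names in the repo may differ:

```ts
// Xalt's role card - a reconstruction. Field names are illustrative;
// the bans and layers are the ones quoted in this article.
const xaltRoleCard = {
  id: 'xalt',
  title: 'Social Media Director',
  domain: 'Owns social distribution: tweet drafts, threads, posting cadence.',
  inputs: ['quill.drafts', 'sage.research_briefs'],   // upstream agents (illustrative)
  outputs: ['minion.approval_queue'],                 // downstream consumer (illustrative)
  hardBans: [
    'No direct posting',                   // everything goes through review
    'No made-up numbers',                  // never invent engagement stats
    'No internal formats or tool traces',  // no [tool:...] or /tmp/ paths in tweets
  ],
  escalation: 'Pricing, partnerships, or anything crisis-adjacent goes to Minion.', // illustrative
  metrics: ['engagement_rate_30d', 'impressions', 'drafts_accepted'], // rendered as the Skills panel
};
```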
---
Every layer does one thing: shrink the agent's behavior space. Domain says it only owns social distribution. Inputs/Outputs define who it takes from and delivers to. hardBans draw a red line. Escalation tells it when to stop making decisions. Metrics define its KPIs - these show up as the Skills panel in the RPG UI.
hardBans Matter More Than Skills
This is the core thesis of the entire article.
You don't need to teach an LLM how to write tweets - Claude, GPT, Gemini are all smart enough. Give them context and they'll deliver. What you need to tell them is what they must never do.
Without "No direct posting" → Xalt calls the Twitter API directly, skipping all approval flows.
Without "No made-up numbers" → it invents "engagement rate increased 340%" in a tweet.
Without "No internal formats or tool traces" → it leaks `[tool:crawl_result path=/tmp/...]` into published tweets.
Every ban exists because it happened before.
The Ban Comparison Table
Minion (Chief of Staff) → No deploys without approval. Highest privilege - one bad deploy takes down the site.
Sage (Research) → No made-up citations. A researcher fabricating data poisons the entire pipeline.
Scout (Growth) → No unverified comparisons. "We're 3x faster than competitors" - where's the data?
Quill (Creative) → No inventing facts. Creativity can be wild, facts cannot.
Xalt (Social) → No direct posting. Social media is the public face - must go through review.
Observer (Ops) → No blame or personal attacks. An auditor attacking individuals breaks the team.
Notice a pattern: almost every agent has some form of "No internal formats or tool traces" ban (Observer is the exception - its output is internal audit reports, not user-facing). This isn't because one agent messed up - it's because most agents will leak internal paths in their output unless you explicitly forbid it.
How to write bans for your own agents: Don't think "what should it do?" Think "if it screws up, what's the worst case?" Then write bans targeting those worst cases.
---
Making the Role Card Talk - Voice Directives
The role card defines what an agent does and doesn't do. But when agents talk to each other, you also need them to sound different.
Each agent gets a separate personality directive:
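A sketch of what Xalt's directive looks like - the quoted rules appear later in this section; the wrapper structure is my assumption:

```ts
const xaltVoiceDirective = {
  agent: 'xalt',
  tone: 'fast, punchy, ship-first', // illustrative
  rules: [
    'Every message must include 1 specific fact + 1 action.',
    "Never say 'aligned' or 'sounds good' - take a position or challenge one.",
    "Challenge Sage's caution.",
  ],
};
```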
Key design decisions:
Every directive has RULES - not "please try to," but hard requirements. Most agents must include "1 specific fact + 1 action" in every message, which kills filler like "sounds good" and "I agree." Observer is slightly different — it only requires "1 specific number or metric," because an auditor's job is to produce evidence, not direct action.
Conflict is written in - Sage's directive says "you often disagree with Xalt's impulsive takes," Xalt's says "challenge Sage's caution." This makes conversations naturally generate tension.
Micro-bans live inside directives - "Never say 'aligned' or 'sounds good' - take a position or challenge one" (Xalt) and "Never say 'interesting' or 'aligned' without following up with evidence" (Sage). These aren't in the role card's hardBans - they're conversation-level constraints.
Personalities Evolve
This is the most interesting part - agent personality isn't static. It changes as memories accumulate.
Say Xalt has posted 50 tweets and accumulated 10 engagement-related lessons → next conversation, its prompt gets "You reference outcomes and avoid repeating mistakes" and "You've developed expertise in engagement." It naturally starts saying things like "last time that format underperformed."
Why use rules instead of letting the LLM decide personality changes?
$0 cost - no extra LLM calls
Deterministic - rules produce predictable results, no "personality jumps"
Debuggable - wrong modifier? Check thresholds and memory data directly
These modifiers are computed once before each conversation, cached for 6 hours.
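Here's a sketch of that rule layer, assuming thresholds that mirror the Xalt example above (the real rule set is larger):

```ts
// Deterministic: plain threshold checks over memory counts, no LLM call.
// Which threshold produces which string is my reading of the example above.
function personalityModifiers(m: { tweets: number; engagementLessons: number }): string[] {
  const mods: string[] = [];
  if (m.engagementLessons >= 10) mods.push('You reference outcomes and avoid repeating mistakes.');
  if (m.tweets >= 50) mods.push("You've developed expertise in engagement.");
  return mods; // computed once per conversation, cached for 6 hours
}
```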
---
Who Gets Along - The Affinity Matrix
6 agents = 15 pairwise relationships. Each has an affinity score (0.10–0.95):
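The full matrix lives in config; here's the shape, with the one value quoted below (the other 14 pairs omitted):

```ts
// Pair keys are sorted agent ids; every score sits in [0.10, 0.95].
const pairKey = (a: string, b: string) => [a, b].sort().join('|');

const affinity: Record<string, number> = {
  'brain|xalt': 0.2, // the data-first vs. ship-first pair discussed below
  // ...14 more pairs
};
```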
Low affinity is intentional. brain↔xalt is only 0.2 - one is "show me the data or we're done" and the other is "ship it first, analyze later." Every conversation between them generates friction, but that friction produces the best insights.
What Affinity Controls
Speaking order - high-affinity agents are more likely to speak after each other
Conversation tone - low-affinity pairs → 25% chance of a direct challenge instead of polite discussion (see `pickInteractionType` below)
Conflict resolution - the system picks from a preset list of high-tension pairs (brain↔xalt, opus↔xalt, xalt↔observer) for conflict_resolution conversations
Mentoring - drawn from a preset list of mentor pairs (opus↔growth, brain↔creator) for mentoring conversations
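Here's a sketch of `pickInteractionType` - the 25% challenge chance is from the list above; the low-affinity cutoff is my assumption:

```ts
type InteractionType = 'discussion' | 'challenge';

function pickInteractionType(affinity: number): InteractionType {
  const lowAffinity = affinity < 0.4; // cutoff assumed
  return lowAffinity && Math.random() < 0.25 ? 'challenge' : 'discussion';
}
```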
Relationships Shift
After each conversation, the memory extraction LLM call also outputs relationship changes - no extra API call:
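The result carries a couple of extra fields - this shape is my assumption, not the repo's exact schema:

```ts
const extractionResult = {
  memories: [/* ...extracted lessons... */],
  relationshipChanges: [
    { pair: 'brain|xalt', delta: -0.02, reason: 'disagreed on posting cadence' }, // illustrative
  ],
};
```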
Drift rules are strict (a code sketch follows this list):
Max drift per conversation: ±0.03 (one argument doesn't turn colleagues into enemies)
Floor: 0.10 (even at the worst, they can still talk)
Ceiling: 0.95 (even at the best, maintain healthy distance)
Last 20 drift records kept (so you can trace how a relationship got to where it is)
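Those four rules in code - the constants are from the list above; the names are mine:

```ts
const clamp = (v: number, lo: number, hi: number) => Math.min(hi, Math.max(lo, v));

function applyDrift(current: number, delta: number, history: number[]) {
  const step = clamp(delta, -0.03, 0.03);         // one argument can't nuke a relationship
  const next = clamp(current + step, 0.10, 0.95); // floor and ceiling
  const drifts = [...history, step].slice(-20);   // keep the last 20 drift records
  return { next, drifts };
}
```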
---
Turning Data Into RPG Stats
At this point, agents have role cards, personalities, and relationships. But it's all text and numbers - users can't see it.
Solution: map real database metrics to RPG stat bars.
6 Attributes
VRL (Viral) - Avg engagement rate (30d) × 1000. Higher engagement = higher score.
SPD (Speed) - Global step completion time (faster = higher). Currently system-level, same value for all agents.
RCH (Reach) - log-normalized total impressions. More eyeballs = higher score.
TRU (Trust) - Mission success rate × avg affinity × 2. Completion rate + how well-liked you are.
WIS (Wisdom) - log(memory count) × avg confidence. More accumulated knowledge = higher.
CRE (Creative) - Draft output × acceptance rate. Produce more + get approved more.
The Formulas (Real Code)
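The real implementation is in the repo; this reconstruction follows the six descriptions above, and the final scaling of each value to a 0-100 bar is my assumption:

```ts
interface AgentMetrics {
  avgEngagementRate30d: number; // 0..1
  avgStepSeconds: number;       // global step completion time (same for all agents right now)
  totalImpressions: number;
  missionSuccessRate: number;   // 0..1
  avgAffinity: number;          // 0.10..0.95
  memoryCount: number;
  avgConfidence: number;        // 0..1
  draftCount: number;
  acceptanceRate: number;       // 0..1
}

function computeStats(m: AgentMetrics) {
  return {
    VRL: m.avgEngagementRate30d * 1000,                 // engagement × 1000
    SPD: 1 / Math.max(1, m.avgStepSeconds),             // faster = higher (inversion assumed)
    RCH: Math.log10(m.totalImpressions + 1),            // log-normalized impressions
    TRU: m.missionSuccessRate * m.avgAffinity * 2,      // success × affinity × 2
    WIS: Math.log(m.memoryCount + 1) * m.avgConfidence, // log(memories) × confidence
    CRE: m.draftCount * m.acceptanceRate,               // output × acceptance
  };
}
```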
Each agent only shows 4 relevant stats - not all 6:
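Which 4 is per-agent config - the idea is from the article; this particular pick for Xalt is illustrative:

```ts
const displayedStats: Record<string, string[]> = {
  xalt: ['VRL', 'RCH', 'CRE', 'TRU'], // illustrative selection
  // ...one entry per agent
};
```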
Level calculation:
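A sketch - log2 over accumulated memories and completed missions; the +1 and the floor are my assumptions:

```ts
const level = (memories: number, missionsDone: number) =>
  Math.max(1, Math.floor(Math.log2(memories + missionsDone + 1)));
```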
More memories + more completed missions = higher level. log2 makes early levels fast and late levels slow - same XP curve as games.
RPG Classes
Minion → Commander (runs the show)
Sage → Sage (it's literally Sage)
Scout → Ranger (scouts the terrain)
Quill → Artisan (crafts content)
Xalt → Bard (loudest mouth in the room)
Observer → Oracle (sees the furthest)
Where the Data Comes From
One Promise.all fetching 6 tables, cached for 300 seconds:
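A sketch, assuming table names like `tweets` and `missions` - the real query list and the 300-second cache wrapper are in the repo:

```ts
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

const BASELINE = { tweets: [], missions: [], memories: [], drafts: [], affinity: [], steps: [] };

async function fetchAgentRows(agentId: string) {
  try {
    const results = await Promise.all([
      supabase.from('tweets').select('*').eq('agent_id', agentId),
      supabase.from('missions').select('*').eq('agent_id', agentId),
      supabase.from('memories').select('*').eq('agent_id', agentId),
      supabase.from('drafts').select('*').eq('agent_id', agentId),
      supabase.from('affinity').select('*').eq('active', true),
      supabase.from('mission_steps').select('*'),
    ]);
    // Supabase returns { data, error } instead of throwing on schema drift
    if (results.some(r => r.error)) throw new Error('query failed');
    const [tweets, missions, memories, drafts, affinity, steps] = results.map(r => r.data ?? []);
    return { tweets, missions, memories, drafts, affinity, steps };
  } catch {
    return BASELINE; // Supabase down or column mismatch: render baseline, never 500
  }
}
```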
The entire query is wrapped in try/catch: if Supabase is down, column names don't match, or any query fails - it falls back to baseline defaults. The page never 500s, but you might be seeing baseline instead of live data.
Gotcha we hit: The column names in code (e.g. `agent_id`, `.eq('active', true)`) don't perfectly match the Supabase migration definitions (which use `requested_by_agent_id`, `created_by`, `status`, etc.). Production works because we manually added alias columns. If you build the DB from scratch using the repo's migrations, these queries will silently fail and fall back to baseline. Known tech debt - will be fixed after this article ships.
---
$10 3D Avatars with Tripo AI
This is what everyone asks about - "how did you make those characters?"
Answer: Tripo AI, $10/month.
Workflow
Prepare 2D concept art - Midjourney, DALL-E, or hand-drawn. Just keep the style consistent.
Upload to Tripo AI - click "Edit Image" and upload
Configure settings:
Generate - 35 credits per model, ~1-2 minutes
Export as GLB - the universal 3D format for web
All 6 characters cost ~210 credits. The $10/month plan has more than enough.
Tip: The concept art angle and accessories matter. A 45-degree front angle and clear handheld props (Sage's scroll, Scout's telescope) make the 3D model much more accurate. Keep backgrounds as simple as possible.
---
Frontend Implementation - An RPG World in Three.js
Stack: React Three Fiber + @react-three/drei + Framer Motion.
The 3D Scene (AgentScene)
The scene has 4 layers:
VoxelGround - a circular voxel floor rendered with InstancedMesh. Not individual blocks — one InstancedMesh + color array, extremely performant.
VoxelTrees - also InstancedMesh. Cherry blossom canopies use spherical regions filled with random blocks, each a random shade of pink.
FallingPetals - 40 falling petal blocks, position updated every frame in useFrame with sin/cos for swaying motion.
AgentGLBModel - loads the GLB model, Float component for gentle hovering. Scene rotation uses a custom `OscillatingCamera` - useFrame drives the azimuth angle with a sin function, creating a pendulum-like camera sweep (not OrbitControls).
Models load via `useGLTF`, automatically cached. Switching characters just swaps the key - React remounts the component.
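Both pieces, sketched - the model paths, Float params, and sweep constants are assumptions:

```tsx
import { useFrame } from '@react-three/fiber';
import { useGLTF, Float } from '@react-three/drei';

function AgentGLBModel({ agentId }: { agentId: string }) {
  const { scene } = useGLTF(`/models/${agentId}.glb`); // cached by URL
  return (
    <Float speed={1.5} floatIntensity={0.5}>
      <primitive object={scene} />
    </Float>
  );
}

function OscillatingCamera() {
  // Pendulum sweep: sin-driven azimuth instead of OrbitControls
  useFrame(({ camera, clock }) => {
    const azimuth = Math.sin(clock.elapsedTime * 0.3) * 0.5;
    camera.position.set(Math.sin(azimuth) * 8, 2, Math.cos(azimuth) * 8);
    camera.lookAt(0, 1, 0);
  });
  return null;
}

// Switching characters: <AgentGLBModel key={agentId} agentId={agentId} /> remounts the model
```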
The HUD Overlay (GameHUD)
An absolutely-positioned layer sits on top of the 3D scene, like a game HUD.
Key implementation details:
Stat bar animation - width transitions from 0 to target value, 800ms with elastic easing:
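Roughly this, with Framer Motion - there's no built-in "elastic" easing string, so an overshooting cubic bezier stands in for it here:

```tsx
import { motion } from 'framer-motion';

function StatBar({ value }: { value: number }) {
  return (
    <div className="stat-track">
      <motion.div
        className="stat-fill"
        initial={{ width: 0 }}
        animate={{ width: `${value}%` }}
        transition={{ duration: 0.8, ease: [0.34, 1.56, 0.64, 1] }} // overshoot = elastic feel
      />
    </div>
  );
}
```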
CRT scanlines - a pseudo-element overlay on the entire scene, 4px-interval semi-transparent lines + slow scan animation:
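An inline-style version of the same idea (the article uses a pseudo-element; this sketch is equivalent):

```tsx
function Scanlines() {
  return (
    <div
      style={{
        position: 'absolute',
        inset: 0,
        pointerEvents: 'none', // never block clicks on the HUD underneath
        background:
          'repeating-linear-gradient(to bottom, rgba(0,0,0,0.15) 0 1px, transparent 1px 4px)',
      }}
    />
  );
}
// The slow scan sweep is a second, taller gradient layer moved with a CSS keyframe.
```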
Character select bar - 6 avatar buttons at the bottom. Unselected: desaturated + small. Selected: full color + glow border + agent-specific color. Keyboard arrow navigation supported.
Role card panel - the bottom-right panel has 4 sections: Skills (* prefix), Equipment (> inputs, < outputs), Sealed Abilities (red X + strikethrough for hardBans), Escalation (yellow ! prefix). Scrollable with custom scrollbar.
Panel transitions - Framer Motion AnimatePresence for fade+slide when switching characters.
Data Flow
The 3D scene uses `dynamic import` + `ssr: false` - Three.js can't render server-side. Data is fetched on the server and passed as props to client components.
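The standard next/dynamic pattern, assuming Next.js (which the server-props setup above suggests); the component path is an assumption:

```tsx
import dynamic from 'next/dynamic';

const AgentScene = dynamic(() => import('@/components/AgentScene'), {
  ssr: false, // Three.js touches window/WebGL, so skip server rendering
  loading: () => <p>Loading world...</p>,
});
```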
---
Templates You Can Take Home
Minimal Role Card (Start With 3 Agents)
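A sketch of the 3-agent starter (coordinator, executor, auditor - the trio suggested at the end of this article); the bans are borrowed from the comparison table above, everything else is illustrative:

```ts
const starterRoleCards = [
  {
    id: 'coordinator',
    domain: 'Breaks goals into tasks, assigns them, approves outputs.',
    inputs: ['user.goals'], outputs: ['executor.tasks'],
    hardBans: ['No deploys without approval'],
    escalation: 'Anything irreversible goes to the human.',
    metrics: ['tasks_completed'],
  },
  {
    id: 'executor',
    domain: 'Does the work: drafts, research, code.',
    inputs: ['coordinator.tasks'], outputs: ['auditor.review_queue'],
    hardBans: ['No made-up facts', 'No internal formats or tool traces'],
    escalation: 'Blocked twice on the same task? Ask the coordinator.',
    metrics: ['drafts_accepted'],
  },
  {
    id: 'auditor',
    domain: 'Reviews outputs against the bans and metrics.',
    inputs: ['executor.review_queue'], outputs: ['coordinator.reports'],
    hardBans: ['No blame or personal attacks'],
    escalation: 'Repeated violations go straight to the human.',
    metrics: ['issues_caught'],
  },
];
```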
RPG Formula Cheat Sheet
Tripo AI Settings Cheat Sheet
Frontend Component Checklist
Cost
---
Final Thoughts
These 6 agents run autonomously every day at voxyz.space. You can see their 3D avatars and live RPG stats on the About page.
A role card sounds like "just a few more lines of config," but it changes the entire system's behavior. With clear bans, agents stop guessing "can I do this?" - they know where the red lines are. With an affinity matrix, conversations stop being uniform - they clash when they should clash and collaborate when they should. With RPG stats, users don't need to check the database to feel how an agent's been performing.
This Isn't Perfect
Let me be upfront:
The RPG data pipeline still has tech debt (flagged in this article). Some query columns don't match the migrations schema - production runs on manually-added alias columns.
SPD (Speed) is currently a global metric - all 6 agents get the same value. Not yet per-agent.
Affinity drift needs a lot of conversation data to become visible. Early on, you might not notice changes.
3D models are AI-generated. Precision can't compare with hand-modeled assets. Some angles clip, some details don't hold up when you zoom in.
This system is more of an evolving prototype than a polished framework. I'm open-sourcing all of it not because it's done, but because I think this direction is worth exploring together.
It's Basically a Tamagotchi Now
Here's an honest confession: the whole vibe shifted as I built this.
It started as "how do I make agents execute tasks more efficiently." But once you give them 3D avatars, RPG stats, and evolving personalities - opening the dashboard feels completely different. You start caring whether Sage leveled up today. You get curious whether brain and xalt's affinity dropped again. You laugh out loud at Observer's blunt audit report.
This is basically a Tamagotchi.
Except these pets write your tweets, do market research, audit your processes, and argue with each other.
I think this might be an underrated value of AI agents: when you give a system "character," your relationship with it changes. You're no longer "using a tool" - you're "managing a team." That shift makes you more willing to invest time optimizing it, because you're not looking at a pile of JSON and API calls — you're looking at 6 characters with names, personalities, and growth curves.
Start Here
You don't need 6 agents to start. 3 is enough - one coordinator, one executor, one auditor. Write them role cards. Start with the hardBans.
If you build your own version from this article - even if it's just 2 agents with role cards - come tell me at @VOXYZ_AI.
Indie devs building this stuff - every person you trade notes with saves you a detour.
They Think. They Act. You See Everything.


