Create a Character Expression Matrix
An expression matrix is a 2D grid: emotion (happy, sad, angry, surprised, disgusted, fearful) on one axis, intensity (slight, moderate, extreme) on the other. The output is a 6×3 (18-cell) or 4×3 (12-cell) sheet that animators, VTuber riggers, and game UI designers use as expression source material. Unlike a flat expression sheet, the matrix forces you to commit to gradations: "slightly angry" must read differently from "furious" while still being the same character.
In short: start from a head-and-shoulders crop of your character at front three-quarter, lock the seed, and run one prompt per cell with explicit emotion + intensity tokens ("slightly happy" vs "joyful" vs "ecstatic"). A reference strength of 0.8 holds identity while letting the face deform. A common default is Ekman's six basic emotions × three intensities = 18 cells.
- 01
Choose your axes
The standard choice is Ekman's six basic emotions (happy, sad, angry, surprised, disgusted, fearful) by three intensities. For VTuber rigs, use four emotions by three intensities to match common rig blendshapes.
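As a rough sketch of the axis choice, the grid can be enumerated in a few lines of Python. The axis values come from the text; the four-emotion subset is an assumption, since the text does not say which four emotions a VTuber rig should keep.

```python
from itertools import product

# Axis values taken from the text.
EKMAN_SIX = ["happy", "sad", "angry", "surprised", "disgusted", "fearful"]
INTENSITIES = ["slight", "moderate", "extreme"]

# Hypothetical subset: the text does not name which four emotions to keep.
VTUBER_FOUR = ["happy", "sad", "angry", "surprised"]


def build_cells(emotions, intensities):
    """Return every (emotion, intensity) pair, one emotion row at a time."""
    return list(product(emotions, intensities))


cells = build_cells(EKMAN_SIX, INTENSITIES)
print(len(cells))                                   # 18
print(len(build_cells(VTUBER_FOUR, INTENSITIES)))   # 12
```

Keeping the cell list in one place means every later step (prompts, filenames, audit) iterates over the same ordering.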
- 02
Crop your reference to head-and-shoulders
Tight crops give the model more pixels to work with on face features. Front three-quarter is the most expressive angle and translates cleanly to rig source.
- 03
Build the prompt template
Use the same prompt skeleton for every cell; only the emotion + intensity tokens change. Use concrete adjectives ("slightly angry" / "moderately angry" / "furious"), not 1–10 scales.
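One way to keep the skeleton fixed is a single format string with one slot. This is a minimal sketch: the skeleton wording is a hypothetical illustration and not a recommended prompt for any particular model, while the adjective rows for "angry" and "happy" reuse the examples from the text.

```python
# Hypothetical skeleton: only the {expression} slot varies between cells.
SKELETON = (
    "head-and-shoulders portrait, front three-quarter view, "
    "{expression} expression"
)

# Concrete adjectives per cell. These two rows use the wording from the text;
# the remaining emotions would need their own adjectives.
EXPRESSION_TOKENS = {
    ("angry", "slight"): "slightly angry",
    ("angry", "moderate"): "moderately angry",
    ("angry", "extreme"): "furious",
    ("happy", "slight"): "slightly happy",
    ("happy", "moderate"): "joyful",
    ("happy", "extreme"): "ecstatic",
}


def build_prompt(emotion, intensity):
    """Fill the fixed skeleton with the adjective for one cell."""
    token = EXPRESSION_TOKENS[(emotion, intensity)]
    return SKELETON.format(expression=token)


print(build_prompt("angry", "slight"))
print(build_prompt("angry", "extreme"))
```

Because the lookup raises `KeyError` for a missing cell, a gap in the adjective table shows up before any image is generated.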
- 04
Generate one cell at a time
Lock the seed, set reference strength to 0.8, generate one cell, and save it with a clear filename (emotion_intensity.png). Batching makes it harder to control which cell drifts.
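The one-cell-at-a-time loop could look roughly like the sketch below. It assumes nothing about a specific image tool: `generate_cell` is a placeholder callable you supply, and the seed value 1234 is arbitrary. Only the 0.8 reference strength and the emotion_intensity.png naming come from the text.

```python
from pathlib import Path

SEED = 1234               # arbitrary; what matters is that it stays fixed
REFERENCE_STRENGTH = 0.8  # value from the text


def cell_filename(emotion, intensity):
    """Name a cell emotion_intensity.png, as suggested in the text."""
    return f"{emotion}_{intensity}.png"


def generate_matrix(cells, generate_cell, out_dir):
    """Generate and save one cell per call, never a batch.

    generate_cell is a placeholder for your image tool. It is assumed to
    accept (emotion, intensity, seed, reference_strength) and return the
    image as PNG bytes.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for emotion, intensity in cells:
        png_bytes = generate_cell(emotion, intensity, SEED, REFERENCE_STRENGTH)
        path = out_dir / cell_filename(emotion, intensity)
        path.write_bytes(png_bytes)
        written.append(path)
    return written
```

Since each cell maps to exactly one file, regenerating a drifted cell means calling the loop again with a one-item list, leaving the other files untouched.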
- 05
Audit for gradient
Place each row side by side. Slight → moderate → extreme should read as a clean ramp, not three random snapshots. If moderate looks closer to slight, regenerate moderate with stronger intensity language.
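For the side-by-side audit, a plain HTML table is one low-effort option. The sketch below uses only the standard library and assumes the files follow the emotion_intensity.png naming from the previous step; the 256-pixel display width is an arbitrary choice.

```python
from html import escape


def contact_sheet_html(emotions, intensities):
    """Build an HTML table: one row per emotion, columns in intensity order."""
    header = "".join(f"<th>{escape(level)}</th>" for level in intensities)
    rows = []
    for emotion in emotions:
        images = "".join(
            f'<td><img src="{escape(emotion)}_{escape(level)}.png" width="256"></td>'
            for level in intensities
        )
        rows.append(f"<tr><th>{escape(emotion)}</th>{images}</tr>")
    body = "".join(rows)
    return f"<table><tr><th></th>{header}</tr>{body}</table>"


sheet = contact_sheet_html(["angry", "happy"], ["slight", "moderate", "extreme"])
print(sheet.count("<img"))  # 6
```

Write the returned string to an .html file in the same folder as the images and open it in a browser; each row then shows slight, moderate, and extreme from left to right.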
- Use Ekman's six basic emotions as your default vertical axis — they map cleanly to rig blendshapes
- Three intensity levels is the floor; five (subtle / slight / moderate / strong / extreme) gives smoother gradients but raises generation cost by about two-thirds (18 to 30 cells for six emotions)
- Reference strength 0.8 is the sweet spot — 0.9+ refuses to deform the mouth, 0.7 drifts identity
- Mouth state matters more than eyebrows for emotion read at thumbnail scale — call out mouth shape explicitly
- Surprise and fear share eye-widening; differentiate them with mouth (surprise: open round / fear: pulled back)
- Disgust is the hardest emotion to generate cleanly — most models default to mild frown; add "wrinkled nose, raised upper lip"
- Lock the head tilt and lighting in the prompt or each cell ends up at a slightly different angle, killing the grid read
- For animation source, also generate a neutral baseline as the matrix origin (intensity zero) so riggers have a rest pose
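Several of the tips above can be folded into the prompt-building step. In this sketch the facial descriptors for surprise, fear, and disgust are taken from the tips, while the pose-lock phrase is a hypothetical example of "locking head tilt and lighting". The cell counter treats the neutral baseline as one extra image.

```python
# Descriptors from the tips: mouth shape separates surprise from fear,
# and disgust needs explicit nose and lip cues.
FACE_HINTS = {
    "surprised": "mouth open and round",
    "fearful": "mouth pulled back",
    "disgusted": "wrinkled nose, raised upper lip",
}

# Hypothetical wording; the point is that it is identical in every cell.
POSE_LOCK = "head level, no tilt, soft front lighting"


def expression_clause(emotion_token, emotion):
    """Combine the intensity adjective, any face hint, and the pose lock."""
    parts = [emotion_token]
    hint = FACE_HINTS.get(emotion)
    if hint:
        parts.append(hint)
    parts.append(POSE_LOCK)
    return ", ".join(parts)


def cell_count(n_emotions, n_intensities, include_neutral=True):
    """Total images to generate, optionally counting one neutral baseline."""
    return n_emotions * n_intensities + (1 if include_neutral else 0)


print(expression_clause("slightly disgusted", "disgusted"))
print(cell_count(6, 3))  # 19
print(cell_count(6, 5))  # 31
```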