# 🎯 Ehyra LoRA Training – Z-Image Base
Ehyra is a character developed within the BrierStudios ecosystem, designed for use across visual media (illustrations, animations, avatar generation). This document covers the LoRA training pipeline using the Z-Image Base model.
## Overview

LoRA (Low-Rank Adaptation) allows us to fine-tune existing diffusion models with a small, trainable adapter, producing character-consistent output without retraining the entire base model.
```
┌──────────────────────────────────────────────────┐
│              LoRA Training Pipeline              │
│                                                  │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐   │
│   │ Dataset  │───▶│ Training │───▶│   LoRA   │   │
│   │   Prep   │    │  Config  │    │  Export  │   │
│   └──────────┘    └──────────┘    └──────────┘   │
│        │               │               │         │
│        ▼               ▼               ▼         │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐   │
│   │ Captions │    │ Kohya/SD │    │Validation│   │
│   │  & Tags  │    │ Scripts  │    │  & Test  │   │
│   └──────────┘    └──────────┘    └──────────┘   │
└──────────────────────────────────────────────────┘
```
## Prerequisites
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.10+ | Required for training scripts |
| CUDA | 11.8+ | GPU training required |
| VRAM | 12GB+ minimum | 24GB recommended |
| Z-Image Base | v1.5+ | Base model for LoRA |
| Kohya_ss | Latest | Training framework |
| Dataset | 30–100 images | Character reference images |
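Before setting up the dataset, it can save time to confirm the GPU side of this table. The sketch below is a hypothetical helper, assuming PyTorch is installed in the training environment:

```python
# check_env.py: sanity-check Python, CUDA, and VRAM before training (hypothetical helper).
import sys

import torch

assert sys.version_info >= (3, 10), "Python 3.10+ is required"

if not torch.cuda.is_available():
    raise SystemExit("No CUDA GPU detected; GPU training is required")

props = torch.cuda.get_device_properties(0)
vram_gb = props.total_memory / 1024**3
print(f"GPU: {props.name} | VRAM: {vram_gb:.1f} GB | CUDA: {torch.version.cuda}")

if vram_gb < 12:
    print("Warning: below the 12 GB minimum; lower batch_size or resolution")
```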
## Step 1: Dataset Preparation

### Image Collection
Gather 30–100 high-quality reference images of Ehyra. Quality over quantity.

Requirements:
- Minimum resolution: 512×512 pixels
- Recommended resolution: 768×768 or 1024×1024
- Format: PNG (lossless)
- Consistent character design across all images
Image distribution:
- 40% head/bust shots
- 30% full body
- 20% half body
- 10% detail shots (eyes, outfit details, accessories)
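These requirements are easy to check automatically. Below is a minimal sketch using Pillow, assuming the images live in `./dataset/ehyra` as in the training config later on:

```python
# validate_dataset.py: flag images that break the resolution and count rules (sketch).
from pathlib import Path

from PIL import Image

MIN_SIDE = 512  # minimum resolution per the requirements above
dataset = Path("./dataset/ehyra")

images = sorted(dataset.glob("*.png"))
print(f"{len(images)} PNG images found (target: 30-100)")

for path in images:
    with Image.open(path) as img:
        width, height = img.size
    if min(width, height) < MIN_SIDE:
        print(f"Too small ({width}x{height}): {path.name}")
```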
### Naming Convention

```
ehyra_<number>_<type>_<variant>.png
```

Examples:

```
ehyra_001_head_neutral.png
ehyra_002_full_casual.png
ehyra_003_bust_smiling.png
```
### Captioning

Each image needs a corresponding `.txt` caption file with the same name:

```
ehyra_001_head_neutral.png
ehyra_001_head_neutral.txt   ← Caption file
```

Caption format:

```
ehyra, <hair description>, <eye description>, <outfit description>, <pose>, <expression>, <background type>, <lighting>
```

Example caption:

```
ehyra, long silver hair, cyan tips, cyan eyes, dark techwear jacket, neon trim, forward facing, serious expression, dark background, neon rim lighting, cyberpunk style
```
### Tag Strategy

Use a trigger word as the first tag in every caption. This is the word that activates the LoRA during generation.

Trigger word: `ehyra`
Keep your trigger word unique and short. Avoid common words that might conflict with existing model concepts. "ehyra" works well because it's unique to this character.
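Because a missing or misplaced trigger word is a common cause of a LoRA that seems to do nothing, it's worth checking every caption before training. A minimal sketch:

```python
# check_captions.py: every image needs a .txt caption whose first tag is the trigger word.
from pathlib import Path

TRIGGER = "ehyra"
dataset = Path("./dataset/ehyra")

for img in sorted(dataset.glob("*.png")):
    caption_file = img.with_suffix(".txt")
    if not caption_file.exists():
        print(f"Missing caption: {img.name}")
        continue
    caption = caption_file.read_text(encoding="utf-8").strip()
    if not caption.startswith(TRIGGER):
        print(f"Trigger word is not the first tag in {caption_file.name}: {caption[:40]!r}")
```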
## Step 2: Training Configuration

### LoRA Parameters

Create a training config file, `ehyra_lora_config.toml` (referenced again in Step 3):
```toml
[model]
pretrained_model = "Z-Image/Base/v1.5"
trigger_word = "ehyra"

[network]
type = "lora"        # LoRA (not LoCon/full fine-tune)
dim = 32             # Rank dimension; a good balance for characters
alpha = 16           # Network alpha (dim/2 is standard)
dropout = 0.1        # Prevents overfitting
algo = "lora"        # Algorithm

[dataset]
image_dir = "./dataset/ehyra"
batch_size = 2
resolution = 768
enable_bucket = true     # Aspect ratio bucketing
min_bucket_reso = 384
max_bucket_reso = 1024

[training]
epochs = 15                  # 15 epochs for 50-80 images
learning_rate = 1e-4         # 0.0001
lr_scheduler = "cosine"
warmup_steps = 100
gradient_accumulation = 2
mixed_precision = "bf16"     # Use fp16 if no bf16 support
save_every_n_epochs = 3
seed = 42

[output]
output_dir = "./output/ehyra-lora"
output_name = "ehyra_v1_zimage"
save_model_as = "safetensors"
```
### Parameter Explanation

| Parameter | Value | Why |
|---|---|---|
| `dim` | 32 | Sufficient for character definition. Higher = more detail but risk of overfitting; lower = faster but less flexible. |
| `alpha` | 16 | Half of `dim`. Controls learning impact per step. |
| `dropout` | 0.1 | Gentle regularization to prevent memorizing training images. |
| `epochs` | 15 | With 50 images and batch size 2, this gives 25 steps per epoch, ~375 steps total. |
| `learning_rate` | 1e-4 | Standard for LoRA. Lower (5e-5) if overfitting, higher (2e-4) if underfitting. |
| `resolution` | 768 | Z-Image Base works well at 768. Use 1024 only if you have the VRAM. |
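The arithmetic behind the `epochs` row is worth making explicit; for a hypothetical 50-image dataset:

```python
# Step arithmetic for the config above, assuming a 50-image dataset.
images, batch_size, epochs, grad_accum = 50, 2, 15, 2

steps_per_epoch = images // batch_size       # 25 forward/backward passes per epoch
total_steps = steps_per_epoch * epochs       # 375 steps across the full run
optimizer_steps = total_steps // grad_accum  # 187 weight updates after accumulation

print(steps_per_epoch, total_steps, optimizer_steps)  # 25 375 187
```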
## Step 3: Training Execution

```bash
# Activate your training environment
conda activate kohya

# Start training
accelerate launch train_network.py \
  --config_file ehyra_lora_config.toml

# Or use the Kohya GUI and load ehyra_lora_config.toml
```
Monitor training:
- Watch the loss curve β it should decrease steadily
- If loss plateaus early, reduce learning rate
- If loss oscillates wildly, reduce learning rate and increase warmup
```bash
# View training logs
tensorboard --logdir ./output/ehyra-lora/logs
```
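If you prefer to inspect the loss curve programmatically rather than in the TensorBoard UI, the event files can be read directly. A sketch; the scalar tag name (`"loss"` here) is an assumption and depends on the training script, so check `ea.Tags()` for your run:

```python
# read_loss.py: pull loss scalars out of the TensorBoard event files (sketch).
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

ea = EventAccumulator("./output/ehyra-lora/logs")
ea.Reload()  # parse the event files on disk

# "loss" is an assumed tag name; inspect ea.Tags() to find the real one.
for event in ea.Scalars("loss"):
    print(event.step, event.value)
```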
## Step 4: Validation & Testing

After training, test the LoRA with various prompts:

### Test Prompts
```python
# Basic character test
prompt = "ehyra, 1girl, silver hair, cyan eyes, dark techwear, cyberpunk style, dark background"
negative = "lowres, bad anatomy, extra digits, worst quality, low quality, blurry"

# Expression variation
prompt = "ehyra, 1girl, silver hair, cyan eyes, smiling, happy expression, neon city background"

# Action pose
prompt = "ehyra, 1girl, silver hair, cyan eyes, dynamic pose, fighting stance, neon glow"

# Casual scene
prompt = "ehyra, 1girl, silver hair, cyan eyes, sitting, relaxed, casual outfit, indoor scene"
```
### Quality Checklist

- Character is recognizable as "ehyra" with LoRA weight 0.7–1.0
- Silver hair with cyan tips renders consistently
- Cyan eyes are visible and correctly colored
- Techwear outfit appears without explicit prompting
- No artifacts in the background when using low LoRA weights
- Character can switch expressions (happy, serious, surprised)
- Multiple poses work without breaking character details
- No bleeding of training image compositions
## Step 5: Export & Integration

### Export Format

```bash
# The output will be in safetensors format
ls ./output/ehyra-lora/
# ehyra_v1_zimage.safetensors
```
### Integration with Z-Image Base

In your generation pipeline:

```python
from diffusers import StableDiffusionPipeline

# Load the base model
pipe = StableDiffusionPipeline.from_pretrained("Z-Image/Base/v1.5")

# Load the LoRA weights
pipe.load_lora_weights("./output/ehyra-lora/ehyra_v1_zimage.safetensors")

# Generate with the character
image = pipe(
    prompt="ehyra, 1girl, silver hair, cyan eyes, serious expression, dark background",
    negative_prompt="lowres, bad anatomy, worst quality",
    num_inference_steps=30,
    guidance_scale=7.5,
    cross_attention_kwargs={"scale": 0.8},  # LoRA strength 0.8
).images[0]
```
### LoRA Weight Guidance

| Weight | Effect | Use Case |
|---|---|---|
| 0.3–0.5 | Subtle influence | Style mixing with other LoRAs |
| 0.6–0.8 | Balanced | Default generation, good character fidelity |
| 0.9–1.0 | Strong influence | Maximum character consistency, risk of artifacts |
| 1.0+ | Overfit territory | Rarely recommended; may produce training image copies |
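A practical way to pick a value from this table is to sweep the weight over a fixed prompt and seed, then compare the outputs side by side. A sketch reusing `pipe` and the `cross_attention_kwargs` mechanism from the integration example:

```python
# weight_sweep.py: render the same prompt and seed at several LoRA strengths (sketch).
import torch

prompt = "ehyra, 1girl, silver hair, cyan eyes, serious expression, dark background"

for scale in (0.3, 0.5, 0.7, 0.9, 1.0):
    generator = torch.Generator("cuda").manual_seed(42)  # same seed for every weight
    image = pipe(
        prompt,
        num_inference_steps=30,
        guidance_scale=7.5,
        generator=generator,
        cross_attention_kwargs={"scale": scale},  # LoRA strength
    ).images[0]
    image.save(f"ehyra_scale_{scale:.1f}.png")
```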
### Version Naming

```
ehyra_v<VERSION>_zimage.safetensors
```

Examples:

```
ehyra_v1_zimage.safetensors     ← First training run
ehyra_v1.1_zimage.safetensors   ← Bug fix / retune
ehyra_v2_zimage.safetensors     ← New dataset or major changes
```
## Troubleshooting

### Overfitting

Symptoms: Character only appears in training poses/scenes; images look identical to training data.

Solutions:

- Reduce `dim` from 32 to 16
- Reduce `learning_rate` from 1e-4 to 5e-5
- Increase `dropout` from 0.1 to 0.2
- Add more varied images to the dataset
- Reduce epochs from 15 to 10
### Underfitting

Symptoms: Character doesn't appear; LoRA seems to have no effect.

Solutions:

- Increase `dim` from 32 to 64
- Increase `learning_rate` from 1e-4 to 2e-4
- Increase epochs from 15 to 25
- Verify your captions include the trigger word
### Color Bleeding

Symptoms: Character colors leak into backgrounds or other elements.

Solutions:

- Improve captions: describe backgrounds separately
- Reduce LoRA weight during generation
- Add color-specific negative prompts
From dataset to deployment: the pipeline that brings characters to life. 🎯⚡