Example prompts with side-by-side outputs (base model vs. fine-tuned model):

Prompt 1: a three fourth perspective portrait view of a young woman with messy blonde hair and light purple eyes, looking at viewer with a closed mouth smile, slightly visible right pointy fantasy ear, wearing a black feather hair tie on right side of hair, wearing a pink feather above right ear, wearing silver earrings, wearing baggy white collared shirt with a black cloak wrapped around shoulders, bright yellow rim light hitting left side of face, cropped, a faded pink simple background during golden hour

[Image comparison: base model output vs. fine-tuned model output]

Prompt 2: A person stands in the foreground with their back turned to the camera, appearing to be about to enter a doorway. They have short hair and are dressed in casual clothing. The background features a misty, dimly-lit street lined with cars, old building facades, and a brightly lit gas station sign that reads "iperoil" with prices 1.775 and 1.699 visible. The style conveys a gritty, realistic urban environment, highlighted by the vintage design of the gas station sign. The scene appears to be set late at night or early dawn, with moody, greenish lighting shrouded in fog, giving a sense of quiet solitude and contemplation.

[Image comparison: base model output vs. fine-tuned model output]

Prompt 3: a front wide view of a small cyberpunk city with futuristic skyscrapers with gold rooftops situated on the side of a cliff overlooking an ocean, day time view with green tones, some boats floating in the foreground on top of reflective orange water, large mechanical robot structure reaching high above the clouds in the far background, atmospheric perspective, teal sky

[Image comparison: base model output vs. fine-tuned model output]
Purpose: Understand the differences between fine-tuning SD1.5/SDXL and Stable Diffusion 3 Medium (SD3M), and enable more users to fine-tune SD3M.
Hello! My name is Yeo Wang, and I’m a Generative Media Solutions Engineer at Stability AI (and a freelance 2D/3D concept designer). You might have seen some of my videos on YouTube or know me from the community (GitHub). Personally, I’ve gotten decent results training SD3 Medium, so I’ll share some insights along with quick-start configurations for both full fine-tuning and LoRA training.
Also, as a side bonus, keep reading for a sneak peek of our upcoming image model~ 🙂
Out of the tools available, I’ve chosen the SimpleTuner toolkit from bghira, as it gave me the best results. As such, I won’t be covering the tools from kohya-ss (sd-scripts), Nerogar (OneTrainer), or Hugging Face (diffusers).
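For readers who want to follow along, SimpleTuner is driven by environment-style config files plus a launch script. The sketch below shows the general shape of a LoRA setup for SD3 Medium; the specific key names and values are illustrative assumptions (option names change between SimpleTuner releases), so treat the README in the repository as the source of truth:

```
# Hypothetical sketch of a SimpleTuner LoRA setup for SD3 Medium.
# Key names below are assumptions -- check the SimpleTuner docs for your release.

git clone https://github.com/bghira/SimpleTuner
cd SimpleTuner
# (create a virtual environment and install dependencies per the README)

# Illustrative config values (e.g. in config/config.env):
export MODEL_TYPE='lora'          # or 'full' for full fine-tuning
export MODEL_NAME='stabilityai/stable-diffusion-3-medium-diffusers'
export LEARNING_RATE=1e-4         # LoRA typically tolerates a higher LR than full fine-tuning
export TRAIN_BATCH_SIZE=1

# Launch training with the toolkit's entry script:
bash train.sh
```

The `stabilityai/stable-diffusion-3-medium-diffusers` identifier is the Hugging Face repository for the diffusers-format SD3M weights; everything else here is a placeholder to orient you before reading the quick-start configurations that follow.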