Interested in generating images like these from SD3?

Prompt

a three fourth perspective portrait view of a young woman with messy blonde hair and light purple eyes, looking at viewer with a closed mouth smile, slightly visible right pointy fantasy ear, wearing a black feather hair tie on right side of hair, wearing a pink feather above right ear, wearing silver earrings, wearing baggy white collared shirt with a black cloak wrapped around shoulders, bright yellow rim light hitting left side of face, cropped, a faded pink simple background during golden hour

Base model

ComfyUI_temp_tvxci_00001_.png

Fine-tuned model

image.png

Prompt

A person stands in the foreground with their back turned to the camera, appearing to be about to enter a doorway. They have short hair and are dressed in casual clothing. The background features a misty, dimly-lit street lined with cars, old building facades, and a brightly lit gas station sign that reads "iperoil" with prices 1.775 and 1.699 visible. The style conveys a gritty, realistic urban environment, highlighted by the vintage design of the gas station sign. The scene appears to be set late at night or early dawn, with moody, greenish lighting shrouded in fog, giving a sense of quiet solitude and contemplation.

Base model

ComfyUI_temp_tvxci_00038_.png

Fine-tuned model

rgthree.compare.temp_hspij_00080.png

Prompt

a front wide view of a small cyberpunk city with futuristic skyscrapers with gold rooftops situated on the side of a cliff overlooking an ocean, day time view with green tones, some boats floating in the foreground on top of reflective orange water, large mechanical robot structure reaching high above the clouds in the far background, atmospheric perspective, teal sky

Base model

ComfyUI_temp_jsmxq_00039_.png

Fine-tuned model

ComfyUI_temp_kffbb_00040_.png

Target Audience: Engineers or technical people with at least basic familiarity with fine-tuning

Purpose: Understand how fine-tuning Stable Diffusion 3 Medium (SD3M) differs from fine-tuning SD1.5/SDXL, and enable more users to fine-tune SD3M

Introduction

Hello! My name is Yeo Wang, and I’m a Generative Media Solutions Engineer at Stability AI (and a freelance 2D/3D concept designer). You might have seen some of my videos on YouTube or know me through the community (GitHub). I’ve gotten decent results training SD3 Medium, so I’ll share some insights and quick-start configurations for both full fine-tuning and LoRA training.

Also, as a side bonus, keep reading for a sneak peek of our upcoming image model~ 🙂

Tools

Out of the tools available, I’ve chosen the SimpleTuner toolkit from bghira, as it gave me the best results. As such, I won’t be covering tools from kohya-ss (sd-scripts), Nerogar (OneTrainer), or Hugging Face (diffusers).

SimpleTuner