Getting Started with SDXL in Margarine
Mix.install([
{:margarine, "~> 0.2.2"},
{:emlx, "~> 0.1"}, # For Apple Silicon (M1/M2/M3/M4 Macs)
# {:exla, "~> 0.10"}, # For NVIDIA/AMD GPU or CPU (uncomment if not on Apple Silicon)
{:kino, "~> 0.14"}
])
Section
# Configure the Nx backend to match your chosen dependency above
Nx.global_default_backend(EMLX.Backend)
# For EXLA, use: Nx.global_default_backend(EXLA.Backend)
Setup Notes
Backend Selection: Uncomment the appropriate backend for your system:
- Apple Silicon Macs: Use {:emlx, "~> 0.1"} (already active above)
- NVIDIA/AMD GPU or CPU: Comment out emlx and uncomment {:exla, "~> 0.10"}
Margarine requires an Nx backend for GPU/CPU acceleration. The first code block installs all dependencies, and the second configures the backend.
Welcome to SDXL! 🎨
Stable Diffusion XL is a powerful image generation model that produces high-quality images with excellent detail and composition. This notebook will guide you through both text-to-image and image-to-image generation.
What You’ll Learn
- How to generate images from text prompts (text2img)
- How to transform existing images (img2img)
- How to control image changes with denoising strength
- Differences between SDXL Base and SDXL Turbo
- The core principle: text2img is just img2img starting from pure noise!
Prerequisites
- Memory: 10GB+ RAM (12GB recommended)
- Apple Silicon Mac or NVIDIA GPU (CUDA 11.8+)
- Internet connection (first run downloads ~7GB model)
- Optional: HuggingFace account (for faster downloads)
Model Selection
SDXL comes in two variants:
SDXL Turbo (Fast) ⚡
- Steps: 1 (ultra-fast, ~10-20 seconds)
- Quality: Good for rapid prototyping
- Memory: ~7GB
- License: Stability AI non-commercial research license at release, not OpenRAIL++ (check the model card before commercial use)
- Best for: Quick iterations, testing, style transfer
SDXL Base (High Quality) 🎨
- Steps: 20-50 (slower, ~1-3 minutes)
- Quality: Excellent, production-ready images
- Memory: ~7GB
- License: OpenRAIL++ (free for most uses)
- Best for: Final output, high-quality renders
# Choose your model
model_selector = Kino.Input.select("Select Model", [
{:sdxl_turbo, "SDXL Turbo (Ultra-fast, 1 step)"},
{:sdxl_base, "SDXL Base (High Quality, 20 steps)"}
])
selected_model = Kino.Input.read(model_selector)
IO.puts("✓ Selected model: #{selected_model}")
# Show requirements for selected model
case selected_model do
:sdxl_turbo ->
IO.puts("""
SDXL Turbo Requirements:
- Memory: ~7GB RAM
- Steps: 1 (ultra-fast generation)
- No HuggingFace token needed
- Check the model card for license terms
""")
:sdxl_base ->
IO.puts("""
SDXL Base Requirements:
- Memory: ~7GB RAM
- Steps: 20 (high quality)
- No HuggingFace token needed
- Free for most uses
""")
end
Part 1: Text-to-Image Generation
Let’s start with text2img - generating images from scratch using only a text prompt.
# Enter your prompt
prompt_input = Kino.Input.textarea("Enter your prompt",
default: "a majestic red panda sitting on a tree branch, golden hour lighting, photorealistic, highly detailed")
prompt = Kino.Input.read(prompt_input)
IO.puts("Prompt: #{prompt}")
# Generate the image
IO.puts("\n🎨 Starting text2img generation...")
IO.puts("⏳ First run: Model will download (~7GB, takes 2-5 minutes)")
IO.puts("⏳ Subsequent runs: Much faster (~10-20 seconds for Turbo, ~1-2 minutes for Base)")
opts = [
model: selected_model,
steps: if(selected_model == :sdxl_turbo, do: 1, else: 20),
guidance_scale: if(selected_model == :sdxl_turbo, do: 0.0, else: 7.5),
size: {1024, 1024},
seed: 42 # For reproducibility
]
case Margarine.generate(prompt, opts) do
{:ok, image} ->
IO.puts("✅ Generation complete!")
IO.puts("Image shape: #{inspect(Nx.shape(image))}")
IO.puts("Image type: #{inspect(Nx.type(image))}")
# Save the image
output_path = "/tmp/sdxl_text2img_output.png"
case Margarine.Image.save(image, output_path) do
:ok ->
IO.puts("✅ Saved to: #{output_path}")
# Display the image
Kino.Image.new(File.read!(output_path), :png)
{:error, reason} ->
IO.puts("❌ Failed to save: #{inspect(reason)}")
end
{:error, reason} ->
IO.puts("❌ Generation failed: #{inspect(reason)}")
IO.puts("""
Common issues:
- Not enough memory (need 10GB+ RAM)
- First run needs internet connection
""")
end
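The per-model choices used above (1 step and guidance 0.0 for Turbo, 20 steps and guidance 7.5 for Base) recur in several cells of this notebook. One way to keep them in a single place is a small helper like the sketch below. The module name `ModelDefaults` and the function `opts_for/2` are hypothetical and not part of Margarine; only the option keys and values are taken from the cells above.

```elixir
defmodule ModelDefaults do
  # Hypothetical helper, not part of Margarine. The values mirror the ones
  # used in this notebook's cells.
  @defaults %{
    sdxl_turbo: [steps: 1, guidance_scale: 0.0],
    sdxl_base: [steps: 20, guidance_scale: 7.5]
  }

  # Returns the defaults for `model`, with `overrides` taking precedence.
  # Raises KeyError for a model that is not listed in @defaults.
  def opts_for(model, overrides \\ []) do
    defaults = Map.fetch!(@defaults, model)

    [model: model]
    |> Keyword.merge(defaults)
    |> Keyword.merge(overrides)
  end
end

opts = ModelDefaults.opts_for(:sdxl_base, size: {1024, 1024}, seed: 42)
IO.inspect(opts)
```

The resulting keyword list could then be passed as the second argument to `Margarine.generate/2` in place of the inline `if` expressions.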
Part 2: Image-to-Image Transformation
Now for the magic! img2img lets you transform existing images according to your prompt. The denoising_strength parameter controls how much the image changes:
- 1.0 = completely regenerate (equivalent to text2img)
- 0.7 = moderate changes (70% noise)
- 0.5 = balanced transformation
- 0.3 = subtle changes (30% noise)
- 0.0 = no changes (identity operation)
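How a strength value turns into actual work depends on the implementation. A common convention in img2img pipelines (Hugging Face diffusers, for example) is to run only the last `trunc(steps * strength)` steps of the schedule. The sketch below shows that arithmetic. Whether Margarine follows the same rule is an assumption, and `StrengthSketch` is a hypothetical name used only for illustration.

```elixir
defmodule StrengthSketch do
  # Assumed convention (not confirmed for Margarine): only the final
  # trunc(steps * strength) steps of the schedule are executed.
  def steps_to_run(steps, strength)
      when is_integer(steps) and steps > 0 and strength >= 0.0 and strength <= 1.0 do
    min(trunc(steps * strength), steps)
  end

  # Number of early steps that are skipped because the init image
  # already supplies the coarse structure.
  def steps_skipped(steps, strength), do: steps - steps_to_run(steps, strength)
end

for strength <- [0.0, 0.3, 0.5, 0.7, 1.0] do
  IO.puts("strength #{strength}: #{StrengthSketch.steps_to_run(20, strength)} of 20 steps")
end
```

Under this convention a single-step run with strength below 1.0 would truncate to zero steps, so implementations typically special-case it; how Margarine treats that case is not stated in this notebook.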
Step 1: Generate a Base Image
First, let’s create a simple base image to transform:
# Generate a simple base image
IO.puts("🎨 Generating base image for img2img experiments...")
base_opts = [
model: :sdxl_turbo, # Use Turbo for speed
steps: 1,
size: {512, 512},
seed: 999
]
{:ok, base_image} = Margarine.generate("a simple landscape with hills and sky", base_opts)
base_path = "/tmp/sdxl_base_image.png"
Margarine.Image.save(base_image, base_path)
IO.puts("✅ Base image saved to: #{base_path}")
Kino.Image.new(File.read!(base_path), :png)
Step 2: Transform with Different Strengths
Let’s see how different denoising strengths affect the transformation!
# Transformation prompt
transform_prompt = Kino.Input.textarea("Transformation Prompt",
default: "a vibrant sunset landscape with mountains and glowing clouds")
transformation_prompt = Kino.Input.read(transform_prompt)
IO.puts("Transformation: #{transformation_prompt}")
# Try different strengths
strengths = [0.3, 0.5, 0.7, 1.0]
IO.puts("\n🎨 Generating #{length(strengths)} variations with different strengths...")
results = Enum.map(strengths, fn strength ->
IO.puts("Generating with strength: #{strength}...")
opts = [
model: :sdxl_turbo,
steps: 1,
denoising_strength: strength,
seed: 42
]
case Margarine.img2img(transformation_prompt, base_path, opts) do
{:ok, image} ->
path = "/tmp/sdxl_img2img_strength_#{strength}.png"
Margarine.Image.save(image, path)
{strength, path}
{:error, reason} ->
IO.puts("Failed strength #{strength}: #{inspect(reason)}")
nil
end
end)
|> Enum.reject(&is_nil/1)
IO.puts("✅ Generated #{length(results)} transformations")
# Display all images with their strengths
images_to_display =
[{nil, base_path}] ++ results # Add original first
|> Enum.map(fn
{nil, path} ->
[
Kino.Markdown.new("**Original Base Image**"),
Kino.Image.new(File.read!(path), :png)
]
{strength, path} ->
[
Kino.Markdown.new("**Strength: #{strength}** (#{round(strength * 100)}% noise)"),
Kino.Image.new(File.read!(path), :png)
]
end)
|> List.flatten()
Kino.Layout.grid(images_to_display, columns: 1)
The Core Principle: Text2img = Img2img(1.0) ✨
Let’s prove that text2img is just img2img starting from pure noise!
IO.puts("🧪 Testing CORE PRINCIPLE: text2img == img2img(strength=1.0)")
test_prompt = "a futuristic cyberpunk city at night"
test_opts = [model: :sdxl_turbo, steps: 1, size: {512, 512}, seed: 123]
# Generate with text2img
IO.puts("\n1. Generating with text2img...")
{:ok, text2img_result} = Margarine.generate(test_prompt, test_opts)
text2img_path = "/tmp/sdxl_text2img_comparison.png"
Margarine.Image.save(text2img_result, text2img_path)
# Generate with img2img at strength=1.0
IO.puts("2. Generating with img2img(strength=1.0)...")
{:ok, img2img_result} = Margarine.img2img(
test_prompt,
base_path,
test_opts ++ [denoising_strength: 1.0]
)
img2img_path = "/tmp/sdxl_img2img_comparison.png"
Margarine.Image.save(img2img_result, img2img_path)
# Calculate similarity
# Count every element of the tensor instead of hard-coding 512 * 512 * 3
total_pixels = Nx.size(text2img_result)
matching_pixels = Nx.equal(text2img_result, img2img_result) |> Nx.sum() |> Nx.to_number()
match_percentage = (matching_pixels / total_pixels) * 100
IO.puts("\n✅ Match: #{Float.round(match_percentage, 2)}%")
IO.puts("They should be nearly identical (>99%)!")
# Display comparison
comparison = [
Kino.Markdown.new("**Text2img**"),
Kino.Image.new(File.read!(text2img_path), :png),
Kino.Markdown.new("**Img2img(strength=1.0)**"),
Kino.Image.new(File.read!(img2img_path), :png),
Kino.Markdown.new("**Match: #{Float.round(match_percentage, 2)}%** (expected: above 99%)")
]
Kino.Layout.grid(comparison, columns: 1)
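Exact equality is a strict test: a one-unit rounding difference in a single channel counts as a mismatch. A tolerance-based comparison is a gentler alternative. The sketch below works on plain lists of channel values so that it runs without Nx; `PixelMatchSketch` and its `tolerance` parameter are hypothetical, and with real tensors you would flatten them first (for example with `Nx.to_flat_list`).

```elixir
defmodule PixelMatchSketch do
  # Percentage of positions where the two lists differ by at most `tolerance`.
  # Both lists must be non-empty and of equal length.
  def match_percentage(a, b, tolerance \\ 0)
      when is_list(a) and is_list(b) and length(a) == length(b) and a != [] do
    matching =
      a
      |> Enum.zip(b)
      |> Enum.count(fn {x, y} -> abs(x - y) <= tolerance end)

    matching / length(a) * 100
  end
end

left = [0, 10, 255, 128]
right = [0, 12, 255, 120]

IO.puts("exact: #{PixelMatchSketch.match_percentage(left, right)}%")
IO.puts("tolerance 2: #{PixelMatchSketch.match_percentage(left, right, 2)}%")
```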
Advanced: Guidance Scale Exploration
SDXL Base supports guidance scale (SDXL Turbo doesn’t). Let’s see how it affects the output:
# Only run if using SDXL Base
if selected_model == :sdxl_base do
guidance_prompt = "a magical forest with glowing mushrooms and fireflies"
guidance_scales = [3.0, 7.5, 12.0]
IO.puts("🎨 Testing guidance scales: #{inspect(guidance_scales)}")
guidance_results = Enum.map(guidance_scales, fn scale ->
IO.puts("Generating with guidance scale: #{scale}...")
opts = [
model: :sdxl_base,
steps: 10, # Reduced for speed
guidance_scale: scale,
size: {512, 512},
seed: 789
]
case Margarine.generate(guidance_prompt, opts) do
{:ok, image} ->
path = "/tmp/sdxl_guidance_#{scale}.png"
Margarine.Image.save(image, path)
{scale, path}
{:error, reason} ->
IO.puts("Failed: #{inspect(reason)}")
nil
end
end)
|> Enum.reject(&is_nil/1)
# Display results
guidance_display = Enum.map(guidance_results, fn {scale, path} ->
[
Kino.Markdown.new("**Guidance Scale: #{scale}**"),
Kino.Markdown.new("Low = creative, High = follows prompt closely"),
Kino.Image.new(File.read!(path), :png)
]
end)
|> List.flatten()
Kino.Layout.grid(guidance_display, columns: 1)
else
Kino.Markdown.new("⚠️ Guidance scale exploration is only available for SDXL Base. SDXL Turbo uses guidance_scale=0.0 by default.")
end
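For intuition about what the guidance scale does, the standard classifier-free guidance rule combines two noise predictions, one made without the prompt and one made with it, and pushes the result away from the unconditional one. The sketch below applies that rule to plain lists of numbers. It illustrates the general technique only; that Margarine applies exactly this rule internally is an assumption, and `GuidanceSketch` is a hypothetical name.

```elixir
defmodule GuidanceSketch do
  # Classifier-free guidance on toy data:
  #   guided = unconditional + scale * (conditional - unconditional)
  def combine(unconditional, conditional, scale) when is_number(scale) do
    Enum.zip_with(unconditional, conditional, fn u, c -> u + scale * (c - u) end)
  end
end

unconditional = [0.0, 1.0]
conditional = [1.0, 3.0]

for scale <- [1.0, 3.0, 7.5] do
  IO.inspect(GuidanceSketch.combine(unconditional, conditional, scale), label: "scale #{scale}")
end
```

At scale 1.0 the result equals the conditional prediction; larger scales exaggerate the difference the prompt makes. Pipelines commonly skip this combination entirely when guidance is disabled, which is how a model such as SDXL Turbo can run with a guidance scale of 0.0.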
Style Transfer with Img2img
Point the notebook at an image on disk and transform it! 🎨
# Input for your image path
user_image_input = Kino.Input.text("Path to your image", default: "/absolute/path/to/some/image.png")
style_prompt_input = Kino.Input.textarea("Style transformation prompt",
default: "transform into a vibrant oil painting with bold brush strokes and vivid colors")
strength_input = Kino.Input.number("Denoising Strength (0.0-1.0)", default: 0.6)
form = Kino.Layout.grid([user_image_input, style_prompt_input, strength_input], columns: 1)
# Read inputs
user_image_path = Kino.Input.read(user_image_input)
style_prompt = Kino.Input.read(style_prompt_input)
style_strength = Kino.Input.read(strength_input)
# Check if image exists
if File.exists?(user_image_path) do
IO.puts("🎨 Applying style transformation...")
IO.puts("Image: #{user_image_path}")
IO.puts("Style: #{style_prompt}")
IO.puts("Strength: #{style_strength}")
# Load image to get original dimensions
{:ok, original} = Margarine.Image.load(user_image_path)
{orig_height, orig_width, _} = Nx.shape(original)
# Round to nearest multiple of 8 (required for VAE)
target_height = div(orig_height + 4, 8) * 8
target_width = div(orig_width + 4, 8) * 8
IO.puts("Original size: #{orig_height}x#{orig_width}")
IO.puts("Target size: #{target_height}x#{target_width} (rounded to multiple of 8)")
opts = [
model: selected_model,
steps: if(selected_model == :sdxl_turbo, do: 1, else: 20),
size: {target_height, target_width},
denoising_strength: style_strength,
seed: 42
]
case Margarine.img2img(style_prompt, user_image_path, opts) do
{:ok, transformed} ->
output_path = "/tmp/sdxl_style_transfer_output.png"
Margarine.Image.save(transformed, output_path)
IO.puts("✅ Style transfer complete!")
# Show before and after
display = [
Kino.Markdown.new("**Original**"),
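# Pick the image type from the file extension (assumes JPEG or PNG input)
Kino.Image.new(File.read!(user_image_path), if(String.downcase(Path.extname(user_image_path)) in [".jpg", ".jpeg"], do: :jpeg, else: :png)),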
Kino.Markdown.new("**Transformed** (strength: #{style_strength})"),
Kino.Image.new(File.read!(output_path), :png)
]
Kino.Layout.grid(display, columns: 1)
{:error, reason} ->
IO.puts("❌ Failed: #{inspect(reason)}")
end
else
Kino.Markdown.new("⚠️ Image not found: #{user_image_path}")
end
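The inline rounding above, `div(n + 4, 8) * 8`, maps any dimension below 4 to 0. A small helper can make the rounding reusable and guard that edge case. This is a sketch only; `SizeSketch` and `round_to_multiple/2` are hypothetical names, not part of Margarine.

```elixir
defmodule SizeSketch do
  # Rounds `n` to the nearest multiple of `multiple` (default 8),
  # never returning less than one full multiple.
  def round_to_multiple(n, multiple \\ 8)
      when is_integer(n) and n > 0 and is_integer(multiple) and multiple > 0 do
    rounded = div(n + div(multiple, 2), multiple) * multiple
    max(rounded, multiple)
  end
end

for n <- [3, 510, 516, 1023] do
  IO.puts("#{n} -> #{SizeSketch.round_to_multiple(n)}")
end
```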
Batch Generation
Generate multiple variations quickly!
batch_prompt = "a cozy coffee shop interior, warm lighting, plants and books"
seeds = [100, 200, 300, 400]
IO.puts("🎨 Generating #{length(seeds)} variations...")
batch_results = Enum.map(seeds, fn seed ->
IO.puts("Generating seed #{seed}...")
opts = [
model: :sdxl_turbo, # Use Turbo for speed
steps: 1,
size: {512, 512},
seed: seed
]
case Margarine.generate(batch_prompt, opts) do
{:ok, image} ->
path = "/tmp/sdxl_batch_#{seed}.png"
Margarine.Image.save(image, path)
{seed, path}
{:error, reason} ->
IO.puts("Failed seed #{seed}: #{inspect(reason)}")
nil
end
end)
|> Enum.reject(&is_nil/1)
IO.puts("✅ Generated #{length(batch_results)} variations")
# Display grid
batch_display = Enum.map(batch_results, fn {seed, path} ->
[
Kino.Markdown.new("**Seed: #{seed}**"),
Kino.Image.new(File.read!(path), :png)
]
end)
|> List.flatten()
Kino.Layout.grid(batch_display, columns: 2)
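The batch cell above drops failed seeds by returning `nil` and filtering afterwards, which loses the failure reasons. An alternative is to keep successes and failures in separate lists. The sketch below shows that pattern with a fake generator so it runs without a model; `BatchSketch` is a hypothetical name, and `generate_fun` stands in for any one-argument function (such as a wrapper around `Margarine.generate/2`) that returns `{:ok, value}` or `{:error, reason}`.

```elixir
defmodule BatchSketch do
  # Runs `generate_fun` once per seed and separates successes from failures.
  # `generate_fun` must return {:ok, value} or {:error, reason}.
  def run(seeds, generate_fun) when is_list(seeds) and is_function(generate_fun, 1) do
    {oks, errors} =
      seeds
      |> Enum.map(fn seed -> {seed, generate_fun.(seed)} end)
      |> Enum.split_with(fn {_seed, result} -> match?({:ok, _}, result) end)

    %{
      ok: Enum.map(oks, fn {seed, {:ok, value}} -> {seed, value} end),
      error: Enum.map(errors, fn {seed, {:error, reason}} -> {seed, reason} end)
    }
  end
end

# Fake generator for demonstration only: odd seeds fail.
fake_generate = fn seed ->
  if rem(seed, 2) == 0 do
    {:ok, "/tmp/sdxl_batch_#{seed}.png"}
  else
    {:error, :simulated_failure}
  end
end

IO.inspect(BatchSketch.run([100, 201, 300], fake_generate))
```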
Tips & Best Practices
SDXL Prompt Engineering
SDXL responds well to detailed prompts:
- ✅ “a majestic lion with golden mane, dramatic lighting, professional wildlife photography, 8k, highly detailed”
- ❌ “lion”
Quality modifiers that work well:
- “photorealistic”, “highly detailed”, “8k resolution”
- “professional photography”, “cinematic lighting”
- “masterpiece”, “award winning”
Style keywords:
- Photography: “DSLR”, “bokeh”, “depth of field”, “golden hour”
- Art: “oil painting”, “watercolor”, “digital art”, “concept art”
- Specific artists: “in the style of [artist name]”
Img2img Best Practices
Choosing the right strength:
- 0.2-0.4: Subtle style changes, color adjustments, lighting tweaks
- 0.5-0.7: Moderate transformations, style transfer, composition changes
- 0.8-0.9: Heavy changes, major stylistic shifts
- 1.0: Complete regeneration (use text2img instead!)
Use cases:
- Style transfer: 0.5-0.7 works great
- Color grading: 0.2-0.3 preserves composition
- Composition remix: 0.7-0.9 allows big changes
- Detail enhancement: 0.3-0.5 adds detail while keeping structure
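The ranges in the use-case list can be kept as data, which makes it easy to validate a value typed into a notebook input before calling img2img. The sketch below is one possible shape; `StrengthPresets` and its functions are hypothetical, and the ranges are copied from the list above.

```elixir
defmodule StrengthPresets do
  # Ranges copied from the "Use cases" list; the atoms are illustrative names.
  @ranges %{
    style_transfer: {0.5, 0.7},
    color_grading: {0.2, 0.3},
    composition_remix: {0.7, 0.9},
    detail_enhancement: {0.3, 0.5}
  }

  # Raises KeyError for an unknown use case.
  def range_for(use_case), do: Map.fetch!(@ranges, use_case)

  # Clamps an arbitrary number into the valid 0.0..1.0 interval.
  def clamp(strength) when is_number(strength), do: strength |> max(0.0) |> min(1.0)

  # True when `strength` lies inside the suggested range for `use_case`.
  def in_range?(use_case, strength) do
    {low, high} = range_for(use_case)
    strength >= low and strength <= high
  end
end

strength = StrengthPresets.clamp(1.4)
IO.puts("clamped: #{strength}")
IO.puts("fits style transfer? #{StrengthPresets.in_range?(:style_transfer, 0.6)}")
```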
Model Selection Guide
Use SDXL Turbo when:
- Prototyping and testing ideas
- Need fast iterations
- Doing img2img style transfer
- Time constrained (memory use is about the same for both variants, ~7GB)
Use SDXL Base when:
- Final production renders
- Need highest quality
- Want precise prompt following
- Have time for longer generation
Memory Management
If you run out of memory:
- Close other applications
- Use smaller image sizes (512x512)
- Use SDXL Turbo (same memory but faster)
- Restart Elixir runtime to clear cached models
Troubleshooting
Common Errors
Out of Memory
[Margarine.SdxlPythonxServer] ✗ Insufficient memory
Solution: Close apps, use 512x512, or restart runtime
Model Download Timeout
Connection timeout
Solution: Check internet, try again (download resumes)
Image File Not Found (img2img)
Init image not found: /path/to/image.png
Solution: Check file path, use absolute paths
The Magic of Img2img ✨
Key Insight: Once you understand img2img, you understand everything!
- Text2img = Start from pure noise (random latents, not an image)
- Img2img = Start from encoded image + some noise
- Same denoising loop for both!
This means:
- You can switch between models mid-generation
- You can pause and resume with different parameters
- You can chain transformations
- You can mix SDXL with FLUX (future feature!)
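The shared-loop idea can be illustrated with a toy model that uses plain lists of floats instead of latents. Everything here is a deliberate simplification: the "denoiser" merely nudges values toward a fixed target, the noise blend is linear rather than a real scheduler's noising rule, and `DenoiseLoopSketch` is a hypothetical module unrelated to Margarine's internals. It only shows that both entry points feed the same loop and differ in their starting point.

```elixir
defmodule DenoiseLoopSketch do
  # Toy "denoiser": moves each value halfway toward the target.
  defp denoise_step(latents, target) do
    Enum.zip_with(latents, target, fn x, t -> x + (t - x) * 0.5 end)
  end

  # The single loop shared by both entry points.
  def run(start, target, steps) do
    Enum.reduce(1..steps//1, start, fn _step, latents -> denoise_step(latents, target) end)
  end

  # Deterministic Gaussian noise for a given seed.
  def noise(n, seed) do
    :rand.seed(:exsss, {seed, seed, seed})
    for _ <- 1..n//1, do: :rand.normal()
  end

  # text2img: start from pure noise.
  def text2img(target, steps, seed) do
    run(noise(length(target), seed), target, steps)
  end

  # img2img: start from the init values blended with noise, then run
  # only a strength-sized share of the steps.
  def img2img(init, target, steps, strength, seed) do
    noise = noise(length(init), seed)
    start = Enum.zip_with(init, noise, fn i, n -> (1.0 - strength) * i + strength * n end)
    run(start, target, round(steps * strength))
  end
end

target = [0.5, -0.25, 1.0]
init = [0.2, 0.8, -0.4]

IO.inspect(DenoiseLoopSketch.text2img(target, 4, 123), label: "text2img")
IO.inspect(DenoiseLoopSketch.img2img(init, target, 4, 1.0, 123), label: "img2img 1.0")
IO.inspect(DenoiseLoopSketch.img2img(init, target, 4, 0.0, 123), label: "img2img 0.0")
```

In this toy, strength 1.0 discards the init values entirely and reproduces the text2img result for the same seed, while strength 0.0 runs zero steps and returns the init values unchanged.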
Next Steps
- Experiment with strengths: Try different values to find your sweet spot
- Mix text2img and img2img: Generate base images, then transform them
- Create workflows: Chain multiple transformations
- Integrate in your app: Use in Phoenix, CLI tools, batch processors
Resources
- Documentation: HexDocs
- GitHub: GenericJam/margarine
- SDXL Paper: SDXL: Improving Latent Diffusion Models
License
Margarine is licensed under MIT.
SDXL License: OpenRAIL++ (permissive, free for most uses)
Happy generating! 🎨✨