Generate Photos of Yourself by Training a LoRA for Stable Diffusion Privately on AWS
With the growing adoption of AI image generation, several apps have launched that can generate professional, adventurous, or otherwise novel photos of a subject from just a few example pictures. Have you ever wondered how they work? In this article, we will go over how to create and run the underlying model that powers these apps.
Introduction
Stable Diffusion allows the generation of images from textual prompts. However, the subject you want to generate may not have been part of the original model's training data.
LoRAs were developed as a way to extend existing models for specific items, faces, styles, etc.
In this article, we will review how to train a LoRA to be able to generate images with a subject such as yourself.
Instructions
Step 1 — Select Images
You will need to select at least 20 images for training. You can select more, but beyond a certain point the additional benefit is marginal.
The images should feature the subject’s (aka your) face clearly and have the subject as the main element of the image. For instance, consider the images below. The image on the left has poor lighting and the face is blurry. Unless your objective is to generate images under similar conditions, the image on the right will likely yield better results.
Attempt to select images such that the subject is the only constant between images. That means the background, clothing, objects, etc. should vary from image to image or be described in the captions.
Step 2 — Resize Images
Ideally, you will train your LoRA at an aspect ratio, or even an exact size, that matches or exceeds the one you want to generate your images in.
In this case, we will resize all our training set images to 1024x1024 using Birme.
When resizing, ensure your subject remains in the center of the image and preserve as much as possible of your subject’s face or other features you want associated with that subject.
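If you would rather resize from the command line instead of using Birme, ImageMagick can do the same center-crop and resize in one pass. Below is a minimal sketch, assuming ImageMagick is installed and your photos are JPEGs in the current directory:
# Center-crop and resize every JPEG in the current directory to 1024x1024.
# -resize 1024x1024^ scales the short side to 1024; -extent then crops the center.
mkdir -p resized
mogrify -path resized -resize 1024x1024^ -gravity center -extent 1024x1024 *.jpg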
The images I used in creating this tutorial are featured above. Note that they are not optimal. For a tutorial on how to select the best images, see the video below:
Step 3 — Create Training Instance on AWS
Create an instance on AWS following the instructions below with a few differences:
- Select the g5.2xlarge instance for a faster GPU with more memory.
- Select an OS image with PyTorch 2.0.1 (instead of 2.1.0) installed.
- Make sure the amount of disk space you select will be enough for your training sets, models and generated LoRAs. I recommend at least 100 GB.
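If you prefer scripting over the AWS console, the CLI command below launches an equivalent instance. This is a sketch only: the AMI ID, key pair, and security group are placeholders you must replace with your own values.
# Launch a g5.2xlarge with a 100 GB gp3 root volume.
# Use a Deep Learning AMI that ships PyTorch 2.0.1 for --image-id.
aws ec2 run-instances \
  --instance-type g5.2xlarge \
  --image-id ami-REPLACE_ME \
  --key-name my-key-pair \
  --security-group-ids sg-REPLACE_ME \
  --block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=100,VolumeType=gp3}' \
  --count 1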
Step 4 — Install Training Software
We will be using Kohya’s Stable Diffusion GUI (kohya_ss) for training our LoRA. To install it, run the following sequence of commands while SSHed into your instance:
sudo apt install -y python3.10-venv python3-tk
git clone https://github.com/bmaltais/kohya_ss.git
cd kohya_ss
chmod +x ./setup.sh
./setup.sh
If, at the end of the script, a red warning asks you to configure accelerate, run the commands below and choose the default option for each question:
source venv/bin/activate
accelerate config
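Alternatively, if you want to skip the interactive questions entirely, newer versions of accelerate can write a default configuration in a single command (an optional shortcut, assuming your installed version supports it):
source venv/bin/activate
accelerate config default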
You will now be able to run the training software by executing:
./gui.sh
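The GUI is a web app served on the instance itself, so you need a way to reach it from your local browser. One common approach is an SSH tunnel; the sketch below assumes the default Gradio port 7860 and reuses the SSH host alias that appears later in this article:
# Forward local port 7860 to the instance, then browse to http://localhost:7860
ssh -L 7860:localhost:7860 medium-ai-playground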
Step 5 — Prepare Dataset
At this point, you should have the training software running and your images selected and resized.
In this step, we will caption the dataset and then place it inside the correct folder structure for training.
Transfer the dataset from your local computer to the EC2 server running the training software:
scp -r directory_containing_photos_resized medium-ai-playground:/home/ubuntu/directory_containing_photos_resized
Caption the photos using the Utilities > Captioning > BLIP Captioning option, as shown below.
Create the directory structure for training under LoRA > Training > Dataset Preparation > Dreambooth/LoRA Folder preparation, as shown below.
Select “Prepare training data” and upon completion “Copy info to Folders Tab”.
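For reference, the folder preparation tool lays the dataset out roughly as below. The repeat count, instance prompt, and class prompt shown here are illustrative, not values you must use:
thepaulo_training/
├── img/
│   └── 40_thepaulo person/    # <repeats>_<instance prompt> <class prompt>
│       ├── photo01.jpg
│       └── photo01.txt        # BLIP caption for photo01.jpg
├── log/
└── model/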
Step 6 — Train LoRA
The final step in creating our LoRA is to configure the training parameters.
Under LoRA > Training > Source model, choose the model you want to base your LoRA on, as shown below:
Check that the Folders tab is already filled out from the previous step, as shown below:
We can then configure the parameters for training as shown below. However, it is important to point out that the best parameters for a given dataset require experimentation, so consider these a starting point.
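As a concrete reference point, the values below are common community starting points for a subject LoRA. The key names follow the configuration file the GUI can save, but treat both the names and the numbers as an illustrative sketch to experiment from, not a recipe:
# Illustrative starting values only; tune based on your own experiments.
train_batch_size = 1
epoch = 10
learning_rate = 1e-4
optimizer = "AdamW8bit"
network_dim = 32
network_alpha = 16
mixed_precision = "bf16"
max_resolution = "1024,1024"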
You can then press the orange Start Training button! Upon doing so, the terminal window where you ran the kohya_ss GUI will log the process and a progress bar will appear.
Step 7 — Use your LoRA!
To test the LoRA, transfer the resulting model file to your computer with the command below:
scp medium-ai-playground:/home/ubuntu/thepaulo_training/model/thepaulo-000003.safetensors ~/Downloads/thepaulo-000003.safetensors
Then place your model in the LoRA folder of Automatic1111 (or another inference tool) and use a prompt that references your model, such as the example below:
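In Automatic1111, a LoRA is activated inline in the prompt using the <lora:filename:weight> syntax, where the filename matches the .safetensors file without its extension. A sketch, assuming the model above and that “thepaulo” was used as the instance prompt during dataset preparation:
a professional headshot photo of thepaulo person, studio lighting, sharp focus <lora:thepaulo-000003:0.8>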
Conclusion
We have gone over the process of creating a custom LoRA of a subject from a small collection of photos. This included processing the dataset, spinning up a cloud instance for training, and configuring and running the training itself.
This baseline process should yield reasonable results. For the best results, however, here are a few ways to improve:
- Review all the BLIP-generated captions and enhance them with more detail. Ideally, everything that is not the subject should be described so that it is not baked into the LoRA.
- Increase the number of images.
- Tune training parameters based on experimentation.
Need help setting this up or would like to talk to an AI expert?
Email us at hello@avantsoft.com.br 🚀
For some troubleshooting tips, check the section below.
Troubleshooting
python -m bitsandbytes: libcusparse.so.11: cannot open shared object file: No such file or directory
This issue can be resolved by running the following commands inside the kohya_ss directory:
source venv/bin/activate
pip uninstall torchvision torch
pip install torchvision torch
xFormers wasn’t build with CUDA support
This issue can be resolved by running the following commands inside the kohya_ss directory:
source venv/bin/activate
pip install xformers