Merge pull request #105 from invoke-ai/replace-pokemon-dataset
Replace the lambdalabs/pokemon-blip-captions dataset
RyanJDick authored Apr 12, 2024
2 parents cedefe8 + 5ed6f2d commit bd5da50
Showing 24 changed files with 250 additions and 292 deletions.
5 changes: 3 additions & 2 deletions .gitignore
@@ -1,5 +1,6 @@
-output/
-test_configs/
+/output/
+/test_configs/
+/data/

# pyenv
.python-version
4 changes: 2 additions & 2 deletions README.md
@@ -44,7 +44,7 @@ pip install -e ".[test]" --extra-index-url https://download.pytorch.org/whl/cu12

Run training via the CLI with type-checked YAML configuration files for maximum control:
```bash
-invoke-train --cfg-file src/invoke_training/sample_configs/sd_lora_pokemon_1x8gb.yaml
+invoke-train --cfg-file src/invoke_training/sample_configs/sdxl_textual_inversion_gnome_1x24gb.yaml
```

### GUI
@@ -63,7 +63,7 @@ Training progress can be monitored with [Tensorboard](https://www.tensorflow.org
All trained models are compatible with InvokeAI:

![Screenshot of the InvokeAI UI with an example of a Yoda pokemon generated using a Pokemon LoRA model.](docs/images/invokeai_yoda_pokemon_lora.png)
-*Example image generated with the prompt "A cute yoda pokemon creature." and the trained Pokemon LoRA.*
+*Example image generated with the prompt "A cute yoda pokemon creature." and a trained Pokemon LoRA.*

## Contributing

14 changes: 7 additions & 7 deletions docs/concepts/dataset_formats.md
@@ -2,21 +2,15 @@

`invoke-training` supports the following dataset formats:

- `HF_HUB_IMAGE_CAPTION_DATASET`: A Hugging Face Hub dataset containing images and captions.
- `IMAGE_CAPTION_JSONL_DATASET`: A local image-caption dataset described by a single `.jsonl` file.
- `IMAGE_CAPTION_DIR_DATASET`: A local directory of images with associated `.txt` caption files.
- `IMAGE_DIR_DATASET`: A local directory of images (without captions).
- `HF_HUB_IMAGE_CAPTION_DATASET`: A Hugging Face Hub dataset containing images and captions.

See the documentation for a particular training pipeline to see which dataset formats it supports.

The following sections explain each of these formats in more detail.

-## `HF_HUB_IMAGE_CAPTION_DATASET`
-
-Config documentation: [HFHubImageCaptionDatasetConfig][invoke_training.config.data.dataset_config.HFHubImageCaptionDatasetConfig]
-
-The easiest way to get started with `invoke-training` is to use a publicly available dataset on [Hugging Face Hub](https://huggingface.co/datasets). You can filter for the `Text-to-Image` task to find relevant datasets that contain both an image column and a caption column. [lambdalabs/pokemon-blip-captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) is a popular choice if you're not sure where to start.

## `IMAGE_CAPTION_JSONL_DATASET`

Config documentation: [ImageCaptionJsonlDatasetConfig][invoke_training.config.data.dataset_config.ImageCaptionJsonlDatasetConfig]
@@ -102,3 +96,9 @@ This dataset can be used with the following pipeline dataset configuration:
type: IMAGE_DIR_DATASET
dataset_dir: /path/to/my_custom_dataset
```

+## `HF_HUB_IMAGE_CAPTION_DATASET`
+
+Config documentation: [HFHubImageCaptionDatasetConfig][invoke_training.config.data.dataset_config.HFHubImageCaptionDatasetConfig]
+
+The `HF_HUB_IMAGE_CAPTION_DATASET` dataset format can be used to access publicly available datasets on the [Hugging Face Hub](https://huggingface.co/datasets). You can filter for the `Text-to-Image` task to find relevant datasets that contain both an image column and a caption column. [lambdalabs/pokemon-blip-captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) is a popular choice if you're not sure where to start.
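For orientation, a pipeline's dataset configuration for this format would look roughly like the sketch below. The `dataset_name` field name is an assumption here — check the `HFHubImageCaptionDatasetConfig` documentation linked above for the exact fields.

```yaml
type: HF_HUB_IMAGE_CAPTION_DATASET
# Hub dataset to load; the `dataset_name` field name is assumed — see the config docs for the exact fields.
dataset_name: lambdalabs/pokemon-blip-captions
```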
35 changes: 23 additions & 12 deletions docs/get-started/installation.md
@@ -6,19 +6,25 @@
2. An NVIDIA GPU with >= 8 GB VRAM is recommended for model training.

## Basic Installation

0. Open your terminal and navigate to the directory where you want to clone the `invoke-training` repo.
1. Clone the repo:
```bash
git clone https://github.com/invoke-ai/invoke-training.git
```
-2. (*Optional, but highly recommended*) Create and activate a python [virtual environment](https://docs.python.org/3/library/venv.html#creating-virtual-environments). This creates an isolated environment for `invoke-training` and its dependencies that won't interfere with other python environments on your system, including any installations of the [local Invoke client](https://www.github.com/invoke-ai/invokeai).
+2. Create and activate a python [virtual environment](https://docs.python.org/3/library/venv.html#creating-virtual-environments). This creates an isolated environment for `invoke-training` and its dependencies that won't interfere with other python environments on your system, including any installations of [InvokeAI](https://www.github.com/invoke-ai/invokeai).
```bash
-# Create the new virtual environment in a memorable location by navigating to the folder and running this command
-python -m venv invoketraining
+# Navigate to the invoke-training directory.
+cd invoke-training

-# Activate the new virtual environment
-Windows: .\invoketraining\Scripts\activate
-Linux: source invoketraining/bin/activate
+# Create a new virtual environment named `invoketraining`.
+python -m venv invoketraining
+
+# Activate the new virtual environment.
+# On Windows:
+.\invoketraining\Scripts\activate
+# On MacOS / Linux:
+source invoketraining/bin/activate
```
3. Install `invoke-training` and its dependencies:
```bash
@@ -30,17 +36,22 @@ pip install ".[test]" --extra-index-url https://download.pytorch.org/whl/cu121
```

## Developer Installation

1. Consider forking the repo if you plan to contribute code changes.
2. `git clone` the repo.
-3. (*Optional, but highly recommended*) Create and activate a python [virtual environment](https://docs.python.org/3/library/venv.html#creating-virtual-environments). This creates an isolated environment for `invoke-training` and its dependencies that won't interfere with other python environments on your system, including any installations of the [local Invoke client](https://www.github.com/invoke-ai/invokeai).
+3. Create and activate a python [virtual environment](https://docs.python.org/3/library/venv.html#creating-virtual-environments). This creates an isolated environment for `invoke-training` and its dependencies that won't interfere with other python environments on your system, including any installations of [InvokeAI](https://www.github.com/invoke-ai/invokeai).
```bash
-# Create the new virtual environment in a memorable location by navigating to the folder and running this command
-python -m venv invoketraining
+# Navigate to the invoke-training directory.
+cd invoke-training

-# Activate the new virtual environment
-Windows: .\invoketraining\Scripts\activate
-Linux: source invoketraining/bin/activate
+# Create a new virtual environment named `invoketraining`.
+python -m venv invoketraining
+
+# Activate the new virtual environment.
+# On Windows:
+.\invoketraining\Scripts\activate
+# On MacOS / Linux:
+source invoketraining/bin/activate
```
4. Install `invoke-training` and its dependencies:
```bash
50 changes: 0 additions & 50 deletions docs/get-started/quick-start-cli.md

This file was deleted.

@@ -1,11 +1,12 @@
-# Quick Start - GUI
+# Quick Start

-This page walks through the steps to train your first model with the `invoke-training` GUI.
+`invoke-training` has both a GUI and a CLI (for advanced users). The instructions for getting started with both options can be found on this page.

-There is also a [Quick Start - CLI](./quick-start-cli.md) guide.
+There is also a video introduction to `invoke-training`:

-## Tutorial
<iframe width="560" height="315" src="https://www.youtube.com/embed/OZIz2vvtlM4?si=iR73F0IhlsolyYAl" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

+## Quick Start - GUI
### 1. Installation
Follow the [`invoke-training` installation instructions](./installation.md).
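After installation, the next step is to launch the GUI. As a rough sketch (the `invoke-train-ui` entry point is taken from the project's docs and should be treated as an assumption), that looks like:

```bash
# Launch the invoke-training GUI from the activated virtual environment.
# Command name assumed from the invoke-training docs; adjust if your version differs.
invoke-train-ui
```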

@@ -53,6 +54,10 @@ You can now use your trained Pokemon LoRA in the InvokeAI UI! 🎉
![Screenshot of the InvokeAI UI with an example of a Yoda pokemon generated using a Pokemon LoRA model.](../images/invokeai_yoda_pokemon_lora.png)
*Example image generated with the prompt "A cute yoda pokemon creature." and a Pokemon LoRA.*

## Next Steps

After completing this Quick Start tutorial, we recommend continuing with any of the [full training pipeline tutorials](../tutorials/index.md).
## Quick Start - CLI
### 1. Installation
Follow the [`invoke-training` installation instructions](./installation.md).

### 2. Training
See the [Textual Inversion - SDXL](../tutorials/stable_diffusion/textual_inversion_sdxl.md) tutorial for instructions on how to train a model via the CLI.
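For reference, CLI training amounts to pointing `invoke-train` at a YAML config file. The config path below is the one used in the README example and is purely illustrative:

```bash
# Train via the CLI with one of the bundled sample configs (path taken from the README example).
invoke-train --cfg-file src/invoke_training/sample_configs/sdxl_textual_inversion_gnome_1x24gb.yaml
```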
3 changes: 1 addition & 2 deletions mkdocs.yml
@@ -27,8 +27,7 @@ nav:
  - Welcome: index.md
  - Get Started:
    - get-started/installation.md
-   - get-started/quick-start-cli.md
-   - get-started/quick-start-gui.md
+   - get-started/quick-start.md
  - Tutorials:
    - tutorials/index.md
    - Stable Diffusion:
56 changes: 56 additions & 0 deletions src/invoke_training/sample_configs/sd_lora_baroque_1x8gb.yaml
@@ -0,0 +1,56 @@
# Training mode: Finetuning with LoRA
# Base model: SD 1.5
# Dataset: https://huggingface.co/datasets/InvokeAI/nga-baroque
# GPU: 1 x 8GB

# Instructions:
# 1. Download the dataset from https://huggingface.co/datasets/InvokeAI/nga-baroque.
# 2. Update the `jsonl_path` field in the `data_loader` section to point to the `metadata.jsonl` file of the downloaded
# dataset.

# Notes:
# This config file has been optimized for the primary goal of achieving reasonable results *quickly* for demo purposes.

type: SD_LORA
seed: 1
base_output_dir: output/baroque/sd_lora

optimizer:
  optimizer_type: Prodigy
  learning_rate: 1.0
  weight_decay: 0.01
  use_bias_correction: True
  safeguard_warmup: True

data_loader:
  type: IMAGE_CAPTION_SD_DATA_LOADER
  dataset:
    type: IMAGE_CAPTION_JSONL_DATASET
    # Update the jsonl_path field to point to the metadata.jsonl file of the downloaded dataset.
    jsonl_path: data/nga-baroque/metadata.jsonl
  resolution: 512
  aspect_ratio_buckets:
    target_resolution: 512
    start_dim: 256
    end_dim: 768
    divisible_by: 64
  caption_prefix: "A baroque painting of"
  dataloader_num_workers: 4

# General
model: runwayml/stable-diffusion-v1-5
gradient_accumulation_steps: 1
mixed_precision: fp16
xformers: False
gradient_checkpointing: True

max_train_epochs: 15
save_every_n_epochs: 1
validate_every_n_epochs: 1

max_checkpoints: 5
validation_prompts:
- A baroque painting of a woman carrying a basket of fruit.
- A baroque painting of a cute Yoda creature.
train_batch_size: 4
num_validation_images_per_prompt: 3
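Following the instructions at the top of this config, a typical run might look like the sketch below. The `huggingface-cli download` step is one assumed way to fetch the dataset (it requires the `huggingface_hub` package), and the paths mirror the defaults in the config:

```bash
# 1. Download the nga-baroque dataset to the location referenced by `jsonl_path`
#    (one option; assumes the huggingface_hub CLI is installed).
huggingface-cli download InvokeAI/nga-baroque --repo-type dataset --local-dir data/nga-baroque

# 2. Launch LoRA training with this sample config.
invoke-train --cfg-file src/invoke_training/sample_configs/sd_lora_baroque_1x8gb.yaml
```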
45 changes: 0 additions & 45 deletions src/invoke_training/sample_configs/sd_lora_pokemon_1x8gb.yaml

This file was deleted.

