diff --git a/index.html b/index.html
index a8c0110..3745c3b 100644
--- a/index.html
+++ b/index.html
@@ -231,7 +231,8 @@

the text prompt, thus simplifying the learning of mapping from embeddings to image outputs. Finally, to align the pre-trained Stable Diffusion model (1.4) with the embeddings of our modular encoder, we retrain the conditioning by finetuning the cross-attention weights (2.2).

- method
+ src/imgs/architecture.png
+