A ComfyUI custom node implementation of OmniGen, a powerful all-in-one text-to-image generation and editing model.
- 2024/11/11: Added new features and bug fixes. Refer to update.md for details.
- Text-to-Image Generation
- Image Editing
- Support for Multiple Input Images
- Memory Optimization Options
- Flexible Image Size Control
- Search for ComfyUI-OmniGen in ComfyUI Manager and install it.
- Restart ComfyUI.
- Open a terminal in the ComfyUI/custom_nodes folder.
- Run git clone https://github.com/1038lab/ComfyUI-OmniGen.git
- Restart ComfyUI.
- Open a terminal in the ComfyUI/custom_nodes/ComfyUI-OmniGen folder.
- Run ..\..\..\python_embeded\python.exe -m pip install -r requirements.txt
The node includes automatic downloading of:
- OmniGen code from GitHub repository
- Model weights from Hugging Face
No manual file downloading is required. The node will handle everything automatically on first use.
Important
The first time you use this custom node, it will automatically download the model from Hugging Face. Please be patient, as the download size is approximately 15.5 GB, and it may take some time to complete.
Alternatively, you can download the model manually from Hugging Face:
Download OmniGen-v1 from Hugging Face (Shitao/OmniGen-v1)
After downloading, place the model in the following directory: comfyui/models/LLM/OmniGen-v1
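The manual download described above could be scripted roughly as follows. This is a minimal sketch, not part of the node: it assumes the huggingface_hub package is installed, and the helper names omnigen_model_dir and download_omnigen as well as the comfyui_root argument are hypothetical, introduced only for illustration.

```python
from pathlib import Path


def omnigen_model_dir(comfyui_root):
    """Return the folder this README names for the weights (hypothetical helper)."""
    return Path(comfyui_root) / "models" / "LLM" / "OmniGen-v1"


def download_omnigen(comfyui_root, repo_id="Shitao/OmniGen-v1"):
    """Fetch the model snapshot into the expected folder (roughly 15.5 GB)."""
    # Imported lazily so the path helper works even without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = omnigen_model_dir(comfyui_root)
    target.mkdir(parents=True, exist_ok=True)
    # Assumption: a plain snapshot of the repository is what the node expects.
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling download_omnigen("path/to/comfyui") would start the full download, so it is worth checking free disk space first.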
Simple usage for text-to-image and image-to-image. Workflow
Generate an image that combines 2 images. Workflow
Following the pose of this image image_1, generate a new photo: A viking old man standing.
Generate a depth map from the input image and create a new image based on the depth map. Workflow
You can reference input images in your prompt using any of these formats:
- <img><|image_1|></img>, <img><|image_2|></img>, <img><|image_3|></img>
- image_1, image_2, image_3
- image1, image2, image3
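To make the relationship between the three formats concrete, here is a small sketch that rewrites the short forms into the explicit tag form. It is an assumption about how such references could be normalized, not the node's actual parsing code; the function name expand_image_refs is hypothetical.

```python
import re

# Matches image_1 or image1, but skips references already inside <|image_N|>.
_SHORT_REF = re.compile(r"(?<![|\w])image_?(\d+)(?![|\w])")


def expand_image_refs(prompt):
    """Rewrite short image references to the explicit <img><|image_N|></img> form."""
    return _SHORT_REF.sub(
        lambda m: f"<img><|image_{m.group(1)}|></img>", prompt
    )


if __name__ == "__main__":
    print(expand_image_refs("Following the pose of this image image_1, add image2."))
```

The bare word "image" is left alone because the pattern requires a trailing number, so ordinary prose in the prompt is not affected.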
The node will automatically download required files on first use:
- OmniGen code from GitHub
- Model weights from Hugging Face (Shitao/OmniGen-v1)
- prompt: Text description of the desired image
- num_inference_steps: Number of denoising steps (default: 50)
- guidance_scale: Text guidance scale (default: 2.5)
- img_guidance_scale: Image guidance scale (default: 1.6)
- max_input_image_size: Maximum size for input images (default: 1024)
- width/height: Output image dimensions (default: 1024x1024)
- seed: Random seed for reproducibility

- separate_cfg_infer: Separate inference process for different guidance (default: True)
- offload_model: Offload model to CPU to reduce memory usage (default: True)
- use_input_image_size_as_output: Match output size to input image (default: False)
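The parameters listed above and their defaults can be summarized in code. The sketch below is illustrative only: the OmniGenSettings class and its output_size method are hypothetical and do not exist in the node, and the seed default of 0 is an assumption because no default seed is documented here.

```python
from dataclasses import dataclass, asdict


@dataclass
class OmniGenSettings:
    """Hypothetical container mirroring the documented node parameters."""
    prompt: str
    num_inference_steps: int = 50
    guidance_scale: float = 2.5
    img_guidance_scale: float = 1.6
    max_input_image_size: int = 1024
    width: int = 1024
    height: int = 1024
    seed: int = 0  # assumption: the README gives no default seed
    separate_cfg_infer: bool = True
    offload_model: bool = True
    use_input_image_size_as_output: bool = False

    def output_size(self, input_size=None):
        """Return (width, height), following the input image when requested."""
        if self.use_input_image_size_as_output and input_size is not None:
            return input_size
        return (self.width, self.height)


if __name__ == "__main__":
    settings = OmniGenSettings(prompt="A viking old man standing.")
    print(asdict(settings))
```

Keeping separate_cfg_infer and offload_model at their default of True matches the memory-saving defaults described above.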
- Model Weights: Shitao/OmniGen-v1