SCEPTER is an open-source framework used for training, fine-tuning, and inference with generative models.

modelscope/scepter

🪄SCEPTER

🪄SCEPTER is an open-source code repository dedicated to generative training, fine-tuning, and inference, encompassing a suite of downstream tasks such as image generation, transfer, and editing. SCEPTER integrates popular community-driven implementations as well as proprietary methods by Tongyi Lab of Alibaba Group, offering a comprehensive toolkit for researchers and practitioners in the field of AIGC. The library is designed to facilitate innovation and accelerate development in the rapidly evolving domain of generative models.

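SCEPTER offers 3 core components, each described in its own section of this document: Generative Framework, Popular Approaches, and SCEPTER Studio.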

🎉 News

  • [🔥🔥🔥2024.11]: We're excited to announce the upcoming release of the ACE-0.6b-1024px model, which significantly enhances image generation quality compared with ACE-0.6b-512px. Detailed documentation can be found in the ACE repo. In addition, ACE's editing results can be refined further by applying the text-to-image capabilities of the FLUX-dev model through SDEdit as an image quality refiner.
  • [🔥2024.11]: Supports video files, video annotation, and caption translation in data management, as well as inference and training of CogVideoX.
  • [2024.10]: We are pleased to announce the release of the code for ACE, supporting Customized Training / Comfy UI Workflow / gradio-based ChatBot Interface.
  • [2024.10]: Support for inference and tuning with FLUX, as well as for building ComfyUI workflows using this framework.
  • [2024.09]: We introduce ACE, an All-round Creator and Editor adept at executing a diverse array of image editing tasks tailored to your specifications. Built upon the cutting-edge Diffusion Transformer architecture, ACE has been extensively trained on a comprehensive dataset to seamlessly interpret and execute any natural language instruction. For further information, please consult the project page.
  • [2024.07]: Support the inference and training of open-source generative models based on the DiT architecture, such as SD3 and PixArt.
  • [2024.05]: Introducing SCEPTER v1, supporting customized image edit tasks! Simply provide 10 image pairs, and SCEPTER will tune an edit tuner for your own Image-to-Image tasks, like Clay Style, De-Text, Segmentation, etc.
  • [2024.04]: New StyleBooth demo on SCEPTER Studio for Text-Based Style Editing.
  • [2024.03]: We optimize the training UI and checkpoint management. A new LAR-Gen model has been added to SCEPTER Studio, supporting zoom-out, virtual try-on, and inpainting.
  • [2024.02]: We release new SCEdit controllable image synthesis models for SD v2.1 and SD XL. Multiple strategies are applied to accelerate inference in SCEPTER Studio.
  • [2024.01]: We release SCEPTER Studio, an integrated toolkit for data management, model training and inference based on Gradio.
  • [2024.01]: SCEdit supports controllable image synthesis for training and inference.
  • [2023.12]: We propose SCEdit, an efficient and controllable generation framework.
  • [2023.12]: We release 🪄SCEPTER library.

🪄ACE

ACE is a unified foundational model framework that supports a wide range of visual generation tasks. By defining the Condition Unit (CU) to unify multi-modal inputs across different tasks and incorporating the long-context CU, ACE introduces historical contextual information into visual generation tasks, paving the way for ChatGPT-like dialog systems in visual generation.

Watch the demo

ACE Models

| Model | Status |
| --- | --- |
| ACE-0.6B-512px | Demo link, ModelScope link, HuggingFace link |
| ACE-0.6B-1024px | Demo link, ModelScope link, HuggingFace link |
| ACE-12B-FLUX-dev | Coming Soon |

ACE Training

We offer a demonstration training YAML that enables the end-to-end training of ACE using a toy dataset. For a comprehensive overview of the hyperparameter configurations, please consult scepter/methods/edit/dit_ace_0.6b_512.yaml.

Prepare datasets

Please find the dataset class located in scepter/modules/data/dataset/ms_dataset.py, designed to facilitate end-to-end training using an open-source toy dataset. Download a dataset zip file from modelscope, and then extract its contents into the cache/datasets/ directory.
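
The extraction step above can be sketched in Python as follows. This is a minimal illustration, not part of SCEPTER: the helper name `extract_dataset` and the archive filename in the usage note are assumptions, and only the `cache/datasets/` target directory comes from the text above.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, target_dir="cache/datasets"):
    """Extract a downloaded dataset archive into target_dir.

    Hypothetical helper for illustration; returns the member names
    that were extracted so the caller can inspect the layout.
    """
    target = Path(target_dir)
    # Create cache/datasets/ (and any parents) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

For example, `extract_dataset("toy_dataset.zip")` would unpack a locally downloaded archive (the filename is a placeholder) into `cache/datasets/` relative to the current working directory.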

Should you wish to prepare your own datasets, we recommend consulting scepter/modules/data/dataset/ms_dataset.py for detailed guidance on the required data format.

Prepare initial weight

The ACE checkpoint has been uploaded to both the ModelScope and HuggingFace platforms.

In the provided training YAML configuration, the ModelScope URL is the default checkpoint URL. To switch to Hugging Face, modify the PRETRAINED_MODEL value in the YAML file, replacing the prefix "ms://iic" with "hf://scepter-studio".
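
The prefix swap can also be done programmatically. The sketch below treats the YAML as plain text and rewrites only lines whose key is PRETRAINED_MODEL; the function name `switch_checkpoint_source` is an assumption made for illustration, and the model path in the usage note is a placeholder rather than a real checkpoint location.

```python
MS_PREFIX = "ms://iic"
HF_PREFIX = "hf://scepter-studio"


def switch_checkpoint_source(yaml_text):
    """Return yaml_text with PRETRAINED_MODEL URLs moved from ModelScope to Hugging Face.

    Hypothetical helper: a line-based rewrite that leaves every other
    key, comment, and indentation untouched.
    """
    rewritten = []
    for line in yaml_text.splitlines(keepends=True):
        # Only touch the PRETRAINED_MODEL key, not other ms:// references.
        if line.lstrip().startswith("PRETRAINED_MODEL:") and MS_PREFIX in line:
            line = line.replace(MS_PREFIX, HF_PREFIX, 1)
        rewritten.append(line)
    return "".join(rewritten)
```

Given a line such as `PRETRAINED_MODEL: ms://iic/<model path>`, the function returns the same line with `hf://scepter-studio/<model path>`. Working on text rather than a parsed YAML tree is a deliberate choice here so that comments and formatting in the configuration file survive unchanged.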

Start training

You can start the training procedure by executing one of the following commands:

# ACE-0.6B-512px
PYTHONPATH=. python scepter/tools/run_train.py --cfg scepter/methods/edit/dit_ace_0.6b_512.yaml
# ACE-0.6B-1024px
PYTHONPATH=. python scepter/tools/run_train.py --cfg scepter/methods/edit/dit_ace_0.6b_1024.yaml
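
If you prefer to launch training from a Python script, the two commands above can be wrapped as in the following sketch. The script and configuration paths are taken from the commands above; the helper names `build_train_command` and `run_training` are assumptions, and the sketch assumes it is run with the same interpreter that has SCEPTER's dependencies installed.

```python
import os
import subprocess
import sys

# Config paths copied from the shell commands above.
CONFIGS = {
    "512": "scepter/methods/edit/dit_ace_0.6b_512.yaml",
    "1024": "scepter/methods/edit/dit_ace_0.6b_1024.yaml",
}


def build_train_command(resolution):
    """Return the argv list equivalent to the shell command for one ACE-0.6B variant."""
    if resolution not in CONFIGS:
        raise ValueError(f"unknown resolution {resolution!r}; expected one of {sorted(CONFIGS)}")
    return [sys.executable, "scepter/tools/run_train.py", "--cfg", CONFIGS[resolution]]


def run_training(resolution, repo_root="."):
    """Run training from repo_root with PYTHONPATH=. as in the shell command."""
    env = dict(os.environ, PYTHONPATH=".")
    return subprocess.run(build_train_command(resolution), cwd=repo_root, env=env, check=True)
```

Calling `run_training("512")` from the repository root mirrors the first shell command; `check=True` raises an exception if the training process exits with a non-zero status.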

ACE Chat Bot

We have developed a chatbot interface utilizing Gradio, designed to convert user input in natural language into visually captivating images that align semantically with the specified instructions. You can easily access this functionality by launching Scepter Studio with the following command:

PYTHONPATH=. python scepter/tools/webui.py --cfg scepter/methods/studio/scepter_ui.yaml --language zh --tab chatbot

Upon starting, you will find a "ChatBot" tab within the Gradio application, which serves as a chat-based interface to handle any requests related to image editing or generation.

ACE ComfyUI Workflow

Workflow

ACE Workflow Examples
Control Semantic Element

🖼 Gallery for Recent Works

FLUX Tuners

Yarn Style Soft Watercolor Style
Travel Style WuKong Style

ComfyUI Workflow

Workflow

Example Workflow Case
Base +Mantra +Tuner +Control

🛠️ Installation

  • Create new environment with conda command:
conda env create -f environment.yaml
conda activate scepter
  • Install with pip command:

We recommend installing specific versions of PyTorch and the acceleration toolbox xFormers. You can install the recommended versions with pip:

pip install -r requirements/recommended.txt
pip install scepter
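
After installation, a quick way to confirm which versions ended up in the environment is to query the package metadata. The sketch below uses only the Python standard library; the helper name `installed_versions` is an assumption, and the default package list simply reflects the packages mentioned above.

```python
from importlib import metadata


def installed_versions(packages=("torch", "xformers", "scepter")):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the printed versions with the pins in requirements/recommended.txt shows whether the environment matches the recommended setup.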

🧩 Generative Framework

Tutorials

| Documentation | Key Features |
| --- | --- |
| Train | DDP / FSDP / FairScale / Xformers |
| Inference | Dynamic load/unload |
| Dataset Management | Local / Http / OSS / Modelscope |

📝 Popular Approaches

Currently supported approaches

| Tasks | Methods | Links |
| --- | --- | --- |
| Text-to-image Generation | SD v1.5 | Hugging Face Repo |
| Text-to-image Generation | SD v2.1 | Hugging Face Repo |
| Text-to-image Generation | SD-XL | Hugging Face Repo |
| Text-to-image Generation | FLUX | Hugging Face Repo |
| Efficient Tuning | LoRA | Arxiv link |
| Efficient Tuning | Res-Tuning (NeurIPS23) | Arxiv link, Page link |
| Controllable Image Synthesis | 🌟SCEdit (CVPR24) | Arxiv link, Page link |
| Image Editing | 🌟LAR-Gen | Arxiv link, Page link |
| Image Editing | 🌟StyleBooth | Arxiv link, Page link |
| Image Generation and Editing | 🌟ACE | Arxiv link, Page link, Demo link, ModelScope link, HuggingFace link |

🖥️ SCEPTER Studio

Launch

To fully experience SCEPTER Studio, run the following commands:

pip install scepter
python -m scepter.tools.webui

Or run it after cloning the repository:

git clone https://github.com/modelscope/scepter.git
cd scepter
PYTHONPATH=. python scepter/tools/webui.py --cfg scepter/methods/studio/scepter_ui.yaml

SCEPTER Studio does not require you to download and organize models manually; on startup it automatically downloads the corresponding models and stores them in a local directory. Depending on network and hardware conditions, the initial startup usually takes 15-60 minutes, most of which is spent downloading and processing the SDv1.5, SDv2.1, and SDXL models. Subsequent startups are much faster (about one minute) because the download is no longer required.

Usage Demo

Image Editing | Training | Model Sharing | Model Inference | Data Management

Modelscope Studio & Huggingface Space

We deploy a work studio on ModelScope and Hugging Face that includes only the inference tab; please refer to ms_scepter_studio and hf_scepter_studio.

⚙️️ ComfyUI Workflow

We support the use of all models in the ComfyUI Workflow through the following methods:

  1. Automatic installation directly via the ComfyUI Manager by searching for the ComfyUI-Scepter node.
  2. Manually install by moving custom_nodes from Scepter to ComfyUI.
git clone https://github.com/modelscope/scepter.git
cd path/to/scepter
pip install -e .
cp -r path/to/scepter/workflow/ path/to/ComfyUI/custom_nodes/ComfyUI-Scepter
cd path/to/ComfyUI
python main.py

Note: You can use the nodes by dragging the sample images into ComfyUI. Additionally, our nodes can automatically pull models from ModelScope or HuggingFace by selecting the model_source field, or you can place the already downloaded models in a local path.
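
The copy step of the manual installation can be sketched in Python as follows. The source directory `workflow` and the target `custom_nodes/ComfyUI-Scepter` come from the commands above; the helper name `install_comfyui_nodes` is an assumption. Unlike `cp -r`, this sketch always copies the contents of workflow/ directly into the target directory, even when the target already exists (this relies on `dirs_exist_ok`, available in Python 3.8 and later).

```python
import shutil
from pathlib import Path


def install_comfyui_nodes(scepter_root, comfyui_root):
    """Copy SCEPTER's workflow/ directory to ComfyUI/custom_nodes/ComfyUI-Scepter.

    Hypothetical helper for illustration; returns the target path.
    """
    source = Path(scepter_root) / "workflow"
    if not source.is_dir():
        raise FileNotFoundError(f"workflow directory not found: {source}")
    target = Path(comfyui_root) / "custom_nodes" / "ComfyUI-Scepter"
    # Merge into an existing target instead of failing if it is already there.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

After calling `install_comfyui_nodes("path/to/scepter", "path/to/ComfyUI")`, start ComfyUI with `python main.py` as in the steps above. The `pip install -e .` step is still required so that the nodes can import the scepter package.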

🔍 Learn More

  • Alibaba TongYi Vision Intelligence Lab

    Discover more about open-source projects on image generation, video generation, and editing tasks.

  • ModelScope library

    ModelScope Library is the model library of the ModelScope project, which contains a large number of popular models.

  • SWIFT library

    SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible framework designed to facilitate lightweight model fine-tuning and inference.

BibTeX

If our work is useful for your research, please consider citing:

@misc{scepter,
    title = {SCEPTER, https://github.com/modelscope/scepter},
    author = {SCEPTER},
    year = {2023}
}

License

This project is licensed under the Apache License (Version 2.0).

Acknowledgement

Thanks to Stability-AI, SWIFT library, Fooocus and ComfyUI for their awesome work.