Releases · XPixelGroup/DiffBIR
DiffBIR v2.1
News 📰
- A new model trained on the Unsplash dataset with captions generated by LLaVA v1.5. It is based on SD2.1-zsnr and offers better reconstruction quality and a stronger understanding of prompts. Please follow the README to give it a try 😄!
- New samplers have been added, allowing results to be obtained in as few as 10 steps. The new samplers include ddim, dpm-solver, and various edm samplers from k-diffusion.
- Two captioners are available to automatically generate prompts: LLaVA and RAM.
- Noise augmentation has been added. You can now enhance the model's creativity by adding noise to the condition.
- Supports fp16/bf16 inference and batch processing.
- Supports tiled inference for the stage-1 model, and integrates Tiled VAE to fix the brightness changes seen with a naively-implemented tiled VAE.
- A Gradio demo is provided, allowing you to quickly experience all the updates mentioned above!
- Fixes a bug in the last version with the '--better_start' option: the input image for cldm.vae_encode should be in the range [-1, 1], not [0, 1] (see the sketch below).
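For clarity, the sketch below shows the expected range conversion: a [0, 1] image is mapped to [-1, 1] before it is passed to the VAE encoder. The helper name is ours and the usage line is only illustrative; `cldm.vae_encode` is the function named in the fix above.

```python
import torch

def to_vae_range(image: torch.Tensor) -> torch.Tensor:
    """Map an image tensor from [0, 1] to [-1, 1], the range expected by the VAE encoder."""
    return image * 2.0 - 1.0

# Hypothetical usage on the fixed '--better_start' path:
# z_0 = cldm.vae_encode(to_vae_range(clean_image))
```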
With tiled inference, DiffBIR v2.1 can run on graphics cards with 8GB of VRAM. If you encounter any issues, feel free to open an issue. We hope you enjoy using it :D
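For readers wondering how tiled inference keeps memory bounded, here is a minimal, generic sketch: the image is processed in overlapping tiles and the overlapping regions are averaged. This is not DiffBIR's exact implementation (which also integrates Tiled VAE); the function names and tile sizes are illustrative.

```python
import torch

def _starts(size: int, tile: int, stride: int) -> list[int]:
    # Tile start offsets covering [0, size), with the last tile snapped to the edge.
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)
    return starts

@torch.no_grad()
def tiled_apply(model, x: torch.Tensor, tile: int = 512, stride: int = 448) -> torch.Tensor:
    """Run `model` on overlapping tiles of x (N, C, H, W) and blend overlaps by
    averaging, so the full-resolution image never has to fit in VRAM at once."""
    _, _, h, w = x.shape
    out = torch.zeros_like(x)
    weight = torch.zeros(1, 1, h, w, device=x.device, dtype=x.dtype)
    for top in _starts(h, tile, stride):
        for left in _starts(w, tile, stride):
            bottom, right = min(top + tile, h), min(left + tile, w)
            out[:, :, top:bottom, left:right] += model(x[:, :, top:bottom, left:right])
            weight[:, :, top:bottom, left:right] += 1
    return out / weight
```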
TODO
- Support multi-GPU inference.
- A more stable online application.
- ComfyUI, replicate, diffusers, etc.
- Continue to optimize the model's performance while keeping the model architecture unchanged.
DiffBIR v2
News About the Updated Manuscript 📖
- Rename LAControlNet (a little confusing 😆) to IRControlNet.
- We train IRControlNet on our filtered laion2b-en dataset, which contains around 15M high-quality images. The pretrained weights are now available.
- We compare IRControlNet with 6 model variants and find that IRControlNet is good enough to serve as the backbone of the generation module.
- We support three BIR tasks: BSR, BFR, and BID (Blind Image Denoising), and all tasks share the same IRControlNet. Visual examples can be found here.
- During inference, we directly use off-the-shelf restoration models for degradation removal. More details can be found here.
- We propose region-aware restoration guidance to better balance quality and fidelity.
News About Code Base 👨‍💻
Keep it simple and stupid.
- Free from the PyTorch Lightning and LDM code bases. The code has now been rearranged to be as simple as possible.
- Lightning modules have been deleted.
- Put all model-related code (UNet, VAE, CLIP, etc.) into a single directory.
- Provide two minimal training scripts for the stage-1 and stage-2 models, built on accelerate in the simplest training-loop style.
- Upgrade PyTorch to 2.2.2 for 1) built-in SDP attention and 2) torch.compile (a short sketch of both follows this list).
- Copy the CLIP-related code from open-clip, so users in China no longer see the failed-connection warning from Hugging Face.
- Only save the parameters of IRControlNet, which reduces the checkpoint size from 9 GB to 1 GB (a sketch of this filtering follows below).
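As a hedged illustration of the two PyTorch 2.2.2 features mentioned above, the snippet below calls the built-in scaled dot-product attention and compiles a small module with torch.compile. The tensor shapes and the `TinyBlock` module are made up for the example and are not part of DiffBIR.

```python
import torch
import torch.nn.functional as F

# Built-in SDP attention (PyTorch 2.x): picks a fused kernel when one is available.
q = torch.randn(1, 8, 1024, 64)
k, v = torch.randn_like(q), torch.randn_like(q)
attn_out = F.scaled_dot_product_attention(q, k, v)

# torch.compile: wrap a module once and reuse the compiled version.
class TinyBlock(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.proj = torch.nn.Linear(64, 64)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.proj(x))

block = torch.compile(TinyBlock())
y = block(torch.randn(4, 64))
```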
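And a minimal sketch of the checkpoint filtering, assuming the IRControlNet parameters live under a hypothetical `controlnet.` key prefix in the full model's state dict (the real prefix in the repository may differ).

```python
import torch
from torch import nn

def save_controlnet_only(model: nn.Module, path: str, prefix: str = "controlnet.") -> None:
    """Save only the parameters whose keys start with `prefix` (assumed to be
    the IRControlNet submodule), instead of the full multi-GB state dict."""
    state = {k: v for k, v in model.state_dict().items() if k.startswith(prefix)}
    torch.save(state, path)

def load_controlnet_only(model: nn.Module, path: str) -> None:
    """Restore just those weights into a freshly built full model."""
    model.load_state_dict(torch.load(path, map_location="cpu"), strict=False)
```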