Unofficial implementation of MIMO (MImicking anyone anywhere with complex Motions and Object interactions)
This repository offers a full training and inference pipeline for transforming character appearance and motion in videos. It is a video-to-video generation framework: the character in a video can be modified dynamically using optional inputs such as an avatar photo and/or 3D animations.
demo.webm
Tests were made using:
- CUDA 12.2
- Python 3.10.12
- torch 2.4.1
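A quick way to confirm your environment matches the tested versions (a minimal sketch assuming a standard PyTorch install):

```python
# check_env.py -- print the versions this repository was tested against
import sys
import torch

print("Python:", sys.version.split()[0])          # tested with 3.10.12
print("torch:", torch.__version__)                # tested with 2.4.1
print("CUDA (torch build):", torch.version.cuda)  # tested with CUDA 12.2
print("CUDA available:", torch.cuda.is_available())
```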
```bash
git clone git@github.com:antoinedelplace/MIMO-unofficial.git
cd MIMO-unofficial/
python3 -m venv venv
source venv/bin/activate
pip install torch torchvision torchaudio
pip install -r requirements.txt
```
See `configs/paths.py`. The expected folder structure is:

```
├── code
│   ├── MIMO-unofficial
│   ├── AnimateAnyone
│   ├── Depth-Anything-V2
│   ├── 4D-Humans
│   ├── PHALP
│   ├── detectron2
│   ├── sam2
│   ├── nvdiffrast
│   └── ProPainter
├── checkpoints
└── data
```
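For reference, here is a minimal sketch of what the path configuration could look like (the constant names below are assumptions for illustration, not the repository's actual variables):

```python
# configs/paths.py -- hypothetical sketch; adapt the names to the actual file
import os

BASE_DIR = os.path.expanduser("~")                       # parent of code/, checkpoints/ and data/
CODE_DIR = os.path.join(BASE_DIR, "code")                # cloned repositories
CHECKPOINTS_DIR = os.path.join(BASE_DIR, "checkpoints")  # downloaded model weights
DATA_DIR = os.path.join(BASE_DIR, "data")                # videos and preprocessing outputs

ANIMATE_ANYONE_DIR = os.path.join(CODE_DIR, "AnimateAnyone")
DEPTH_ANYTHING_DIR = os.path.join(CODE_DIR, "Depth-Anything-V2")
```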
Here are the checkpoints to download:
Some checkpoints are automatically downloaded from 🤗 Hugging Face but require manual acceptance of the terms and conditions. Accept the terms with your 🤗 Hugging Face account, then log in on the server with `huggingface-cli login`. Here are the gated models:
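Alternatively, you can authenticate from Python with the `huggingface_hub` package (a sketch; it assumes you have already created an access token on the Hugging Face website):

```python
# login_hf.py -- programmatic equivalent of `huggingface-cli login`
from huggingface_hub import login

# Paste a token created at https://huggingface.co/settings/tokens,
# or call login() with no arguments for an interactive prompt.
login(token="hf_xxx")  # placeholder token, do not commit a real one
```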
For `mimo/dataset_preprocessing/pose_estimation_4DH.py`:
- Download the SMPL model `basicModel_neutral_lbs_10_207_0_v1.0.0.pkl` and put it in `MIMO-unofficial` and in `MIMO-unofficial/data` (see the sanity-check sketch after this list).
- Uninstall the `phalp` and `hmr2` pip packages so that the cloned repositories are used instead: `pip uninstall phalp hmr2`
- The renderer needs to be disabled to avoid OpenGL errors: in `4D-Humans/hmr2/models/__init__.py`, line 84, use `model = HMR2.load_from_checkpoint(checkpoint_path, strict=False, cfg=model_cfg, init_renderer=False)`
- Remove automatic file saving to speed up inference: in `PHALP/phalp/trackers/PHALP.py`, line 264, remove `joblib.dump(final_visuals_dic, pkl_path, compress=3)`
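A minimal sanity-check sketch (assuming the folder layout above and run from the `code` directory) to verify that the SMPL model is in both expected locations:

```python
# check_smpl.py -- hypothetical helper; adjust the paths to your setup
import os

SMPL_FILE = "basicModel_neutral_lbs_10_207_0_v1.0.0.pkl"
for folder in ("MIMO-unofficial", "MIMO-unofficial/data"):
    path = os.path.join(folder, SMPL_FILE)
    status = "found" if os.path.isfile(path) else "MISSING"
    print(f"{path}: {status}")
```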
For `mimo/dataset_preprocessing/get_apose_ref.py`:
- If you need DWPose to extract the 2D pose from an image:
  ```bash
  pip install -U openmim
  mim install mmengine
  mim install "mmcv>=2.0.1"
  mim install "mmdet>=3.1.0"
  mim install "mmpose>=1.1.0"
  ```
- In `AnimateAnyone/src/models/unet_2d_blocks.py`, line 9, use `from diffusers.models.transformers.dual_transformer_2d import DualTransformer2DModel` (a version-tolerant variant is sketched right after this list).
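The `DualTransformer2DModel` import path has moved between diffusers releases; if you need to support both layouts, a version-tolerant variant of the line-9 patch (an assumption about which diffusers versions you may encounter) is:

```python
# AnimateAnyone/src/models/unet_2d_blocks.py, line 9 -- version-tolerant import
try:
    # newer diffusers releases expose the module under models/transformers/
    from diffusers.models.transformers.dual_transformer_2d import DualTransformer2DModel
except ImportError:
    # fall back to the older module layout
    from diffusers.models.dual_transformer_2d import DualTransformer2DModel
```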
For `mimo/training/main.py`:
- In `AnimateAnyone/src/models/mutual_self_attention.py`, line 48:
  ```python
  self.register_reference_hooks(
      mode, do_classifier_free_guidance, attention_auto_machine_weight,
      gn_auto_machine_weight, style_fidelity, reference_attn, reference_adain,
      batch_size=batch_size, fusion_blocks=fusion_blocks,
  )
  ```
- In `AnimateAnyone/src/models/unet_2d_blocks.py`, line 9, use `from diffusers.models.transformers.dual_transformer_2d import DualTransformer2DModel` (same fix as above).
To preprocess the training data, run the following scripts:

```bash
python mimo/dataset_preprocessing/video_sampling_resizing.py
python mimo/dataset_preprocessing/remove_duplicate_videos.py
python mimo/dataset_preprocessing/human_detection_detectron2.py
python mimo/dataset_preprocessing/depth_estimation.py
python mimo/dataset_preprocessing/video_tracking_sam2.py
python mimo/dataset_preprocessing/video_inpainting.py
python mimo/dataset_preprocessing/get_apose_ref.py
python mimo/dataset_preprocessing/upscale_apose_ref.py
python mimo/dataset_preprocessing/vae_encoding.py
python mimo/dataset_preprocessing/clip_embedding.py
python mimo/dataset_preprocessing/pose_estimation_4DH.py
python mimo/dataset_preprocessing/rasterizer_2d_joints.py
```
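To run the whole preprocessing chain unattended, a minimal driver sketch (assuming the scripts are meant to be executed in the order listed above and take no extra command-line arguments) could look like:

```python
# run_preprocessing.py -- hypothetical driver; stops at the first failing step
import subprocess

STEPS = [
    "mimo/dataset_preprocessing/video_sampling_resizing.py",
    "mimo/dataset_preprocessing/remove_duplicate_videos.py",
    "mimo/dataset_preprocessing/human_detection_detectron2.py",
    "mimo/dataset_preprocessing/depth_estimation.py",
    "mimo/dataset_preprocessing/video_tracking_sam2.py",
    "mimo/dataset_preprocessing/video_inpainting.py",
    "mimo/dataset_preprocessing/get_apose_ref.py",
    "mimo/dataset_preprocessing/upscale_apose_ref.py",
    "mimo/dataset_preprocessing/vae_encoding.py",
    "mimo/dataset_preprocessing/clip_embedding.py",
    "mimo/dataset_preprocessing/pose_estimation_4DH.py",
    "mimo/dataset_preprocessing/rasterizer_2d_joints.py",
]

for script in STEPS:
    print(f"=== Running {script} ===")
    subprocess.run(["python", script], check=True)  # raise if a step fails
```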
For inference, run `accelerate config` and select:
- no distributed training
- NUMA efficiency
- fp16

Then launch:
```bash
accelerate launch mimo/inference/main.py -i input_video.mp4
```
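To process several input videos in one go, a simple wrapper around the documented command (a sketch; the input folder name is an assumption) could be:

```python
# batch_inference.py -- hypothetical wrapper around `accelerate launch mimo/inference/main.py`
import glob
import subprocess

for video in sorted(glob.glob("input_videos/*.mp4")):  # assumed input folder
    print(f"=== Processing {video} ===")
    subprocess.run(
        ["accelerate", "launch", "mimo/inference/main.py", "-i", video],
        check=True,
    )
```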
For training, run `accelerate config` and select:
- multi-GPU
- NUMA efficiency
- fp16

Then launch the two training phases:
```bash
accelerate launch mimo/training/main.py -c 1540 -t ./mimo/configs/training/cfg_phase1.yaml
accelerate launch mimo/training/main.py -c 1540 -t ./mimo/configs/training/cfg_phase2.yaml
```
This project is based on novitalabs/AnimateAnyone and MooreThreads/Moore-AnimateAnyone, which are licensed under the Apache License 2.0. We thank the authors of MIMO, novitalabs/AnimateAnyone, and MooreThreads/Moore-AnimateAnyone for their open research and exploration.