MIMO-unofficial

Unofficial implementation of MIMO (MImicking anyone anywhere with complex Motions and Object interactions)

My blog post: Medium
Original paper: arXiv

🎯 Overview

This repository offers a complete training and inference pipeline for transforming character appearance and motion in videos. As a video-to-video generation framework, it modifies the character in a video from optional inputs: an avatar photo and/or a 3D animation.

Demo video: demo.webm

⚒️ Installation and Setup

Tested with:

CUDA 12.2
Python 3.10.12
torch 2.4.1

Clone the repository and install requirements

git clone git@github.com:antoinedelplace/MIMO-unofficial.git
cd MIMO-unofficial/
python3 -m venv venv
source venv/bin/activate
pip install torch torchvision torchaudio
pip install -r requirements.txt
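
A quick sanity check against the tested versions above can be run after the install. This is a minimal sketch: it only prints what it finds so you can compare with the versions listed.

```python
# check_env.py -- print the installed versions to compare with the tested setup
import sys
import torch

print(f"Python: {sys.version.split()[0]}")            # tested: 3.10.12
print(f"torch:  {torch.__version__}")                 # tested: 2.4.1
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA runtime: {torch.version.cuda}")      # tested: 12.2
    print(f"GPU: {torch.cuda.get_device_name(0)}")
```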

Folder architecture

The expected layout is defined in configs/paths.py:

├── code
│   ├── MIMO-unofficial
│   ├── AnimateAnyone
│   ├── Depth-Anything-V2
│   ├── 4D-Humans
│   ├── PHALP
│   ├── detectron2
│   ├── sam2
│   ├── nvdiffrast
│   └── ProPainter
├── checkpoints
└── data
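
As an illustration, configs/paths.py might centralize this layout roughly as follows. This is a sketch, not the repo's actual file: the variable names are assumptions, and it assumes the file lives at MIMO-unofficial/mimo/configs/paths.py.

```python
# configs/paths.py (illustrative sketch -- variable names are assumptions;
# see the actual file in the repo for the real ones)
import os

# mimo/configs/paths.py -> three levels up is the code/ folder
CODE_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..", ".."))
ROOT_DIR = os.path.dirname(CODE_DIR)  # parent holding code/, checkpoints/, data/

CHECKPOINTS_DIR = os.path.join(ROOT_DIR, "checkpoints")
DATA_DIR = os.path.join(ROOT_DIR, "data")

# Sibling repositories cloned next to MIMO-unofficial
ANIMATE_ANYONE_DIR = os.path.join(CODE_DIR, "AnimateAnyone")
DEPTH_ANYTHING_DIR = os.path.join(CODE_DIR, "Depth-Anything-V2")
FOUR_D_HUMANS_DIR  = os.path.join(CODE_DIR, "4D-Humans")
PHALP_DIR          = os.path.join(CODE_DIR, "PHALP")
DETECTRON2_DIR     = os.path.join(CODE_DIR, "detectron2")
SAM2_DIR           = os.path.join(CODE_DIR, "sam2")
NVDIFFRAST_DIR     = os.path.join(CODE_DIR, "nvdiffrast")
PROPAINTER_DIR     = os.path.join(CODE_DIR, "ProPainter")
```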

Clone external repositories and install requirements

Download checkpoints

Here are the checkpoints to download:

Some checkpoints are automatically downloaded from 🤗 Hugging Face but require manual acceptance of the terms and conditions. You can accept these terms with your 🤗 Hugging Face account, then log in to the server using huggingface-cli login. Here are the gated models:
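
Alternatively to huggingface-cli login, the login and the download of a gated checkpoint can be scripted with huggingface_hub. In this sketch, the repo_id and the target folder are placeholders, not the actual gated models:

```python
# download_gated.py -- fetch a gated checkpoint after accepting its terms on the Hub
from huggingface_hub import login, snapshot_download

login()  # prompts for an access token; same effect as `huggingface-cli login`

snapshot_download(
    repo_id="org/gated-model",                # placeholder: one of the gated models
    local_dir="../checkpoints/gated-model",   # placeholder target inside checkpoints/
)
```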

Extra steps

The external repositories need a few one-line edits before the scripts can run. A helper sketch that applies the single-line edits appears after this list.

  • For mimo/dataset_preprocessing/pose_estimation_4DH.py

    pip uninstall phalp hmr2
    • The renderer needs to be disabled to avoid OpenGL errors. In 4D-Humans/hmr2/models/__init__.py line 84, pass init_renderer=False:
    model = HMR2.load_from_checkpoint(checkpoint_path, strict=False, cfg=model_cfg, init_renderer=False)
    • To speed up inference, disable the automatic saving of result files. In PHALP/phalp/trackers/PHALP.py line 264, remove:
    joblib.dump(final_visuals_dic, pkl_path, compress=3)
  • For mimo/dataset_preprocessing/get_apose_ref.py

    • If you need DWPose to extract the 2D pose from an image:
    pip install -U openmim
    mim install mmengine
    mim install "mmcv>=2.0.1"
    mim install "mmdet>=3.1.0"
    mim install "mmpose>=1.1.0"
    • In AnimateAnyone/src/models/unet_2d_blocks.py line 9, update the import to:
    from diffusers.models.transformers.dual_transformer_2d import DualTransformer2DModel
  • For mimo/training/main.py

    • In AnimateAnyone/src/models/mutual_self_attention.py line 48:
        self.register_reference_hooks(
            mode,
            do_classifier_free_guidance,
            attention_auto_machine_weight,
            gn_auto_machine_weight,
            style_fidelity,
            reference_attn,
            reference_adain,
            batch_size=batch_size,
            fusion_blocks=fusion_blocks,
        )
    • In AnimateAnyone/src/models/unet_2d_blocks.py line 9, update the import to:
    from diffusers.models.transformers.dual_transformer_2d import DualTransformer2DModel
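
To avoid doing these edits by hand, here is a minimal sketch that applies the three single-line edits above. It assumes unmodified checkouts of the external repositories (so the line numbers still match) and that it is run once from the code/ folder; the multi-line mutual_self_attention.py change is easiest to do manually.

```python
# apply_patches.py -- sketch applying the single-line edits listed above.
# Assumes unmodified checkouts (line numbers must still match) and a single
# run from the code/ folder. The multi-line mutual_self_attention.py change
# is not covered here. The 4-space indentation of the HMR2 line is an
# assumption about the surrounding file.
from pathlib import Path

# (file, 1-based line number, replacement text or None to delete the line)
PATCHES = [
    ("4D-Humans/hmr2/models/__init__.py", 84,
     "    model = HMR2.load_from_checkpoint(checkpoint_path, strict=False, cfg=model_cfg, init_renderer=False)"),
    ("PHALP/phalp/trackers/PHALP.py", 264, None),
    ("AnimateAnyone/src/models/unet_2d_blocks.py", 9,
     "from diffusers.models.transformers.dual_transformer_2d import DualTransformer2DModel"),
]

for filename, lineno, replacement in PATCHES:
    path = Path(filename)
    lines = path.read_text().splitlines()
    idx = lineno - 1
    if replacement is None:
        print(f"{filename}:{lineno} removing: {lines[idx].strip()}")
        del lines[idx]
    else:
        print(f"{filename}:{lineno} replacing: {lines[idx].strip()}")
        lines[idx] = replacement
    path.write_text("\n".join(lines) + "\n")
```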

🚀 Run scripts

Dataset Preprocessing

Run the preprocessing scripts in order (a driver sketch follows the list):

  1. python mimo/dataset_preprocessing/video_sampling_resizing.py
  2. python mimo/dataset_preprocessing/remove_duplicate_videos.py
  3. python mimo/dataset_preprocessing/human_detection_detectron2.py
  4. python mimo/dataset_preprocessing/depth_estimation.py
  5. python mimo/dataset_preprocessing/video_tracking_sam2.py
  6. python mimo/dataset_preprocessing/video_inpainting.py
  7. python mimo/dataset_preprocessing/get_apose_ref.py
  8. python mimo/dataset_preprocessing/upscale_apose_ref.py
  9. python mimo/dataset_preprocessing/vae_encoding.py
  10. python mimo/dataset_preprocessing/clip_embedding.py
  11. python mimo/dataset_preprocessing/pose_estimation_4DH.py
  12. python mimo/dataset_preprocessing/rasterizer_2d_joints.py
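
The twelve steps above can also be chained with a small driver. This is a sketch assuming each script runs with its default arguments and that it is launched from the MIMO-unofficial/ folder:

```python
# run_preprocessing.py -- run the dataset preprocessing steps in order,
# stopping at the first failure (assumes default arguments suffice)
import subprocess
import sys

STEPS = [
    "mimo/dataset_preprocessing/video_sampling_resizing.py",
    "mimo/dataset_preprocessing/remove_duplicate_videos.py",
    "mimo/dataset_preprocessing/human_detection_detectron2.py",
    "mimo/dataset_preprocessing/depth_estimation.py",
    "mimo/dataset_preprocessing/video_tracking_sam2.py",
    "mimo/dataset_preprocessing/video_inpainting.py",
    "mimo/dataset_preprocessing/get_apose_ref.py",
    "mimo/dataset_preprocessing/upscale_apose_ref.py",
    "mimo/dataset_preprocessing/vae_encoding.py",
    "mimo/dataset_preprocessing/clip_embedding.py",
    "mimo/dataset_preprocessing/pose_estimation_4DH.py",
    "mimo/dataset_preprocessing/rasterizer_2d_joints.py",
]

for step in STEPS:
    print(f"=== {step} ===")
    result = subprocess.run([sys.executable, step])
    if result.returncode != 0:
        sys.exit(f"Step failed: {step}")
```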

Inference

Run accelerate config and select:
    - no distributed training
    - NUMA efficiency
    - fp16 mixed precision

Then run:

accelerate launch mimo/inference/main.py -i input_video.mp4

Training

Run accelerate config and select:
    - multi-GPU
    - NUMA efficiency
    - fp16 mixed precision

Then launch the two training phases:

accelerate launch mimo/training/main.py -c 1540 -t ./mimo/configs/training/cfg_phase1.yaml
accelerate launch mimo/training/main.py -c 1540 -t ./mimo/configs/training/cfg_phase2.yaml

🙏🏻 Acknowledgements

This project is based on novitalabs/AnimateAnyone and MooreThreads/Moore-AnimateAnyone, which are licensed under the Apache License 2.0. We thank the authors of MIMO, novitalabs/AnimateAnyone, and MooreThreads/Moore-AnimateAnyone for their open research and exploration.