INSTA - Instant Volumetric Head Avatars

Max Planck Institute for Intelligent Systems, Tübingen, Germany

Official repository for the CVPR 2023 paper Instant Volumetric Head Avatars

This repository is based on instant-ngp; some features of the original code are not available in this work.

⚠ We also prepared a PyTorch demo version of the project: INSTA Pytorch

Installation

The repository is based on a specific instant-ngp commit. The installation requirements are the same, so please follow the instant-ngp guide. Remember to use the --recursive option when cloning.

git clone --recursive https://github.com/Zielon/INSTA.git
cd INSTA
cmake . -B build
cmake --build build --config RelWithDebInfo -j
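
If the repository was cloned without --recursive, the submodules can still be fetched afterwards with standard git commands. A minimal sketch, assuming it is run from the repository root; the helper name fetch_submodules is hypothetical and not part of this repository:

```shell
# Hypothetical helper: fetch submodules that a non-recursive clone skipped.
fetch_submodules() {
  # In `git submodule status`, a leading "-" marks a submodule that is not initialized.
  if git submodule status --recursive | grep -q '^-'; then
    git submodule update --init --recursive
  else
    echo "all submodules are initialized"
  fi
}
```

Calling fetch_submodules before running cmake avoids a configure step that fails on missing dependencies.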

Usage and Requirements

After building the project, you can either start training an avatar from scratch or load a snapshot. For training, we recommend a GPU at least as capable as an RTX 3090 (24 GB) and 32 GB of RAM. Training on different hardware will probably require adjusting the following options in the config:

  "max_cached_bvh": 4000,            # How many BVH data structures are cached
  "max_images_gpu": 1700,            # How many frames are loaded to GPU. Adjust for a given GPU memory size.
  "use_dataset_cache": true,         # Load images into RAM
  "max_steps": 33000,                # Max training steps after which test sequence will be recorded
  "render_novel_trajectory": false,  # Dumps additional camera trajectories after max steps
  "render_from_snapshot": false      # For --no-gui option to directly render sequences
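
For a GPU with less memory, lowering max_images_gpu is the most direct adjustment. The sketch below writes a copy of the config with a different value. The helper name and the file names are illustrative assumptions, and it assumes the key appears in the config in the form shown above:

```shell
# Hypothetical helper: write a copy of a config with a different max_images_gpu.
# Usage: set_max_images_gpu <input.json> <output.json> <frames>
# The output must be a different file than the input.
set_max_images_gpu() {
  in="$1"; out="$2"; frames="$3"
  # Accept only digits for the frame count.
  case "$frames" in
    ''|*[!0-9]*) echo "frames must be a whole number" >&2; return 1 ;;
  esac
  # Replace only the number after the key; everything else is left untouched.
  sed -E "s/(\"max_images_gpu\"[[:space:]]*:[[:space:]]*)[0-9]+/\1${frames}/" "$in" > "$out"
}
```

For example, set_max_images_gpu insta.json insta_small.json 800 produces a config that can then be passed to --config. The value 800 is a placeholder, not a tested recommendation; a suitable value depends on the available GPU memory.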

Rendering from a snapshot does not require a high-end GPU and can be performed even on a laptop; we have tested it on the laptop version of an RTX 3080 (8 GB). With the --no-gui option, you can train and load a snapshot for rendering using the config in the same way as with the GUI. The viewer options are the same as in instant-ngp, with an additional key, F, to raycast the FLAME mesh.

Usage example

# Training
./build/rta --config insta.json --scene data/obama --height 512 --width 512

# Loading from a checkpoint
./build/rta --config insta.json --scene data/obama/transforms_test.json --snapshot data/obama/snapshot.msgpack
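
Rendering from a snapshot without the GUI can be sketched as a small wrapper that checks the inputs before launching. This assumes that --no-gui combines with the flags shown above and that render_from_snapshot is set to true in the config. The function name and the RTA_BIN variable are hypothetical:

```shell
# Hypothetical wrapper: check inputs, then render from a snapshot without the GUI.
# Usage: render_headless <actor_dir> <config.json>
render_headless() {
  actor_dir="$1"; config="$2"
  # Stop early if any required file is missing.
  for f in "$config" "$actor_dir/transforms_test.json" "$actor_dir/snapshot.msgpack"; do
    [ -f "$f" ] || { echo "missing file: $f" >&2; return 1; }
  done
  # RTA_BIN can be overridden (e.g. with echo for a dry run); defaults to the built binary.
  "${RTA_BIN:-./build/rta}" --config "$config" \
    --scene "$actor_dir/transforms_test.json" \
    --snapshot "$actor_dir/snapshot.msgpack" --no-gui
}
```

With the layout from the example above, this would be called as render_headless data/obama insta.json.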

Dataset and Training

We are releasing part of our dataset together with publicly available preprocessed avatars from NHA, NeRFace and IMAvatar. The output of the training (Record Video in the menu), including rendered frames, checkpoints, etc., is saved in ./data/{actor}/experiments/{config}/debug. After the specified number of max steps, the program automatically renders frames either using novel cameras (the All option in the GUI and render_novel_trajectory in the config) or using only the one currently selected in Mode (by default Overlay\Test).

Available avatars. Click the selected avatar to download the training dataset and the checkpoint. The avatars have to be placed in the data folder.

Dataset Generation

Input generation requires a conda environment and a few other repositories. Simply run install.sh from the scripts folder to prepare the workbench.

Next, you can use the Metrical Photometric Tracker to track a sequence. After the processing is done, run the generate.sh script to prepare the sequence. As input, please specify the absolute path of the tracker output.

For training we recommend at least 1000 frames.

# 1) Run the Metrical Photometric Tracker for a selected actor
python tracker.py --cfg ./configs/actors/duda.yml

# 2) Generate a dataset using the script. Importantly, use absolute paths for the tracker output (the input here) and the desired output.
./generate.sh /metrical-tracker/output/duda INSTA/data/duda 100
#                        {input}                {output}    {# of test frames from the end}
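
Since generate.sh expects absolute paths, a small pre-flight check can catch a relative path before the script runs. This is a sketch only; both helper names are hypothetical and not part of the repository:

```shell
# Hypothetical pre-flight check for the generate.sh arguments.
is_absolute() {
  case "$1" in
    /*) return 0 ;;
    *)  return 1 ;;
  esac
}

# Usage: check_generate_args <input> <output> <number of test frames>
check_generate_args() {
  input="$1"; output="$2"; test_frames="$3"
  is_absolute "$input"  || { echo "input must be an absolute path: $input" >&2; return 1; }
  is_absolute "$output" || { echo "output must be an absolute path: $output" >&2; return 1; }
  # The third argument is the number of test frames taken from the end.
  case "$test_frames" in
    ''|*[!0-9]*) echo "test frame count must be a whole number: $test_frames" >&2; return 1 ;;
  esac
}
```

It can be chained in front of the script, for example check_generate_args "$IN" "$OUT" 100 && ./generate.sh "$IN" "$OUT" 100, where IN and OUT are placeholder variables holding the two absolute paths.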

Citation

If you use this project in your research please cite INSTA:

@inproceedings{INSTA:CVPR2023,
  author = {Zielonka, Wojciech and Bolkart, Timo and Thies, Justus},
  title = {Instant Volumetric Head Avatars},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year = {2023}
}