[arXiv] [Project Page] [BibTex]
Code release for the CVPR 2023 paper "SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation".
SDFusion is a diffusion-based 3D shape generator. It enables various applications. (left) SDFusion can generate 3D shapes conditioned on different input modalities, including partial shapes, images, and text. SDFusion can even jointly handle multiple conditioning modalities while controlling the strength for each of them. (right) We showcase an application where we leverage pretrained 2D models to texture 3D shapes generated by SDFusion.
We also use a 3D printer to print out shapes generated by SDFusion.
3d_print.mp4
- Connect to GT VPN and use SSH to log into the cluster.
- Load the anaconda3/2022.05 module on Phoenix following the instructions (https://docs.pace.gatech.edu/software/anacondaEnv/).
- Install the required Python packages in conda:
conda create -n sdfusion python=3.9 -y && conda activate sdfusion
conda install pytorch=1.13.0 torchvision pytorch-cuda=11.6 -c pytorch -c nvidia
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install -c bottler nvidiacub
conda install pytorch3d -c pytorch3d
pip install h5py joblib termcolor scipy einops tqdm matplotlib opencv-python PyMCubes imageio==2.19 trimesh omegaconf tensorboard notebook kornia ftfy regex transformers
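Before moving on, a quick sanity check helps confirm the GPU build installed correctly (a minimal sketch; the printed versions will vary by node):
# verify PyTorch sees a GPU and PyTorch3D imports cleanly
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import pytorch3d; print(pytorch3d.__version__)"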
First, create a folder to save the pre-trained weights. Here we assume the folder is ./saved_ckpt. Then download the pre-trained weights from the provided links and put them in the ./saved_ckpt folder.
mkdir saved_ckpt
# VQVAE's checkpoint
wget https://uofi.box.com/shared/static/zdb9pm9wmxaupzclc7m8gzluj20ja0b6.pth -O saved_ckpt/vqvae-snet-all.pth
# SDFusion
wget https://uofi.box.com/shared/static/ueo01ctnlzobp2dmvd8iexy1bdsquuc1.pth -O saved_ckpt/sdfusion-snet-all.pth
# SDFusion: single-view reconstruction (img2shape)
wget https://uofi.box.com/shared/static/01hnf7pbewft4115qkvv9zhh22v4d8ma.pth -O saved_ckpt/sdfusion-img2shape.pth
# SDFusion: text-guided shape generation (txt2shape)
wget https://uofi.box.com/shared/static/vyqs6aex3rwbgxweyl3qh21c8p6vu33f.pth -O saved_ckpt/sdfusion-txt2shape.pth
# SDFusion: multi-modal conditional shape generation (partial shape + [ img {and/or} txt] -> shape)
wget https://uofi.box.com/shared/static/d95l3465arc0ffley5vwmz8bscaubmhc.pth -O saved_ckpt/sdfusion-mm2shape.pth
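Before proceeding, it is worth checking that all five downloads completed, since a truncated Box download will otherwise fail much later at load time:
# each checkpoint should be present and non-empty
ls -lh saved_ckpt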
Use Open OnDemand to open a Jupyter notebook. Then, open one of the following notebooks for the task you want to perform.
- Unconditional generation and shape completion: demo_uncond_shape_comp.ipynb
- Single-view reconstruction (img2shape): demo_img2shape.ipynb
- Text-guided shape generation (txt2shape): demo_txt2shape.ipynb
- Multi-modal conditional shape generation (partial shape + [ img | txt ]): demo_mm2shape.ipynb
- (coming soon!) Text-guided texturization: demo_txt2tex.ipynb
Note that the notebooks will automatically save the generated shapes in the ./demo_results folder.
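If you want to inspect a result outside Jupyter, trimesh (installed above) can load it; the mesh name below is a hypothetical placeholder, and the actual file names and formats depend on the notebook you ran:
# the mesh name is a hypothetical placeholder; check the ls output for real names
ls demo_results
python -c "import trimesh; print(trimesh.load('demo_results/<your_mesh>.obj').bounds)"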
The following are commands related to batch jobs. For more details, please refer to https://docs.pace.gatech.edu/phoenix_cluster/slurm_guide_phnx.
- submit job: sbatch script.sbatch
- check job status: squeue --job <jobID>
- delete job: scancel <jobID>
An example script is in script/gpu.sbatch. It uses a V100 GPU and the sdfusion_cu116 conda environment to perform text-guided shape generation.
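For reference, here is a minimal sketch of such a script (the GPU directive and launcher are illustrative assumptions; check the Phoenix SLURM guide above for the exact flags your allocation needs):
#!/bin/bash
#SBATCH --job-name=sdfusion-txt2shape   # name shown in squeue
#SBATCH --gres=gpu:V100:1               # illustrative GPU request; verify against the Phoenix docs
#SBATCH --time=08:00:00                 # wall-clock limit
#SBATCH --output=slurm-%j.out           # stdout/stderr log file

module load anaconda3/2022.05           # same module as the setup above
conda activate sdfusion                 # assumes conda is initialized for batch shells
./launchers/train_sdfusion_txt2shape.sh # illustrative launcher; swap in the task you need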
- First, depending on your OS, you might need to install the required packages/binaries via brew or apt-get for computing the SDF given a mesh. If you cannot run the preprocessing files, please copy the error message and search for it online (usually there is a one-line solution), or open an issue on this repo. We will try to update the README with the reported issues and their solutions under the Issues and FAQ section.
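As a starting point, the packages named in the Acknowledgments section can be installed as follows (exact package names vary by distribution, and Homebrew may use slightly different ones):
# Ubuntu/Debian
sudo apt-get install freeglut3 libtbb-dev
# macOS
brew install freeglut tbb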
- ShapeNet
- Download the ShapeNetV1 dataset from the official website. Then, extract the downloaded file and put the extracted folder in the ./data folder. Here we assume the extracted folder is at ./data/ShapeNet/ShapeNetCore.v1.
- Run the following commands for preprocessing the SDF from mesh.
mkdir -p data/ShapeNet && cd data/ShapeNet
wget [url for downloading ShapeNetV1]
unzip ShapeNetCore.v1.zip
./launchers/unzip_snet_zipfiles.sh # unzip the zip files
cd preprocess
./launchers/launch_create_sdf_shapenet.sh
- BuildingNet
- Download the BuildingNet dataset from the official website. After you fill out the form, please download the v0 version of the dataset and uncompress it under ./data. Here we assume the extracted folder is ./data/BuildingNet_dataset_v0_1.
- Run the following commands for preprocessing the SDF from mesh.
cd preprocess
./launchers/launch_create_sdf_building.sh
cd ../
- Pix3D
- First download the Pix3D dataset from the official website:
wget http://pix3d.csail.mit.edu/data/pix3d.zip -P data
cd data
unzip pix3d.zip
cd ../
- Then, run the following command for preprocessing the SDF from mesh.
cd preprocess
./launchers/launch_create_sdf_pix3d.sh
cd ../
- ShapeNetRendering
- Run the following commands to download the rendered images, which are provided by the 3D-R2N2 paper.
wget ftp://cs.stanford.edu/cs/cvgl/ShapeNetRendering.tgz -P data/ShapeNet
cd data/ShapeNet && tar -xvf ShapeNetRendering.tgz
cd ../../
- text2shape
- Run the following commands to set up the text2shape dataset.
mkdir -p data/ShapeNet/text2shape
wget http://text2shape.stanford.edu/dataset/captions.tablechair.csv -P data/ShapeNet/text2shape
cd preprocess
./launchers/create_snet-text_splits.sh
- Train VQVAE
# ShapeNet
./launchers/train_vqvae_snet.sh
# BuildingNet
./launchers/train_vqvae-bnet.sh
After training, copy the trained VQVAE checkpoint to the ./saved_ckpt folder. Let's say the checkpoints are named vqvae-snet-all.ckpt or vqvae-bnet-all.ckpt. This is necessary for training the diffusion model. For SDFusion on the various tasks, please see steps 2-5 below.
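For example (a sketch; the source path below is a hypothetical placeholder, so substitute wherever your training run actually wrote its checkpoint):
# hypothetical source path; substitute your actual training output location
cp logs/<your_vqvae_run>/<checkpoint>.pth saved_ckpt/vqvae-snet-all.ckpt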
- Train SDFusion on ShapeNet and BuildingNet
# ShapeNet
./launchers/train_sdfusion_snet.sh
# BuildingNet
./launchers/train_sdfusion_bnet.sh
- Train SDFusion for single-view reconstruction
./launchers/train_sdfusion_img2shape.sh
- Train SDFusion for text-guided shape generation
# text2shape
./launchers/train_sdfusion_txt2shape.sh
- Train SDFusion for multi-modality shape generation
./launchers/train_sdfusion_mm2shape.sh
- Train the text-guided texturization
coming soon!
If you find this code helpful, please consider citing:
- Conference version
@inproceedings{cheng2023sdfusion,
  author    = {Cheng, Yen-Chi and Lee, Hsin-Ying and Tulyakov, Sergey and Schwing, Alex and Gui, Liangyan},
  title     = {{SDFusion}: Multimodal 3D Shape Completion, Reconstruction, and Generation},
  booktitle = {CVPR},
  year      = {2023},
}
- arXiv version
@article{cheng2022sdfusion,
  author  = {Cheng, Yen-Chi and Lee, Hsin-Ying and Tulyakov, Sergey and Schwing, Alex and Gui, Liangyan},
  title   = {{SDFusion}: Multimodal 3D Shape Completion, Reconstruction, and Generation},
  journal = {arXiv},
  year    = {2022},
}
Coming soon!
This code borrows heavily from LDM, AutoSDF, CycleGAN, stable dreamfusion, and DISN. We thank the authors for their great work. The following packages are required to compute the SDF: freeglut3, tbb.
This work is supported in part by NSF under Grants 2008387, 2045586, 2106825, MRI 1725729, and NIFA award 2020-67021-32799. Thanks to NVIDIA for providing a GPU for debugging.