This is the repo for the series of works on Neural Mesh Models. In this repo, we implement 3D object pose estimation, 3D object pose estimation via the VoGE renderer, 6D object pose estimation, object classification, and cross-domain training. The original implementation of NeMo is here.
This release introduces a major refactor, mainly of the feature banks and the "mask remove" functions. In the new implementation, the feature banks support multiple objects with different class labels in each training image. Note that the multi-class implementation is CUDA-based and requires installing a dedicated CUDA layer. (Running 3D pose estimation alone works fine without these CUDA layers.) To install:
```sh
cd cu_layers
python setup.py install
```
After the installation, you will find a library named "CuNeMo" among your installed Python packages. Previous configs remain compatible, except for the following change in `config/model`:
```yaml
memory_bank:
  class_name: nemo.models.feature_banks.FeatureBankNeMo
```
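If the CUDA layers built successfully, a quick import check confirms the extension is visible to Python. This is a minimal sketch; the only assumption is the module name `CuNeMo` mentioned above.

```python
# Sanity check (sketch): the CUDA extension built from cu_layers should be
# importable under the name "CuNeMo" after `python setup.py install`.
import CuNeMo

print("CuNeMo loaded from:", CuNeMo.__file__)
```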
The previous implementation of classification NeMo has been removed; support for classification NeMo will be added back very soon. Contact me directly if you find any bugs or compatibility issues.
Easily train and evaluate neural mesh models for multiple tasks:
- 3D pose estimation
- 6D pose estimation
- 3D-aware image classification
- Amodal segmentation
Experiment on various benchmark datasets:
- PASCAL3D+
- Occluded PASCAL3D+
- ObjectNet3D
- OOD-CV
- Synthetic PASCAL3D+
Reproduce baseline models for fair comparison:
- Regression-based models (ResNet50, Faster R-CNN, etc.)
- Transformers
- StarMap
- Create a `conda` environment:

  ```sh
  conda create -n nemo python=3.9
  conda activate nemo
  ```
- Install PyTorch (see pytorch.org):

  ```sh
  conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=10.2 -c pytorch
  ```
- Install PyTorch3D (see github.com/facebookresearch/pytorch3d):

  ```sh
  conda install -c fvcore -c iopath -c conda-forge fvcore iopath
  conda install -c bottler nvidiacub
  conda install pytorch3d -c pytorch3d
  ```
- Install other dependencies:

  ```sh
  conda install numpy matplotlib scipy scikit-image
  conda install pillow
  conda install -c conda-forge timm tqdm pyyaml transformers
  pip install git+https://github.com/NVlabs/nvdiffrast/
  pip install wget gdown BboxTools opencv-python xatlas pycocotools seaborn wandb
  ```
- (Optional) Install VoGE (see github.com/Angtian/VoGE):

  ```sh
  pip install git+https://github.com/Angtian/VoGE.git
  ```
In case the previous method fails, set up the environment from a compiled list of packages:

```sh
conda env create -f environment.yml
pip install git+https://github.com/NVlabs/nvdiffrast/
pip install -e .
```
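After the environment is set up, a short sanity check helps confirm the core dependencies are importable and that CUDA is visible to PyTorch. This is a sketch that only assumes the packages installed above.

```python
# Environment sanity check (sketch): verify the core dependencies installed
# above and confirm that a CUDA device is visible to PyTorch.
import torch
import pytorch3d

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("PyTorch3D:", pytorch3d.__version__)
```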
See data/README.
Train and evaluate a neural mesh model (NeMo) on PASCAL3D+ for 3D pose estimation:
```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 scripts/train.py \
    --cate car \
    --config config/omni_nemo_pose_3d.yaml \
    --save_dir exp/pose_estimation_3d_nemo_car

CUDA_VISIBLE_DEVICES=0 python3 scripts/inference.py \
    --cate car \
    --config config/omni_nemo_pose_3d.yaml \
    --save_dir exp/pose_estimation_3d_nemo_car \
    --checkpoint exp/pose_estimation_3d_nemo_car/ckpts/model_800.pth
```
NeMo with VoGE:
```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 scripts/train.py \
    --cate car \
    --config config/omni_voge_pose_3d.yaml \
    --save_dir exp/pose_estimation_3d_voge_car

CUDA_VISIBLE_DEVICES=0 python3 scripts/inference.py \
    --cate car \
    --config config/omni_voge_pose_3d.yaml \
    --save_dir exp/pose_estimation_3d_voge_car \
    --checkpoint exp/pose_estimation_3d_voge_car/ckpts/model_800.pth
```
NeMo on PASCAL3D+ without scaling during data pre-processing:
```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 scripts/train.py \
    --cate car \
    --config config/omni_nemo_pose_3d_ori.yaml \
    --save_dir exp/pose_estimation_3d_ori_car

CUDA_VISIBLE_DEVICES=0 python3 scripts/inference.py \
    --cate car \
    --config config/omni_nemo_pose_3d_ori.yaml \
    --save_dir exp/pose_estimation_3d_ori_car \
    --checkpoint exp/pose_estimation_3d_ori_car/ckpts/model_800.pth
```
Train and evaluate a regression-based model (ResNet50-General) on PASCAL3D+ for 3D pose estimation:
```sh
CUDA_VISIBLE_DEVICES=0 python3 scripts/train.py \
    --cate all \
    --config config/pose_estimation_3d_resnet50_general.yaml \
    --save_dir exp/pose_estimation_3d_resnet50_general

CUDA_VISIBLE_DEVICES=0 python3 scripts/inference.py \
    --cate car \
    --config config/pose_estimation_3d_resnet50_general.yaml \
    --save_dir exp/pose_estimation_3d_resnet50_general \
    --checkpoint exp/pose_estimation_3d_resnet50_general/ckpts/model_90.pth
```
The pre-trained NeMo model:
https://drive.google.com/file/d/14fByOZs_Zzd-97Ulk2BKJhVNFKAnFWvg/view?usp=sharing
| 3D pose | plane | bike | boat | bottle | bus | car | chair | table | mbike | sofa | train | tv | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pi/6 | 86.9 | 80.3 | 77.4 | 90.0 | 95.3 | 98.9 | 89.1 | 80.2 | 86.6 | 95.8 | 64.4 | 82.0 | 87.4 |
| Pi/18 | 55.3 | 30.9 | 50.2 | 56.9 | 91.5 | 96.5 | 56.7 | 63.1 | 33.2 | 65.9 | 55.3 | 48.6 | 65.5 |
| Med Err | 8.94 | 15.51 | 9.95 | 8.24 | 2.66 | 2.71 | 8.68 | 6.96 | 13.34 | 7.18 | 7.32 | 10.61 | 7.42 |
The pre-trained NeMo-VoGE model:
https://drive.google.com/file/d/1kogFdjVbOIuSlKx1NQ1c1XEjbvJEQWJg/view?usp=sharing
| 3D pose | plane | bike | boat | bottle | bus | car | chair | table | mbike | sofa | train | tv | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pi/6 | 87.8 | 82.9 | 75.4 | 88.2 | 97.4 | 99.0 | 90.7 | 83.6 | 87.4 | 94.4 | 91.3 | 80.5 | 89.5 |
| Pi/18 | 62.3 | 36.7 | 51.0 | 55.2 | 94.5 | 96.4 | 54.9 | 69.7 | 39.1 | 65.4 | 83.3 | 54.4 | 69.5 |
| Med Err | 7.57 | 14.02 | 9.7 | 9.1 | 2.38 | 2.89 | 8.96 | 5.7 | 12.3 | 7.77 | 3.84 | 8.80 | 6.82 |
The pre-trained NeMo model without scaling:
https://drive.google.com/file/d/1ybVTDx6DvV_H01SUZkKqWQjKu-BfweGJ/view?usp=sharing
| 3D pose | plane | bike | boat | bottle | bus | car | chair | table | mbike | sofa | train | tv | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pi/6 | 83.0 | 75.7 | 68.3 | 84.5 | 96.2 | 98.8 | 85.8 | 80.4 | 78.1 | 94.6 | 79.2 | 85.8 | 86.0 |
| Pi/18 | 48.0 | 24.7 | 34.0 | 44.3 | 90.0 | 95.4 | 44.6 | 58.5 | 26.6 | 58.8 | 64.0 | 45.6 | 60.2 |
| Med Err | 10.62 | 18.54 | 14.97 | 11.67 | 3.00 | 3.12 | 11.01 | 8.07 | 15.22 | 8.31 | 6.65 | 11.25 | 8.99 |
The pre-trained NeMo-VoGE model without scaling:
https://drive.google.com/file/d/10ggpneADVWClXWx42yQeJ_unFt53oQ1I/view?usp=sharing
| 3D pose | plane | bike | boat | bottle | bus | car | chair | table | mbike | sofa | train | tv | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pi/6 | 83.1 | 80.2 | 68.1 | 83.9 | 98.1 | 98.3 | 89.0 | 83.0 | 81.8 | 94.1 | 90.5 | 83.7 | 87.4 |
| Pi/18 | 51.9 | 29.9 | 36.3 | 44.6 | 94.2 | 93.2 | 50.1 | 65.0 | 32.8 | 61.4 | 76.1 | 46.4 | 62.9 |
| Med Err | 9.56 | 16.33 | 14.97 | 11.07 | 2.92 | 3.75 | 9.97 | 6.70 | 14.06 | 8.03 | 5.45 | 10.70 | 8.51 |
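The checkpoints above are hosted on Google Drive. One way to fetch them from a script is with the `gdown` package from the dependency list; the snippet below is a sketch that uses the file id from the first link above, and the output filename is an arbitrary example.

```python
# Download a pre-trained checkpoint from Google Drive via gdown (sketch).
# The file id comes from the first share link above; the output filename
# is only an example.
import gdown

file_id = "14fByOZs_Zzd-97Ulk2BKJhVNFKAnFWvg"
gdown.download(f"https://drive.google.com/uc?id={file_id}",
               "nemo_checkpoint.pth", quiet=False)
```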
See documentation.
```bibtex
@inproceedings{wang2021nemo,
    title={NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation},
    author={Angtian Wang and Adam Kortylewski and Alan Yuille},
    booktitle={International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=pmj131uIL9H}
}

@software{nemo_code_2022,
    title={Neural Mesh Models for 3D Reasoning},
    author={Ma, Wufei and Jesslen, Artur and Wang, Angtian},
    month={12},
    year={2022},
    url={https://github.com/wufeim/NeMo},
    version={1.0.0}
}
```
This repo builds upon several previous works:
- NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation (ICLR 2021)
- Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering of Neural Features (ECCV 2022)
In this project, we borrow code from several other repos:
- NeMo by Angtian Wang in Angtian/NeMo
- DMTet by NVIDIA in nv-tlabs/GET3D
- torch_utils by NVIDIA in nv-tlabs/GET3D
- uni_rep by NVIDIA in nv-tlabs/GET3D
- dnnlib by NVIDIA in nv-tlabs/GET3D