This network builds upon Cylinder3D by Zhu et al. It uses the self-attention blocks from CodedVTR and implements a softmax temperature annealing schedule that decays over the training epochs. The multi-head attention is adapted to the convolution channel sizes of the underlying 3D U-Net-like backbone of the network.
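To illustrate the idea of temperature annealing, below is a minimal sketch of a softmax temperature that decays with the epoch. The schedule shape, start/end values, and function names are illustrative assumptions, not the exact implementation used in this repository:

```python
import torch

def annealed_temperature(epoch, max_epochs, t_start=1.0, t_end=0.1):
    # Hypothetical linear decay of the softmax temperature over training.
    frac = min(epoch / max(max_epochs - 1, 1), 1.0)
    return t_start + frac * (t_end - t_start)

def attention_weights(scores, temperature):
    # Lower temperature -> sharper (more confident) attention distribution.
    return torch.softmax(scores / temperature, dim=-1)
```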
git clone https://github.com/nerovalerius/AttentiveCylinder3D.git
conda install python=3.9.2 numpy tqdm pyyaml numba strictyaml -c conda-forge
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
sudo sh cuda_11.7.0_515.43.04_linux.run
conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
Version "cu116" was not available at the time, however cu114 also works.
pip install spconv-cu114
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.12.0%2Bcu116.html
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.12.0%2Bcu116.html
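After installing, a quick sanity check can confirm that PyTorch sees the GPU and that the sparse packages import correctly. This is just a convenience snippet, not part of the repository:

```python
import torch
import spconv.pytorch as spconv  # spconv 2.x import path
import torch_scatter
import torch_sparse

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torch_scatter:", torch_scatter.__version__, "| torch_sparse:", torch_sparse.__version__)
```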
- nuScenes-devkit (optional for nuScenes)
I strongly recommend using Docker. The container mounts a workspace into which this git repository should be cloned.
Compared to the package versions listed in this README, the Docker image uses somewhat newer versions.
Adapt the workspace path inside build_docker.sh and then run sh build_docker.sh.
Simply run ./start_docker.sh to get the Docker container up and running. Afterwards, inside the container, install the Minkowski Engine with ./install_minkowsk.sh.
Then you can either start Jupyter notebooks with ./start_jupyter.sh, or convert the notebook to a Python file for training without Jupyter:
`jupyter nbconvert train_attentivecylinder3d.ipynb --to python`
./
├── ...
└── path_to_data_shown_in_config/
    └── sequences/
        ├── 00/
        │   ├── velodyne/
        │   │   ├── 000000.bin
        │   │   ├── 000001.bin
        │   │   └── ...
        │   └── labels/
        │       ├── 000000.label
        │       ├── 000001.label
        │       └── ...
        ├── 08/   # for validation
        ├── 11/   # 11-21 for testing
        └── 21/
            └── ...
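For reference, each velodyne .bin file stores the scan as a flat float32 array of (x, y, z, remission) values, and each .label file stores one uint32 per point whose lower 16 bits hold the semantic label. A minimal reader sketch (file paths are placeholders):

```python
import numpy as np

def load_semantickitti_scan(bin_path, label_path=None):
    # Points are stored as a flat float32 array: x, y, z, remission.
    points = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)
    labels = None
    if label_path is not None:
        raw = np.fromfile(label_path, dtype=np.uint32)
        labels = raw & 0xFFFF  # lower 16 bits: semantic label id
    return points, labels

points, labels = load_semantickitti_scan(
    "dataset/sequences/00/velodyne/000000.bin",
    "dataset/sequences/00/labels/000000.label",
)
```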
./
├── ...
└── path_to_data_shown_in_config/
    ├── v1.0-trainval
    ├── v1.0-test
    ├── samples
    ├── sweeps
    └── maps
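If you work with nuScenes, the optional nuScenes-devkit mentioned above can be used to verify that the data root is laid out correctly; instantiating the devkit fails early otherwise. The dataroot path below is a placeholder:

```python
from nuscenes.nuscenes import NuScenes

# Loads the metadata tables and checks the folder layout shown above.
nusc = NuScenes(version="v1.0-trainval", dataroot="path_to_data_shown_in_config", verbose=True)
print("Number of scenes:", len(nusc.scene))
```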
Use sh run_docker to start an interactive Docker container.
There is also a script that starts a JupyterLab instance on port 12212: run sh start_jupyter.sh inside the container from workspace/attentivecylinder3d/ and open the URL in your (remote) browser.
The main file to work with is train_cylinder_asym_jupyter.ipynb.
python train_attentivecylinder3d.py
- Modify config/semantickitti.yaml with your custom settings; a sample YAML for SemanticKITTI is provided (a sketch for inspecting it follows below).
- Train the network by running python train_attentivecylinder3d.py
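Before editing the config, it can help to inspect which top-level sections the sample file exposes; a minimal sketch that makes no assumptions about the actual key names:

```python
import yaml

# Print the top-level sections of the sample config before editing paths.
with open("config/semantickitti.yaml") as f:
    cfg = yaml.safe_load(f)

for section, value in cfg.items():
    print(section, "->", type(value).__name__)
```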
Please refer to NUSCENES-GUIDE
- SemanticKITTI: LINK1 or LINK2 (access code: xqmi)
- For the nuScenes dataset, please refer to NUSCENES-GUIDE
Set the correct model folders for saving and loading inside config/semantickitti.yaml.
python demo_folder.py --demo-folder YOUR_FOLDER --save-folder YOUR_SAVE_FOLDER
If you want to validate with your own dataset, you need to provide labels via --demo-label-folder; otherwise this argument is optional.
python demo_folder.py --demo-folder YOUR_FOLDER --save-folder YOUR_SAVE_FOLDER --demo-label-folder YOUR_LABEL_FOLDER
python demo_folder.py --demo-folder ../dataset/sequences/00/velodyne/ --demo-label-folder ../dataset/sequences/00/labels/ --save-folder save_folder/
python demo_folder.py --demo-folder /home/nero/master/dataset/sequences/00/velodyne/ --save-folder save_folder/
python demo_folder.py --demo-folder /home/nero/semanticKITTI/dataset/sequences/00/velodyne/ --save-folder save_folder/ --demo-label-folder /home/nero/semanticKITTI/dataset/sequences/00/labels/
git clone https://github.com/PRBonn/semantic-kitti-api.git
Run ./content.py --directory dataset/ to obtain statistics about the labels inside the dataset. Adapt the semantic-kitti-api/config/semantic-kitti.yaml file beforehand, or use the sbld.yaml file in this repo at attentivecylinder3d/config/label_mapping/sbld.yaml. The train/test split defined in the YAML must match your train/test folder structure, if you have one. A sketch of how such a label mapping is typically applied follows below.
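For context, these label-mapping YAML files typically contain a learning_map dictionary that remaps raw SemanticKITTI label ids to a compact set of training classes. A minimal sketch of applying such a mapping with a lookup table; the learning_map key is taken from the standard semantic-kitti.yaml, and the file paths are placeholders:

```python
import numpy as np
import yaml

with open("semantic-kitti-api/config/semantic-kitti.yaml") as f:
    mapping_cfg = yaml.safe_load(f)

# Build a lookup table from raw label id -> training class id.
learning_map = mapping_cfg["learning_map"]
lut = np.zeros(max(learning_map.keys()) + 100, dtype=np.int64)  # small buffer for unseen ids
for raw_id, train_id in learning_map.items():
    lut[raw_id] = train_id

raw_labels = np.fromfile("dataset/sequences/00/labels/000000.label", dtype=np.uint32) & 0xFFFF
train_labels = lut[raw_labels]
```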
This network mainly builds upon Cylinder3D by Zhu et al.
If you find their work useful in your research, please consider citing their paper:
@article{zhu2020cylindrical,
title={Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation},
author={Zhu, Xinge and Zhou, Hui and Wang, Tai and Hong, Fangzhou and Ma, Yuexin and Li, Wei and Li, Hongsheng and Lin, Dahua},
journal={arXiv preprint arXiv:2011.10033},
year={2020}
}
Furthermore, this work uses the transformer blocks from CodedVTR, which in turn builds on SpatioTemporalSegmentation-ScanNet.
@inproceedings{zhao2022codedvtr,
title={CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance},
author={Zhao, Tianchen and Zhang, Niansong and Ning, Xuefei and Wang, He and Yi, Li and Wang, Yu},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={1435--1444},
year={2022}
}