Skip to content

Latest commit

 

History

History
170 lines (130 loc) · 9.68 KB

README.md

File metadata and controls

170 lines (130 loc) · 9.68 KB

Install | Training | Models | License | References

Official PyTorch boilerplate implementation of multi task learning of distance estimation, pose estimation, semantic segmentation, motion segmentation and 2D object detection methods invented by the Valeo Team, in particular for

OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving (RA-l + ICRA 2021 oral), Varun Ravi Kumar, Senthil Kumar Yogamani, Hazem Rashed, Ganesh Sistu, Christian Witt, Isabelle Leang, Stefan Milz, Patrick Mäder.

The quantitative results are not evaluated and compared to the baseline results from the OmniDet paper using the released weights due to the following reasons:

  • Distance estimation is trained using source frames with t-1 and t on 8k images compared to the internal images with a temporal sequence of t-1, t and t+1.
  • No hyperparameter tuning or NAS is performed.
  • Less training data (8k vs. internal dataset).
  • The novel contributions are held back due to IP reasons.
  • Velodyne LiDAR GT is not released yet for distance estimation.

This code serves as a boilerplate on which researchers can leverage the MTL framework and build upon it using our References. We have released the onnx model export scripts, which can be used to export and run these models on NVIDIA's Jetson AGX device.

Although self-supervised (i.e., trained only on monocular videos), OmniDet outperforms other-self, semi, and fully supervised methods on the KITTI dataset at the time of publishing. Furthermore, the MTL model can run in real-time. See References for more info on the different approaches.

Install

git clone https://github.com/valeoai/WoodScape.git
cd WoodScape

Requirements

pip3 install -r requirements.txt

Training

Training can be fired with any one the following commands:

python3 main.py --config data/params.yaml

./main.py

For training the distance estimation kindly generate the look up tables using

./generate_luts.py --config params.yaml

For the code related to evaluation of the perception tasks check the eval folder scripts.

Models

WoodScape Boilerplate Weights

ResNet18, 544x288

ResNet50, 544x288

License

This code is released under the Apache 2.0 license.

References

OmniDet Surround View fisheye cameras are commonly deployed in automated driving for 360° near-field sensing around the vehicle. This work presents a multi-task visual perception network on unrectified fisheye images to enable the vehicle to sense its surrounding environment. It consists of six primary tasks necessary for an autonomous driving system: depth estimation, visual odometry, semantic segmentation, motion segmentation, object detection, and lens soiling detection. The output from the network is scale aware based on our FisheyeDistanceNet (ICRA 2020).

Please use the following citations when referencing our work:

OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving (RA-L + ICRA 2021 oral)
Varun Ravi Kumar, Senthil Yogamani, Hazem Rashed, Ganesh Sistu, Christian Witt, Isabelle Leang, Stefan Milz and Patrick Mäder, [paper], [video], [oral_talk], [site]

@inproceedings{omnidet,
  author    = {Varun Ravi Kumar and Senthil Kumar Yogamani and Hazem Rashed and Ganesh Sistu and Christian Witt and Isabelle Leang and Stefan Milz and Patrick Mäder},
  title     = {OmniDet: Surround View Cameras Based Multi-Task Visual Perception
               Network for Autonomous Driving},
  journal   = {{IEEE} Robotics Automation Letter},
  volume    = {6},
  number    = {2},
  pages     = {2830--2837},
  year      = {2021},
  url       = {https://doi.org/10.1109/LRA.2021.3062324},
  doi       = {10.1109/LRA.2021.3062324}
}

FisheyeDistanceNet: Self-Supervised Scale-Aware Distance Estimation using Monocular Fisheye Camera for Autonomous Driving (ICRA 2020 oral)
Varun Ravi Kumar, Sandesh Athni Hiremath, Markus Bach, Stefan Milz, Christian Witt, Clément Pinard, Senthil Yogamani and Patrick Mäder, [paper], [video], [oral_talk], [site]

@inproceedings{fisheyedistancenet,
  author    = {Varun Ravi Kumar and Sandesh Athni Hiremath and Markus Bach and Stefan Milz and Christian Witt and Cl{\'{e}}ment Pinard and Senthil Kumar Yogamani and Patrick Mäder},
  title     = {FisheyeDistanceNet: Self-Supervised Scale-Aware Distance Estimation
               using Monocular Fisheye Camera for Autonomous Driving},
  booktitle = {2020 {IEEE} International Conference on Robotics and Automation, {ICRA} 2020, Paris, France, May 31 - August 31, 2020},
  pages     = {574--581},
  publisher = {{IEEE}},
  year      = {2020},
  url       = {https://doi.org/10.1109/ICRA40945.2020.9197319},
  doi       = {10.1109/ICRA40945.2020.9197319},
}

SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving (WACV 2021 oral)
Varun Ravi Kumar, Marvin Klingner, Senthil Yogamani, Stefan Milz, Tim Fingscheidt and Patrick Mäder, [paper], [oral_talk], [site]

@inproceedings{syndistnet,
  author    = {Varun Ravi Kumar and Marvin Klingner and Senthil Stefan Milz and
               Tim Fingscheidt and Patrick Mäder},
  title     = {SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving},
  booktitle = {{IEEE} Winter Conference on Applications of Computer Vision, {WACV}
               2021, Waikoloa, HI, USA, January 3-8, 2021},
  pages     = {61--71},
  publisher = {{IEEE}},
  year      = {2021},
  url       = {https://doi.org/10.1109/WACV48630.2021.00011},
}

UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models (IROS 2020 oral)
Varun Ravi Kumar, Senthil Yogamani, Markus Bach, Christian Witt, Stefan Milz, Patrick Mäder, [paper], [video], [oral_talk], [site]

@inproceedings{unrectdepthnet,
  author    = {Varun Ravi Kumar and Senthil Kumar Yogamani and Markus Bach and Christian Witt and Stefan Milz and Patrick Mäder},
  title     = {UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a
               Generic Framework for Handling Common Camera Distortion Models},
  booktitle = {{IEEE/RSJ} International Conference on Intelligent Robots and Systems, {IROS} 2020, Las Vegas, NV, USA, October 24, 2020 - January 24, 2021},
  pages     = {8177--8183},
  publisher = {{IEEE}},
  year      = {2020},
  url       = {https://doi.org/10.1109/IROS45743.2020.9340732},
}

SVDistNet: Self-Supervised Near-Field Distance Estimation on Surround View Fisheye Cameras (Journal- T-ITS 2021)
Varun Ravi Kumar, Senthil Yogamani, Markus Bach, Christian Witt, Stefan Milz, Patrick Mäder, [paper], [video], [site]

@inproceedings{svdistnet,
  author    = {Varun Ravi Kumar and Marvin Klingner and Senthil Kumar Yogamani and Markus Bach and Stefan Milz and Tim Fingscheidt and Patrick Mäder},
  journal   = {IEEE Transactions on Intelligent Transportation Systems}, 
  title     = {SVDistNet: Self-Supervised Near-Field Distance Estimation on Surround View Fisheye Cameras}, 
  year      = {2021},
  volume    = {},
  number    = {},
  pages     = {1-10},
  doi       = {10.1109/TITS.2021.3088950}}
}

FisheyeYOLO: Generalized Object Detection on Fisheye Cameras for Autonomous Driving: Dataset, Representations and Baseline (WACV 2021 oral)
Hazem Rashed, Eslam Mohamed, Ganesh Sistu, Varun Ravi Kumar, Ciaran Eising, Ahmad El-Sallab, Senthil Yogamani, [paper], [video], [site]

@inproceedings{fisheyeyolo,
  author    = {Rashed, Hazem and Mohamed, Eslam and Sistu, Ganesh and Kumar, Varun Ravi and Eising, Ciaran and El-Sallab, Ahmad and Yogamani, Senthil},
  title     = {Generalized object detection on fisheye cameras for autonomous driving: Dataset, representations and baseline},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages     = {2272--2280},
  year      = {2021}
}