diff --git a/README.md b/README.md index b81993d3..f8f4847e 100644 --- a/README.md +++ b/README.md @@ -1,18 +1,27 @@ # nuScenes devkit -Welcome to the devkit of the [nuScenes](https://www.nuscenes.org) dataset. +Welcome to the devkit of the [nuScenes](https://www.nuscenes.org/nuscenes) and [nuImages](https://www.nuscenes.org/nuimages) datasets. ![](https://www.nuscenes.org/public/images/road.jpg) ## Overview - [Changelog](#changelog) -- [Dataset download](#dataset-download) -- [Map expansion](#map-expansion) - [Devkit setup](#devkit-setup) -- [Getting started](#getting-started) +- [nuImages](#nuimages) + - [nuImages setup](#nuimages-setup) + - [Getting started with nuImages](#getting-started-with-nuimages) +- [nuScenes](#nuscenes) + - [nuScenes setup](#nuscenes-setup) + - [nuScenes-lidarseg](#nuscenes-lidarseg) + - [Prediction challenge](#prediction-challenge) + - [CAN bus expansion](#can-bus-expansion) + - [Map expansion](#map-expansion) + - [Getting started with nuScenes](#getting-started-with-nuscenes) - [Known issues](#known-issues) - [Citation](#citation) ## Changelog +- Aug. 31, 2020: Devkit v1.1.0: nuImages v1.0 and nuScenes-lidarseg v1.0 code release. - Jul. 7, 2020: Devkit v1.0.9: Misc updates on map and prediction code. +- Apr. 30, 2020: nuImages v0.1 code release. - Apr. 1, 2020: Devkit v1.0.8: Relax pip requirements and reorganize prediction code. - Mar. 24, 2020: Devkit v1.0.7: nuScenes prediction challenge code released. - Feb. 12, 2020: Devkit v1.0.6: CAN bus expansion released. @@ -26,7 +35,49 @@ Welcome to the devkit of the [nuScenes](https://www.nuscenes.org) dataset. - Oct. 4, 2018: Code to parse RADAR data released. - Sep. 12, 2018: Devkit for teaser dataset released. -## Dataset download +## Devkit setup +We use a common devkit for nuScenes and nuImages. +The devkit is tested for Python 3.6 and Python 3.7. +To install Python, please check [here](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/installation.md#install-python). + +Our devkit is available and can be installed via [pip](https://pip.pypa.io/en/stable/installing/) : +``` +pip install nuscenes-devkit +``` +For an advanced installation, see [installation](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/installation.md) for detailed instructions. + +## nuImages +nuImages is a stand-alone large-scale image dataset. +It uses the same sensor setup as the 3d nuScenes dataset. +The structure is similar to nuScenes and both use the same devkit, which make the installation process simple. + +### nuImages setup +To download nuImages you need to go to the [Download page](https://www.nuscenes.org/download), +create an account and agree to the nuScenes [Terms of Use](https://www.nuscenes.org/terms-of-use). +For the devkit to work you will need to download *at least the metadata and samples*, the *sweeps* are optional. +Please unpack the archives to the `/data/sets/nuimages` folder \*without\* overwriting folders that occur in multiple archives. +Eventually you should have the following folder structure: +``` +/data/sets/nuimages + samples - Sensor data for keyframes (annotated images). + sweeps - Sensor data for intermediate frames (unannotated images). + v1.0-* - JSON tables that include all the meta data and annotations. Each split (train, val, test, mini) is provided in a separate folder. +``` +If you want to use another folder, specify the `dataroot` parameter of the NuImages class (see tutorial). 
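If the data lives in a different folder, the `dataroot` argument is the only thing that needs to change. The snippet below is a minimal sketch of loading the mini split; it assumes the devkit is already installed (see the devkit setup section) and that the archives were extracted as shown above, and it only uses the constructor arguments and tables documented in this repository.
```
# Minimal sketch: load nuImages from a custom folder and look up one annotated image.
from nuimages import NuImages

nuim = NuImages(version='v1.0-mini', dataroot='/data/sets/nuimages', lazy=True, verbose=True)
sample = nuim.sample[0]                                           # first annotated keyframe
key_camera = nuim.get('sample_data', sample['key_camera_token'])  # its keyframe image record
print(key_camera['filename'])                                     # relative path of the image
```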
+ +### Getting started with nuImages + +Please follow these steps to make yourself familiar with the nuImages dataset: +- Get the [nuscenes-devkit code](https://github.com/nutonomy/nuscenes-devkit). +- Run the tutorial using: +``` +jupyter notebook $HOME/nuscenes-devkit/python-sdk/tutorials/nuimages_tutorial.ipynb +``` +- See the [database schema](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/schema_nuimages.md) and [annotator instructions](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/instructions_nuimages.md). + +## nuScenes + +### nuScenes setup To download nuScenes you need to go to the [Download page](https://www.nuscenes.org/download), create an account and agree to the nuScenes [Terms of Use](https://www.nuscenes.org/terms-of-use). After logging in you will see multiple archives. @@ -42,7 +93,16 @@ Eventually you should have the following folder structure: ``` If you want to use another folder, specify the `dataroot` parameter of the NuScenes class (see tutorial). -## Prediction Challenge +### nuScenes-lidarseg +In August 2020 we published [nuScenes-lidarseg](https://www.nuscenes.org/nuscenes#lidarseg) which contains the semantic labels of the point clouds for the approximately 40,000 keyframes in nuScenes. +To install nuScenes-lidarseg, please follow these steps: +- Download the dataset from the [Download page](https://www.nuscenes.org/download), +- Extract the `lidarseg` and `v1.0-*` folders to your nuScenes root directory (e.g. `/data/sets/nuscenes/lidarseg`, `/data/sets/nuscenes/v1.0-*`). +- Get the latest version of the nuscenes-devkit. +- If you already have a previous version of the devkit, update the pip requirements (see [details](https://github.com/nutonomy/nuscenes-devkit/blob/master/setup/installation.md)): `pip install -r setup/requirements.txt` +- Get started with the [tutorial](https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/tutorials/nuscenes_lidarseg_tutorial.ipynb). + +### Prediction challenge In March 2020 we released code for the nuScenes prediction challenge. To get started: - Download the version 1.2 of the map expansion (see below). @@ -50,51 +110,41 @@ To get started: - Go through the [prediction tutorial](https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/tutorials/prediction_tutorial.ipynb). - For information on how submissions will be scored, visit the challenge [website](https://www.nuscenes.org/prediction). -## CAN bus expansion +### CAN bus expansion In February 2020 we published the CAN bus expansion. It contains low-level vehicle data about the vehicle route, IMU, pose, steering angle feedback, battery, brakes, gear position, signals, wheel speeds, throttle, torque, solar sensors, odometry and more. To install this expansion, please follow these steps: - Download the expansion from the [Download page](https://www.nuscenes.org/download), -- Move the can_bus folder to your nuScenes root directory (e.g. `/data/sets/nuscenes/can_bus`). +- Extract the can_bus folder to your nuScenes root directory (e.g. `/data/sets/nuscenes/can_bus`). - Get the latest version of the nuscenes-devkit. 
-- If you already have a previous version of the devkit, update the pip requirements (see [details](https://github.com/nutonomy/nuscenes-devkit/blob/master/setup/installation.md)): `pip install -r setup/requirements.txt` +- If you already have a previous version of the devkit, update the pip requirements (see [details](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/installation.md)): `pip install -r setup/requirements.txt` - Get started with the [CAN bus readme](https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/nuscenes/can_bus/README.md) or [tutorial](https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/tutorials/can_bus_tutorial.ipynb). -## Map expansion +### Map expansion In July 2019 we published a map expansion with 11 semantic layers (crosswalk, sidewalk, traffic lights, stop lines, lanes, etc.). To install this expansion, please follow these steps: - Download the expansion from the [Download page](https://www.nuscenes.org/download), -- Move the .json files to your nuScenes `maps` folder. +- Extract the .json files to your nuScenes `maps` folder. - Get the latest version of the nuscenes-devkit. -- If you already have a previous version of the devkit, update the pip requirements (see [details](https://github.com/nutonomy/nuscenes-devkit/blob/master/setup/installation.md)): `pip install -r setup/requirements.txt` +- If you already have a previous version of the devkit, update the pip requirements (see [details](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/installation.md)): `pip install -r setup/requirements.txt` - Get started with the [map expansion tutorial](https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/tutorials/map_expansion_tutorial.ipynb). -## Devkit setup -The devkit is tested for Python 3.6 and Python 3.7. -To install Python, please check [here](https://github.com/nutonomy/nuscenes-devkit/blob/master/setup/installation.md#install-python). - -Our devkit is available and can be installed via [pip](https://pip.pypa.io/en/stable/installing/) : -``` -pip install nuscenes-devkit -``` -For an advanced installation, see [installation](https://github.com/nutonomy/nuscenes-devkit/blob/master/setup/installation.md) for detailed instructions. - -## Getting started +### Getting started with nuScenes Please follow these steps to make yourself familiar with the nuScenes dataset: -- Read the [dataset description](https://www.nuscenes.org/overview). -- [Explore](https://www.nuscenes.org/explore/scene-0011/0) the lidar viewer and videos. +- Read the [dataset description](https://www.nuscenes.org/nuscenes#overview). +- [Explore](https://www.nuscenes.org/nuscenes#explore) the lidar viewer and videos. - [Download](https://www.nuscenes.org/download) the dataset. - Get the [nuscenes-devkit code](https://github.com/nutonomy/nuscenes-devkit). -- Read the [online tutorial](https://www.nuscenes.org/tutorial) or run it yourself using: +- Read the [online tutorial](https://www.nuscenes.org/nuscenes#tutorials) or run it yourself using: ``` -jupyter notebook $HOME/nuscenes-devkit/python-sdk/tutorials/nuscenes_basics_tutorial.ipynb +jupyter notebook $HOME/nuscenes-devkit/python-sdk/tutorials/nuscenes_tutorial.ipynb ``` - Read the [nuScenes paper](https://www.nuscenes.org/publications) for a detailed analysis of the dataset. - Run the [map expansion tutorial](https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/tutorials/map_expansion_tutorial.ipynb). 
- Take a look at the [experimental scripts](https://github.com/nutonomy/nuscenes-devkit/tree/master/python-sdk/nuscenes/scripts). - For instructions related to the object detection task (results format, classes and evaluation metrics), please refer to [this readme](https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/nuscenes/eval/detection/README.md). -- See the [database schema](https://github.com/nutonomy/nuscenes-devkit/blob/master/schema.md) and [annotator instructions](https://github.com/nutonomy/nuscenes-devkit/blob/master/instructions.md). -- See the [FAQs](https://github.com/nutonomy/nuscenes-devkit/blob/master/faqs.md). +- See the [database schema](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/schema_nuscenes.md) and [annotator instructions](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/instructions_nuscenes.md). +- See the [FAQs](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/faqs.md). ## Known issues Great care has been taken to collate the nuScenes dataset and many users have praised the quality of the data and annotations. @@ -109,7 +159,7 @@ However, some minor issues remain: - A small number of 3d bounding boxes is annotated despite the object being temporarily occluded. For this reason we make sure to **filter objects without lidar or radar points** in the nuScenes benchmarks. See [issue 366](https://github.com/nutonomy/nuscenes-devkit/issues/366). ## Citation -Please use the following citation when referencing [nuScenes](https://arxiv.org/abs/1903.11027): +Please use the following citation when referencing [nuScenes or nuImages](https://arxiv.org/abs/1903.11027): ``` @article{nuscenes2019, title={nuScenes: A multimodal dataset for autonomous driving}, diff --git a/faqs.md b/docs/faqs.md similarity index 87% rename from faqs.md rename to docs/faqs.md index 4204317b..539aeed9 100644 --- a/faqs.md +++ b/docs/faqs.md @@ -6,17 +6,14 @@ On this page we try to answer questions frequently asked by our users. - For issues and bugs *with the devkit*, file an issue on [Github](https://github.com/nutonomy/nuscenes-devkit/issues). - For any other questions, please post in the [nuScenes user forum](https://forum.nuscenes.org/). -- Can I use nuScenes for free? - - For non-commercial use [nuScenes is free](https://www.nuscenes.org/terms-of-use), e.g. for educational use and some research use. +- Can I use nuScenes and nuImages for free? + - For non-commercial use [nuScenes and nuImages are free](https://www.nuscenes.org/terms-of-use), e.g. for educational use and some research use. - For commercial use please contact [nuScenes@nuTonomy.com](mailto:nuScenes@nuTonomy.com). To allow startups to use our dataset, we adjust the pricing terms to the use case and company size. - How can I participate in the nuScenes challenges? - See the overview site for the [object detection challenge](https://www.nuscenes.org/object-detection). - See the overview site for the [tracking challenge](https://www.nuscenes.org/tracking). - -- What's next for nuScenes? - - Raw IMU & GPS data. - - Object detection, tracking and other challenges (see above). + - See the overview site for the [prediction challenge](https://www.nuscenes.org/prediction). - How can I get more information on the sensors used? - Read the [Data collection](https://www.nuscenes.org/data-collection) page. 
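As a complement to the nuScenes getting-started steps above, the sketch below shows what a first session with the devkit typically looks like. It is only an illustration: it assumes the v1.0-mini split at the default location, and `list_scenes` / `render_sample` are the devkit's high-level listing and rendering helpers, which the tutorial walks through in more detail.
```
# First steps with the nuScenes devkit (sketch; assumes the v1.0-mini split is installed).
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)
nusc.list_scenes()                      # overview of the scenes in this split
my_sample = nusc.sample[0]              # pick the first annotated keyframe
nusc.render_sample(my_sample['token'])  # render all camera, lidar and radar data for it
```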
diff --git a/setup/installation.md b/docs/installation.md similarity index 71% rename from setup/installation.md rename to docs/installation.md index 8301f385..645317c2 100644 --- a/setup/installation.md +++ b/docs/installation.md @@ -1,13 +1,14 @@ # Advanced Installation -We provide step-by-step instructions to install our devkit. +We provide step-by-step instructions to install our devkit. These instructions apply to both nuScenes and nuImages. - [Download](#download) - [Install Python](#install-python) - [Setup a Conda environment](#setup-a-conda-environment) - [Setup a virtualenvwrapper environment](#setup-a-virtualenvwrapper-environment) - [Setup PYTHONPATH](#setup-pythonpath) - [Install required packages](#install-required-packages) +- [Setup environment variable](#setup-environment-variable) +- [Setup Matplotlib backend](#setup-matplotlib-backend) - [Verify install](#verify-install) -- [Setup NUSCENES environment variable](#setup-nuscenes-environment-variable) ## Download @@ -36,7 +37,7 @@ An alternative to Conda is to use virtualenvwrapper, as described [below](#setup See the [official Miniconda page](https://conda.io/en/latest/miniconda.html). #### Setup a Conda environment -We create a new Conda environment named `nuscenes`. +We create a new Conda environment named `nuscenes`. We will use this environment for both nuScenes and nuImages. ``` conda create --name nuscenes python=3.7 ``` @@ -103,16 +104,33 @@ To install the required packages, run the following command in your favourite vi ``` pip install -r setup/requirements.txt ``` +**Note:** The requirements file is internally divided into base requirements (`base`) and requirements specific to certain products or challenges (`nuimages`, `prediction` and `tracking`). If you only plan to use a subset of the codebase, feel free to comment out the lines that you do not need. -## Verify install -To verify your environment run `python -m unittest` in the `python-sdk` folder. -You can also run `assert_download.py` in the `nuscenes/scripts` folder. - -## Setup NUSCENES environment variable +## Setup environment variable Finally, if you want to run the unit tests you need to point the devkit to the `nuscenes` folder on your disk. -Set the NUSCENES environment variable to point to your data folder, e.g. `/data/sets/nuscenes`: +Set the NUSCENES environment variable to point to your data folder: ``` export NUSCENES="/data/sets/nuscenes" ``` +or for NUIMAGES: +``` +export NUIMAGES="/data/sets/nuimages" +``` + +## Setup Matplotlib backend +When using Matplotlib, it is generally recommended to define the backend used for rendering: +1) Under Ubuntu the default backend `Agg` results in any plot not being rendered by default. This does not apply inside Jupyter notebooks. +2) Under MacOSX a call to `plt.plot()` may fail with the following error (see [here](https://github.com/matplotlib/matplotlib/issues/13414) for more details): + ``` + libc++abi.dylib: terminating with uncaught exception of type NSException + ``` +To set the backend, add the following to your `~/.matplotlib/matplotlibrc` file, which needs to be created if it does not exist yet: +``` +backend: TKAgg +``` + +## Verify install +To verify your environment run `python -m unittest` in the `python-sdk` folder. +You can also run `assert_download.py` in the `python-sdk/nuscenes/tests` and `python-sdk/nuimages/tests` folders to verify that all files are in the right place. -That's it you should be good to go! \ No newline at end of file +That's it you should be good to go! 
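If editing `~/.matplotlib/matplotlibrc` is inconvenient, the backend can also be selected per script. This is a plain Matplotlib mechanism rather than anything devkit-specific, and it has to run before `pyplot` is used for the first time; a short sketch:
```
# Select the Matplotlib backend programmatically (must happen before the first pyplot use).
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
```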
diff --git a/docs/instructions_lidarseg.md b/docs/instructions_lidarseg.md new file mode 100644 index 00000000..e4dd9492 --- /dev/null +++ b/docs/instructions_lidarseg.md @@ -0,0 +1,149 @@ +# nuScenes-lidarseg Annotator Instructions + +# Overview +- [Introduction](#introduction) +- [General Instructions](#general-instructions) +- [Detailed Instructions](#detailed-instructions) +- [Classes](#classes) + +# Introduction +In nuScenes-lidarseg, we annotate every point in the lidar pointcloud with a semantic label. +All the labels from nuScenes are carried over into nuScenes-lidarseg; in addition, more ["stuff" (background) classes](#classes) have been included. +Thus, nuScenes-lidarseg contains both foreground classes (pedestrians, vehicles, cyclists, etc.) and background classes (driveable surface, nature, buildings, etc.). + + +# General Instructions + - Label each point with a class. + - Use the camera images to facilitate, check and validate the labels. + - Each point belongs to only one class, i.e., one class per point. + + +# Detailed Instructions ++ **Extremities** such as vehicle doors, car mirrors and human limbs should be assigned the same label as the object. +Note that in contrast to the nuScenes 3d cuboids, the lidarseg labels include car mirrors and antennas. ++ **Minimum number of points** + + An object can have as little as **one** point. + In such cases, that point should only be labeled if it is certain that the point belongs to a class + (with additional verification by looking at the corresponding camera frame). + Otherwise, the point should be labeled as `static.other`. ++ **Other static object vs noise.** + + **Other static object:** Points that belong to some physical object, but are not defined in our taxonomy. + + **Noise:** Points that do not correspond to physical objects or surfaces in the environment + (e.g. noise, reflections, dust, fog, raindrops or smoke). ++ **Terrain vs other flat.** + + **Terrain:** Grass, all kinds of horizontal vegetation, soil or sand. These areas are not meant to be driven on. + This label includes a possibly delimiting curb. + Single grass stalks do not need to be annotated and get the label of the region they are growing on. + + Short bushes / grass with **heights of less than 20cm**, should be labeled as terrain. + Similarly, tall bushes / grass which are higher than 20cm should be labeled as vegetation. + + **Other flat:** Horizontal surfaces which cannot be classified as ground plane / sidewalk / terrain, e.g., water. ++ **Terrain vs sidewalk** + + **Terrain:** See above. + + **Sidewalk:** A sidewalk is a walkway designed for pedestrians and / or cyclists. Sidewalks are always paved. + + +# Classes +The following classes are in **addition** to the existing ones in nuScenes: + +| Label ID | Label | Short Description | +| --- | --- | --- | +| 0 | [`noise`](#1-noise-class-0) | Any lidar return that does not correspond to a physical object, such as dust, vapor, noise, fog, raindrops, smoke and reflections. | +| 24 | [`flat.driveable_surface`](#2-flatdriveable_surface-class-24) | All paved or unpaved surfaces that a car can drive on with no concern of traffic rules. | +| 25 | [`flat.sidewalk`](#3-flatsidewalk-class-25) | Sidewalk, pedestrian walkways, bike paths, etc. Part of the ground designated for pedestrians or cyclists. Sidewalks do **not** have to be next to a road. 
| +| 26 | [`flat.terrain`](#4-flatterrain-class-26) | Natural horizontal surfaces such as ground level horizontal vegetation (< 20 cm tall), grass, rolling hills, soil, sand and gravel. | +| 27 | [`flat.other`](#5-flatother-class-27) | All other forms of horizontal ground-level structures that do not belong to any of driveable_surface, curb, sidewalk and terrain. Includes elevated parts of traffic islands, delimiters, rail tracks, stairs with at most 3 steps and larger bodies of water (lakes, rivers). | +| 28 | [`static.manmade`](#6-staticmanmade-class-28) | Includes man-made structures but not limited to: buildings, walls, guard rails, fences, poles, drainages, hydrants, flags, banners, street signs, electric circuit boxes, traffic lights, parking meters and stairs with more than 3 steps. | +| 29 | [`static.vegetation`](#7-staticvegetation-class-29) | Any vegetation in the frame that is higher than the ground, including bushes, plants, potted plants, trees, etc. Only tall grass (> 20cm) is part of this, ground level grass is part of `flat.terrain`.| +| 30 | [`static.other`](#8-staticother-class-30) | Points in the background that are not distinguishable. Or objects that do not match any of the above labels. | +| 31 | [`vehicle.ego`](#9-vehicleego-class-31) | The vehicle on which the cameras, radar and lidar are mounted, that is sometimes visible at the bottom of the image. | + +## Examples of classes +Below are examples of the classes added in nuScenes-lidarseg. +For simplicity, we only show lidar points which are relevant to the class being discussed. + + +### 1. noise (class 0) +![noise_1](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/0_scene-0053_CAM_FRONT_LEFT_1532402428104844_crop.jpg) +![noise_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/0_scene-0163_CAM_FRONT_LEFT_1526915289904917_crop.jpg) +![noise_3](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/0_scene-0207_CAM_BACK_LEFT_1532621922197405_crop.jpg) +![noise_4](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/0_scene-0635_CAM_FRONT_1537296086862404_crop.jpg) + +[Top](#classes) + + +### 2. flat.driveable_surface (class 24) +![driveable_surface_1](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/24_206_CAM_BACK.jpg) +![driveable_surface_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/24_250_CAM_FRONT.jpg) +![driveable_surface_3](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/24_9750_CAM_FRONT.jpg) +![driveable_surface_4](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/24_10000_CAM_BACK.jpg) + +[Top](#classes) + + +### 3. flat.sidewalk (class 25) +![sidewalk_1](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/25_90_CAM_FRONT_LEFT.jpg) +![sidewalk_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/25_13250_CAM_FRONT_LEFT.jpg) +![sidewalk_3](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/25_280_CAM_FRONT_LEFT.jpg) +![sidewalk_4](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/25_680_CAM_FRONT_LEFT.jpg) + +[Top](#classes) + + +### 4. 
flat.terrain (class 26) +![terrain_1](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/26_11750_CAM_BACK_RIGHT.jpg) +![terrain_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/26_10700_CAM_BACK_LEFT.jpg) +![terrain_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/26_886_CAM_BACK_LEFT.jpg) +![terrain_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/26_1260_CAM_BACK_LEFT.jpg) + +[Top](#classes) + + +### 5. flat.other (class 27) +![flat_other_1](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/27_2318_CAM_FRONT.jpg) +![flat_other_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/27_3750_CAM_FRONT_RIGHT.jpg) +![flat_other_3](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/27_1230_CAM_FRONT.jpg) +![flat_other_4](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/27_1380_CAM_FRONT.jpg) + +[Top](#classes) + + +### 6. static.manmade (class 28) +![manmade_1](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/28_13850_CAM_FRONT.jpg) +![manmade_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/28_15550_CAM_FRONT.jpg) +![manmade_3](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/28_5009_CAM_FRONT.jpg) +![manmade_4](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/28_5501_CAM_BACK.jpg) + +[Top](#classes) + + +### 7. static.vegetation (class 29) +![vegetation_1](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/29_650_CAM_FRONT_LEFT.jpg) +![vegetation_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/29_3650_CAM_FRONT.jpg) +![vegetation_3](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/29_5610_CAM_BACK_RIGHT.jpg) +![vegetation_4](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/29_5960_CAM_FRONT_RIGHT.jpg) + +[Top](#classes) + + +### 8. static.other (class 30) +![static_other_1](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/30_scene-0031_CAM_BACK_LEFT_1531886230947423.jpg) +![static_other_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/30_scene-0032_CAM_BACK_RIGHT_1531886262027893.jpg) +![static_other_3](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/30_scene-0160_CAM_BACK_LEFT_1533115303947423.jpg) +![static_other_4](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/30_scene-0166_CAM_BACK_RIGHT_1526915380527813.jpg) + +[Top](#classes) + + +### 9. vehicle.ego (class 31) +Points on the ego vehicle generally arise due to self-occlusion, in which some lidar beams hit the ego vehicle. +When the pointcloud is projected into a chosen camera image, the devkit removes points which are less than +1m in front of the camera to prevent such points from cluttering the image. Thus, users will not see points +belonging to `vehicle.ego` projected onto the camera images when using the devkit. 
To give examples, of the +`vehicle.ego` class, the bird's eye view (BEV) is used instead: + +![ego_1](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/31_479_BEV.jpg) +![ego_2](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/31_11200_BEV.jpg) +![ego_3](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/31_14500_BEV.jpg) +![ego_4](https://www.nuscenes.org/public/images/taxonomy_imgs/lidarseg/31_24230_BEV.jpg) + +[Top](#classes) diff --git a/docs/instructions_nuimages.md b/docs/instructions_nuimages.md new file mode 100644 index 00000000..ba45bbfc --- /dev/null +++ b/docs/instructions_nuimages.md @@ -0,0 +1,160 @@ +# nuImages Annotator Instructions + +# Overview +- [Introduction](#introduction) +- [Objects](#objects) + - [Bounding Boxes](#bounding-boxes) + - [Instance Segmentation](#instance-segmentation) + - [Attributes](#attributes) +- [Surfaces](#surfaces) + - [Semantic Segmentation](#semantic-segmentation) + +# Introduction +In nuImages, we annotate objects with 2d boxes, instance masks and 2d segmentation masks. All the labels and attributes from nuScenes are carried over into nuImages. +We have also [added more attributes](#attributes) in nuImages. For segmentation, we have included ["stuff" (background) classes](#surfaces). + +# Objects +nuImages contains the [same object classes as nuScenes](https://github.com/nutonomy/nuscenes-devkit/tree/master/docs/instructions_nuscenes.md#labels), +while the [attributes](#attributes) are a superset of the [attributes in nuScenes](https://github.com/nutonomy/nuscenes-devkit/tree/master/docs/instructions_nuscenes.md#attributes). + +## Bounding Boxes +### General Instructions + - Draw bounding boxes around all objects that are in the list of [object classes](https://github.com/nutonomy/nuscenes-devkit/tree/master/docs/instructions_nuscenes.md#labels). + - Do not apply more than one box to a single object. + - If an object is occluded, then draw the bounding box to include the occluded part of the object according to your best guess. + +![bboxes_occlusion_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_occlusion_1.png) +![bboxes_occlusion_2](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_occlusion_2.png) + - If an object is cut off at the edge of the image, then the bounding box should stop at the image boundary. + - If an object is reflected clearly in a glass window, then the reflection should be annotated. + +![bboxes_reflection](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_reflection.png) + - If an object has extremities, the bounding box should include **all** the extremities (exceptions are the side view mirrors and antennas of vehicles). + Note that this differs [from how the instance masks are annotated](#instance-segmentation), in which the extremities are included in the masks. + +![bboxes_extremity_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_extremity_1.png) +![bboxes_extremity_2](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_extremity_2.png) + - Only label objects if the object is clear enough to be certain of what it is. If an object is so blurry it cannot be known, do not label the object. + - Do not label an object if its height is less than 10 pixels. + - Do not label an object if its less than 20% visible, unless you can confidently tell what the object is. 
+ An object can have low visibility when it is occluded or cut off by the image. + The clarity and orientation of the object does not influence its visibility. + +### Detailed Instructions + - `human.pedestrian.*` + - In nighttime images, annotate the pedestrian only when either the body part(s) of a person is clearly visible (leg, arm, head etc.), or the person is clearly in motion. + +![bboxes_pedestrian_nighttime_fp_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_pedestrian_nighttime_fp_1.png) +![bboxes_pedestrian_nighttime_fp_2](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_pedestrian_nighttime_fp_2.png) + - `vehicle.*` + - In nighttime images, annotate a vehicle only when a pair of lights is clearly visible (break or head or hazard lights), and it is clearly on the road surface. + +![bboxes_vehicle_nighttime_fp_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_vehicle_nighttime_fp_1.png) +![bboxes_vehicle_nighttime_fp_2](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_vehicle_nighttime_fp_2.png) +![bboxes_vehicle_nighttime_fn_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/bboxes_vehicle_nighttime_fn_1.png) + +[Top](#overview) + +## Instance Segmentation +### General Instructions + - Given a bounding box, outline the **visible** parts of the object enclosed within the bounding box using a polygon. + - Each pixel on the image should be assigned to at most one object instance (i.e. the polygons should not overlap). + - There should not be a discrepancy of more than 2 pixels between the edge of the object instance and the polygon. + - If an object is occluded by another object whose width is less than 5 pixels (e.g. a thin fence), then the external object can be included in the polygon. + +![instanceseg_occlusion5pix_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/instanceseg_occlusion5pix_1.png) + - If an object is loosely covered by another object (e.g. branches, bushes), do not create several polygons for visible areas that are less than 15 pixels in diameter. + +![instanceseg_covered](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/instanceseg_covered.png) + - If an object enclosed by the bounding box is occluded by another foreground object but has a visible area through a glass window (like for cars / vans / trucks), + do not create a polygon on that visible area. + +![instanceseg_hole_another_object](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/instanceseg_hole_another_object.png) + - If an object has a visible area through a hole of another foreground object, create a polygon on the visible area. + Exemptions would be holes from bicycle / motorcycles / bike racks and holes that are less than 15 pixels diameter. + +![instanceseg_hole_another_object_exempt](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/instanceseg_hole_another_object_exempt.png) + - If a static / moveable object has another object attached to it (signboard, rope), include it in the annotation. + +![instanceseg_attached_object_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/instanceseg_attached_object_1.png) + - If parts of an object are not visible due to lighting and / or shadow, it is best to have an educated guess on the non-visible areas of the object. 
+ +![instanceseg_guess](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/instanceseg_guess.png) + - If an object is reflected clearly in a glass window, then the reflection should be annotated. + +![instanceseg_reflection](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/instanceseg_reflection.png) + +### Detailed Instructions + - `vehicle.*` + - Include extremities (e.g. side view mirrors, taxi heads, police sirens, etc.); exceptions are the crane arms on construction vehicles. + +![instanceseg_extremity](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/instanceseg_extremity.png) +![instanceseg_extremity_exempt](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/instanceseg_extremity_exempt.png) + - `static_object.bicycle_rack` + - All bicycles in a bicycle rack should be annotated collectively as bicycle rack. + - **Note:** A previous version of this taxonomy did not include bicycle racks and therefore some images are missing bicycle rack annotations. We leave this class in the dataset, as it is merely an ignore label. The ignore label is used to avoid punishing false positives or false negatives on bicycle racks, where individual bicycles are difficult to identify. + +[Top](#overview) + +## Attributes +In nuImages, each object comes with a box, a mask and a set of attributes. +The following attributes are in **addition** to the [existing ones in nuScenes]((https://github.com/nutonomy/nuscenes-devkit/tree/master/docs/instructions_nuscenes.md#attributes)): + +| Attribute | Short Description | +| --- | --- | +| vehicle_light.emergency.flashing | The emergency lights on the vehicle are flashing. | +| vehicle_light.emergency.not_flashing | The emergency lights on the vehicle are not flashing. | +| vertical_position.off_ground | The object is not in the ground (e.g. it is flying, falling, jumping or positioned in a tree or on a vehicle). | +| vertical_position.on_ground | The object is on the ground plane. | + +[Top](#overview) + + +# Surfaces +nuImages includes surface classes as well: + +| Label | Short Description | +| --- | --- | +| [`flat.driveable_surface`](#1-flatdriveable_surface) | All paved or unpaved surfaces that a car can drive on with no concern of traffic rules. | +| [`vehicle.ego`](#2-vehicleego) | The vehicle on which the sensors are mounted, that are sometimes visible at the bottom of the image. | + +### 1. flat.driveable_surface +![driveable_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/driveable_1.png) +![driveable_2](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/driveable_2.png) +![driveable_3](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/driveable_3.png) +![driveable_4](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/driveable_4.png) + +### 2. vehicle.ego +![ego_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/ego_1.png) +![ego_2](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/ego_2.png) +![ego_3](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/ego_3.png) +![ego_4](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/ego_4.png) + +## Semantic Segmentation +### General Instructions + - Only annotate a surface if its length and width are **both** greater than 20 pixels. + - Annotations should tightly bound the edges of the area(s) of interest. 
+ +![surface_no_gaps](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/surface_no_gaps.png) + - If two areas/objects of interest are adjacent to each other, there should be no gap between the two annotations. + +![surface_adjacent](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/surface_adjacent.png) + - Annotate a surface only as far as it is clearly visible. + +![surface_far_visible](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/surface_far_visible.png) + - If a surface is occluded (e.g. by branches, trees, fence poles), only annotate the visible areas (which are more than 20 pixels in length and width). + +![surface_occlusion_2](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/surface_occlusion_2.png) + - If a surface is covered by dirt or snow of less than 20 cm in height, include the dirt or snow in the annotation (since it can be safely driven over). + +![surface_snow](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/surface_snow.png) + - If a surface has puddles in it, always include them in the annotation. + - Do not annotate reflections of surfaces. + +### Detailed Instructions + - `flat.driveable_surface` + - Include surfaces blocked by road blockers or pillars as long as they are the same surface as the driveable surface. + +![surface_occlusion_1](https://www.nuscenes.org/public/images/taxonomy_imgs/nuimages/correct-wrong/surface_occlusion_1.png) + +[Top](#overview) diff --git a/instructions.md b/docs/instructions_nuscenes.md similarity index 100% rename from instructions.md rename to docs/instructions_nuscenes.md diff --git a/docs/schema_nuimages.md b/docs/schema_nuimages.md new file mode 100644 index 00000000..1ea5e546 --- /dev/null +++ b/docs/schema_nuimages.md @@ -0,0 +1,162 @@ +nuImages schema +========== +This document describes the database schema used in nuImages. +All annotations and meta data (including calibration, maps, vehicle coordinates etc.) are covered in a relational database. +The database tables are listed below. +Every row can be identified by its unique primary key `token`. +Foreign keys such as `sample_token` may be used to link to the `token` of the table `sample`. +Please refer to the [tutorial](https://www.nuscenes.org/nuimages#tutorials) for an introduction to the most important database tables. + +![](https://www.nuscenes.org/public/images/nuimages-schema.svg) + +attribute +--------- +An attribute is a property of an instance that can change while the category remains the same. +Example: a vehicle being parked/stopped/moving, and whether or not a bicycle has a rider. +The attributes in nuImages are a superset of those in nuScenes. +``` +attribute { + "token": -- Unique record identifier. + "name": -- Attribute name. + "description": -- Attribute description. +} +``` + +calibrated_sensor +--------- +Definition of a particular camera as calibrated on a particular vehicle. +All extrinsic parameters are given with respect to the ego vehicle body frame. +Contrary to nuScenes, all camera images come distorted and unrectified. +``` +calibrated_sensor { + "token": -- Unique record identifier. + "sensor_token": -- Foreign key pointing to the sensor type. + "translation": [3] -- Coordinate system origin in meters: x, y, z. + "rotation": [4] -- Coordinate system orientation as quaternion: w, x, y, z. + "camera_intrinsic": [3, 3] -- Intrinsic camera calibration. Empty for sensors that are not cameras. 
+ "camera_distortion": [5 or 6] -- Camera calibration parameters. We use the 5 parameter camera convention of the CalTech camera calibration toolbox, that is also used in OpenCV. Only for fish-eye lenses in CAM_BACK do we use the 6th parameter. +} +``` + +category +--------- +Taxonomy of object categories (e.g. vehicle, human). +Subcategories are delineated by a period (e.g. `human.pedestrian.adult`). +The categories in nuImages are the same as in nuScenes (w/o lidarseg), plus `flat.driveable_surface`. +``` +category { + "token": -- Unique record identifier. + "name": -- Category name. Subcategories indicated by period. + "description": -- Category description. +} +``` + +ego_pose +--------- +Ego vehicle pose at a particular timestamp. Given with respect to global coordinate system of the log's map. +The ego_pose is the output of a lidar map-based localization algorithm described in our paper. +The localization is 2-dimensional in the x-y plane. +Warning: nuImages is collected from almost 500 logs with different maps versions. +Therefore the coordinates **should not be compared across logs** or rendered on the semantic maps of nuScenes. +``` +ego_pose { + "token": -- Unique record identifier. + "translation": [3] -- Coordinate system origin in meters: x, y, z. Note that z is always 0. + "rotation": [4] -- Coordinate system orientation as quaternion: w, x, y, z. + "timestamp": -- Unix time stamp. + "rotation_rate": [3] -- The angular velocity vector (x, y, z) of the vehicle in rad/s. This is expressed in the ego vehicle frame. + "acceleration": [3] -- Acceleration vector (x, y, z) in the ego vehicle frame in m/s/s. The z value is close to the gravitational acceleration `g = 9.81 m/s/s`. + "speed": -- The speed of the ego vehicle in the driving direction in m/s. +} +``` + +log +--------- +Information about the log from which the data was extracted. +``` +log { + "token": -- Unique record identifier. + "logfile": -- Log file name. + "vehicle": -- Vehicle name. + "date_captured": -- Date (YYYY-MM-DD). + "location": -- Area where log was captured, e.g. singapore-onenorth. +} +``` + +object_ann +--------- +The annotation of a foreground object (car, bike, pedestrian) in an image. +Each foreground object is annotated with a 2d box, a 2d instance mask and category-specific attributes. +``` +object_ann { + "token": -- Unique record identifier. + "sample_data_token": -- Foreign key pointing to the sample data, which must be a keyframe image. + "category_token": -- Foreign key pointing to the object category. + "attribute_tokens": [n] -- Foreign keys. List of attributes for this annotation. + "bbox": [4] -- Annotated amodal bounding box. Given as [xmin, ymin, xmax, ymax]. + "mask": -- Run length encoding of instance mask using the pycocotools package. +} +``` + +sample_data +--------- +Sample_data contains the images and information about when they were captured. +Sample_data covers all images, regardless of whether they are a keyframe or not. +Only keyframes are annotated. +For every keyframe, we also include up to 6 past and 6 future sweeps at 2 Hz. +We can navigate between consecutive images using the `prev` and `next` pointers. +The sample timestamp is inherited from the keyframe camera sample_data timestamp. +``` +sample_data { + "token": -- Unique record identifier. + "sample_token": -- Foreign key. Sample to which this sample_data is associated. + "ego_pose_token": -- Foreign key. + "calibrated_sensor_token": -- Foreign key. + "filename": -- Relative path to data-blob on disk. 
+ "fileformat": -- Data file format. + "width": -- If the sample data is an image, this is the image width in pixels. + "height": -- If the sample data is an image, this is the image height in pixels. + "timestamp": -- Unix time stamp. + "is_key_frame": -- True if sample_data is part of key_frame, else False. + "next": -- Foreign key. Sample data from the same sensor that follows this in time. Empty if end of scene. + "prev": -- Foreign key. Sample data from the same sensor that precedes this in time. Empty if start of scene. +} +``` + +sample +--------- +A sample is an annotated keyframe selected from a large pool of images in a log. +Every sample has up to 13 camera sample_datas corresponding to it. +These include the keyframe, which can be accessed via `key_camera_token`. +``` +sample { + "token": -- Unique record identifier. + "timestamp": -- Unix time stamp. + "log_token": -- Foreign key pointing to the log. + "key_camera_token": -- Foreign key of the sample_data corresponding to the camera keyframe. +} +``` + +sensor +--------- +A specific sensor type. +``` +sensor { + "token": -- Unique record identifier. + "channel": -- Sensor channel name. + "modality": -- Sensor modality. Always "camera" in nuImages. +} +``` + +surface_ann +--------- +The annotation of a background object (driveable surface) in an image. +Each background object is annotated with a 2d semantic segmentation mask. +``` +surface_ann { + "token": -- Unique record identifier. + "sample_data_token": -- Foreign key pointing to the sample data, which must be a keyframe image. + "category_token": -- Foreign key pointing to the surface category. + "mask": -- Run length encoding of segmentation mask using the pycocotools package. +} +``` diff --git a/schema.md b/docs/schema_nuscenes.md similarity index 86% rename from schema.md rename to docs/schema_nuscenes.md index d687e41b..e69415ca 100644 --- a/schema.md +++ b/docs/schema_nuscenes.md @@ -1,30 +1,18 @@ -Database schema +nuScenes schema ========== This document describes the database schema used in nuScenes. All annotations and meta data (including calibration, maps, vehicle coordinates etc.) are covered in a relational database. The database tables are listed below. Every row can be identified by its unique primary key `token`. Foreign keys such as `sample_token` may be used to link to the `token` of the table `sample`. -Please refer to the [tutorial](https://www.nuscenes.org/tutorial) for an introduction to the most important database tables. +Please refer to the [tutorial](https://www.nuscenes.org/nuimages#tutorial) for an introduction to the most important database tables. +![](https://www.nuscenes.org/public/images/nuscenes-schema.svg) -category ---------- - -Taxonomy of object categories (e.g. vehicle, human). -Subcategories are delineated by a period (e.g. human.pedestrian.adult). -``` -category { - "token": -- Unique record identifier. - "name": -- Category name. Subcategories indicated by period. - "description": -- Category description. -} -``` attribute --------- - An attribute is a property of an instance that can change while the category remains the same. - Example: a vehicle being parked/stopped/moving, and whether or not a bicycle has a rider. +Example: a vehicle being parked/stopped/moving, and whether or not a bicycle has a rider. ``` attribute { "token": -- Unique record identifier. @@ -32,46 +20,9 @@ attribute { "description": -- Attribute description. 
} ``` -visibility ---------- - -The visibility of an instance is the fraction of annotation visible in all 6 images. Binned into 4 bins 0-40%, 40-60%, 60-80% and 80-100%. -``` -visibility { - "token": -- Unique record identifier. - "level": -- Visibility level. - "description": -- Description of visibility level. -} -``` -instance ---------- - -An object instance, e.g. particular vehicle. -This table is an enumeration of all object instances we observed. -Note that instances are not tracked across scenes. -``` -instance { - "token": -- Unique record identifier. - "category_token": -- Foreign key. Object instance category. - "nbr_annotations": -- Number of annotations of this instance. - "first_annotation_token": -- Foreign key. Points to the first annotation of this instance. - "last_annotation_token": -- Foreign key. Points to the last annotation of this instance. -} -``` -sensor ---------- -A specific sensor type. -``` -sensor { - "token": -- Unique record identifier. - "channel": -- Sensor channel name. - "modality": {camera, lidar, radar} -- Sensor modality. Supports category(ies) in brackets. -} -``` calibrated_sensor --------- - Definition of a particular sensor (lidar/radar/camera) as calibrated on a particular vehicle. All extrinsic parameters are given with respect to the ego vehicle body frame. All camera images come undistorted and rectified. @@ -84,9 +35,22 @@ calibrated_sensor { "camera_intrinsic": [3, 3] -- Intrinsic camera calibration. Empty for sensors that are not cameras. } ``` -ego_pose + +category --------- +Taxonomy of object categories (e.g. vehicle, human). +Subcategories are delineated by a period (e.g. `human.pedestrian.adult`). +``` +category { + "token": -- Unique record identifier. + "name": -- Category name. Subcategories indicated by period. + "description": -- Category description. + "index": -- The index of the label used for efficiency reasons in the .bin label files of nuScenes-lidarseg. This field did not exist previously. +} +``` +ego_pose +--------- Ego vehicle pose at a particular timestamp. Given with respect to global coordinate system of the log's map. The ego_pose is the output of a lidar map-based localization algorithm described in our paper. The localization is 2-dimensional in the x-y plane. @@ -98,9 +62,35 @@ ego_pose { "timestamp": -- Unix time stamp. } ``` -log + +instance +--------- +An object instance, e.g. particular vehicle. +This table is an enumeration of all object instances we observed. +Note that instances are not tracked across scenes. +``` +instance { + "token": -- Unique record identifier. + "category_token": -- Foreign key pointing to the object category. + "nbr_annotations": -- Number of annotations of this instance. + "first_annotation_token": -- Foreign key. Points to the first annotation of this instance. + "last_annotation_token": -- Foreign key. Points to the last annotation of this instance. +} +``` + +lidarseg --------- +Mapping between nuScenes-lidarseg annotations and sample_datas corresponding to the lidar pointcloud associated with a keyframe. +``` +lidarseg { + "token": -- Unique record identifier. + "filename": -- The name of the .bin files containing the nuScenes-lidarseg labels. These are numpy arrays of uint8 stored in binary format using numpy. + "sample_data_token": -- Foreign key. Sample_data corresponding to the annotated lidar pointcloud with is_key_frame=True. +} +``` +log +--------- Information about the log from which the data was extracted. 
``` log { @@ -111,27 +101,23 @@ log { "location": -- Area where log was captured, e.g. singapore-onenorth. } ``` -scene ---------- -A scene is a 20s long sequence of consecutive frames extracted from a log. -Multiple scenes can come from the same log. -Note that object identities (instance tokens) are not preserved across scenes. +map +--------- +Map data that is stored as binary semantic masks from a top-down view. ``` -scene { +map { "token": -- Unique record identifier. - "name": -- Short string identifier. - "description": -- Longer description of the scene. - "log_token": -- Foreign key. Points to log from where the data was extracted. - "nbr_samples": -- Number of samples in this scene. - "first_sample_token": -- Foreign key. Points to the first sample in scene. - "last_sample_token": -- Foreign key. Points to the last sample in scene. + "log_tokens": [n] -- Foreign keys. + "category": -- Map category, currently only semantic_prior for drivable surface and sidewalk. + "filename": -- Relative path to the file with the map mask. } ``` + sample --------- - -A sample is data collected at (approximately) the same timestamp as part of a single LIDAR sweep. +A sample is an annotated keyframe at 2 Hz. +The data is collected at (approximately) the same timestamp as part of a single LIDAR sweep. ``` sample { "token": -- Unique record identifier. @@ -141,9 +127,30 @@ sample { "prev": -- Foreign key. Sample that precedes this in time. Empty if start of scene. } ``` -sample_data + +sample_annotation --------- +A bounding box defining the position of an object seen in a sample. +All location data is given with respect to the global coordinate system. +``` +sample_annotation { + "token": -- Unique record identifier. + "sample_token": -- Foreign key. NOTE: this points to a sample NOT a sample_data since annotations are done on the sample level taking all relevant sample_data into account. + "instance_token": -- Foreign key. Which object instance is this annotating. An instance can have multiple annotations over time. + "attribute_tokens": [n] -- Foreign keys. List of attributes for this annotation. Attributes can change over time, so they belong here, not in the instance table. + "visibility_token": -- Foreign key. Visibility may also change over time. If no visibility is annotated, the token is an empty string. + "translation": [3] -- Bounding box location in meters as center_x, center_y, center_z. + "size": [3] -- Bounding box size in meters as width, length, height. + "rotation": [4] -- Bounding box orientation as quaternion: w, x, y, z. + "num_lidar_pts": -- Number of lidar points in this box. Points are counted during the lidar sweep identified with this sample. + "num_radar_pts": -- Number of radar points in this box. Points are counted during the radar sweep identified with this sample. This number is summed across all radar sensors without any invalid point filtering. + "next": -- Foreign key. Sample annotation from the same object instance that follows this in time. Empty if this is the last annotation for this object. + "prev": -- Foreign key. Sample annotation from the same object instance that precedes this in time. Empty if this is the first annotation for this object. +} +``` +sample_data +--------- A sensor data e.g. image, point cloud or radar return. For sample_data with is_key_frame=True, the time-stamps should be very close to the sample it points to. For non key-frames the sample_data points to the sample that follows closest in time. 
@@ -163,36 +170,42 @@ sample_data { "prev": -- Foreign key. Sample data from the same sensor that precedes this in time. Empty if start of scene. } ``` -sample_annotation ---------- -A bounding box defining the position of an object seen in a sample. -All location data is given with respect to the global coordinate system. +scene +--------- +A scene is a 20s long sequence of consecutive frames extracted from a log. +Multiple scenes can come from the same log. +Note that object identities (instance tokens) are not preserved across scenes. ``` -sample_annotation { +scene { "token": -- Unique record identifier. - "sample_token": -- Foreign key. NOTE: this points to a sample NOT a sample_data since annotations are done on the sample level taking all relevant sample_data into account. - "instance_token": -- Foreign key. Which object instance is this annotating. An instance can have multiple annotations over time. - "attribute_tokens": [n] -- Foreign keys. List of attributes for this annotation. Attributes can change over time, so they belong here, not in the object table. - "visibility_token": -- Foreign key. Visibility may also change over time. If no visibility is annotated, the token is an empty string. - "translation": [3] -- Bounding box location in meters as center_x, center_y, center_z. - "size": [3] -- Bounding box size in meters as width, length, height. - "rotation": [4] -- Bounding box orientation as quaternion: w, x, y, z. - "num_lidar_pts": -- Number of lidar points in this box. Points are counted during the lidar sweep identified with this sample. - "num_radar_pts": -- Number of radar points in this box. Points are counted during the radar sweep identified with this sample. This number is summed across all radar sensors without any invalid point filtering. - "next": -- Foreign key. Sample annotation from the same object instance that follows this in time. Empty if this is the last annotation for this object. - "prev": -- Foreign key. Sample annotation from the same object instance that precedes this in time. Empty if this is the first annotation for this object. + "name": -- Short string identifier. + "description": -- Longer description of the scene. + "log_token": -- Foreign key. Points to log from where the data was extracted. + "nbr_samples": -- Number of samples in this scene. + "first_sample_token": -- Foreign key. Points to the first sample in scene. + "last_sample_token": -- Foreign key. Points to the last sample in scene. } ``` -map + +sensor --------- +A specific sensor type. +``` +sensor { + "token": -- Unique record identifier. + "channel": -- Sensor channel name. + "modality": {camera, lidar, radar} -- Sensor modality. Supports category(ies) in brackets. +} +``` -Map data that is stored as binary semantic masks from a top-down view. +visibility +--------- +The visibility of an instance is the fraction of annotation visible in all 6 images. Binned into 4 bins 0-40%, 40-60%, 60-80% and 80-100%. ``` -map { +visibility { "token": -- Unique record identifier. - "log_tokens": [n] -- Foreign keys. - "category": -- Map category, currently only semantic_prior for drivable surface and sidewalk. - "filename": -- Relative path to the file with the map mask. + "level": -- Visibility level. + "description": -- Description of visibility level. 
} ``` diff --git a/python-sdk/nuimages/__init__.py b/python-sdk/nuimages/__init__.py new file mode 100644 index 00000000..0010d970 --- /dev/null +++ b/python-sdk/nuimages/__init__.py @@ -0,0 +1 @@ +from .nuimages import NuImages diff --git a/python-sdk/nuimages/export/export_release.py b/python-sdk/nuimages/export/export_release.py new file mode 100644 index 00000000..c246adae --- /dev/null +++ b/python-sdk/nuimages/export/export_release.py @@ -0,0 +1,66 @@ +# nuScenes dev-kit. +# Code written by Holger Caesar, 2020. + +import fire +import os +import json +import tarfile +from typing import List + + +def export_release(dataroot='/data/sets/nuimages', version: str = 'v1.0') -> None: + """ + This script tars the image and metadata files for release on https://www.nuscenes.org/download. + :param dataroot: The nuImages folder. + :param version: The nuImages dataset version. + """ + # Create export folder. + export_dir = os.path.join(dataroot, 'export') + if not os.path.isdir(export_dir): + os.makedirs(export_dir) + + # Determine the images from the mini split. + mini_src = os.path.join(dataroot, version + '-mini') + with open(os.path.join(mini_src, 'sample_data.json'), 'r') as f: + sample_data = json.load(f) + file_names = [sd['filename'] for sd in sample_data] + + # Hard-code the mapping from archive names to their relative folder paths. + archives = { + 'all-metadata': [version + '-train', version + '-val', version + '-test', version + '-mini'], + 'all-samples': ['samples'], + 'all-sweeps-cam-back': ['sweeps/CAM_BACK'], + 'all-sweeps-cam-back-left': ['sweeps/CAM_BACK_LEFT'], + 'all-sweeps-cam-back-right': ['sweeps/CAM_BACK_RIGHT'], + 'all-sweeps-cam-front': ['sweeps/CAM_FRONT'], + 'all-sweeps-cam-front-left': ['sweeps/CAM_FRONT_LEFT'], + 'all-sweeps-cam-front-right': ['sweeps/CAM_FRONT_RIGHT'], + 'mini': [version + '-mini'] + file_names + } + + # Pack each folder. + for key, folder_list in archives.items(): + out_path = os.path.join(export_dir, 'nuimages-%s-%s.tgz' % (version, key)) + if os.path.exists(out_path): + print('Warning: Skipping export for file as it already exists: %s' % out_path) + continue + print('Compressing archive %s...' % out_path) + pack_folder(out_path, dataroot, folder_list) + + +def pack_folder(out_path: str, dataroot: str, folder_list: List[str], tar_format: str = 'w:gz') -> None: + """ + :param out_path: The output path where we write the tar file. + :param dataroot: The nuImages folder. + :param folder_list: List of files or folders to include in the archive. + :param tar_format: The compression format to use. See tarfile package for more options. + """ + tar = tarfile.open(out_path, tar_format) + for name in folder_list: + folder_path = os.path.join(dataroot, name) + tar.add(folder_path, arcname=name) + tar.close() + + +if __name__ == '__main__': + fire.Fire(export_release) diff --git a/python-sdk/nuimages/nuimages.py b/python-sdk/nuimages/nuimages.py new file mode 100644 index 00000000..3a660cd6 --- /dev/null +++ b/python-sdk/nuimages/nuimages.py @@ -0,0 +1,774 @@ +# nuScenes dev-kit. +# Code written by Asha Asvathaman & Holger Caesar, 2020. 
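+#
+# Example usage (a hedged sketch, not part of the official tutorial; the dataroot below is only the default
+# install location suggested in the README and may differ on your system):
+#
+#     from nuimages import NuImages
+#     nuim = NuImages(version='v1.0-mini', dataroot='/data/sets/nuimages', lazy=True, verbose=False)
+#     nuim.list_categories()                                  # Print per-category annotation counts.
+#     sample = nuim.sample[0]                                  # Pick an arbitrary annotated sample.
+#     nuim.render_image(sample['key_camera_token'], annotation_type='all')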
+ +import json +import os.path as osp +import sys +import time +from collections import defaultdict +from typing import Any, List, Dict, Optional, Tuple, Callable + +import matplotlib.pyplot as plt +import numpy as np +from PIL import Image, ImageDraw +from pyquaternion import Quaternion + +from nuimages.utils.utils import annotation_name, mask_decode, get_font, name_to_index_mapping +from nuscenes.utils.color_map import get_colormap + +PYTHON_VERSION = sys.version_info[0] + +if not PYTHON_VERSION == 3: + raise ValueError("nuScenes dev-kit only supports Python version 3.") + + +class NuImages: + """ + Database class for nuImages to help query and retrieve information from the database. + """ + + def __init__(self, + version: str = 'v1.0-mini', + dataroot: str = '/data/sets/nuimages', + lazy: bool = True, + verbose: bool = False): + """ + Loads database and creates reverse indexes and shortcuts. + :param version: Version to load (e.g. "v1.0-train", "v1.0-val", "v1.0-test", "v1.0-mini"). + :param dataroot: Path to the tables and data. + :param lazy: Whether to use lazy loading for the database tables. + :param verbose: Whether to print status messages during load. + """ + self.version = version + self.dataroot = dataroot + self.lazy = lazy + self.verbose = verbose + + self.table_names = ['attribute', 'calibrated_sensor', 'category', 'ego_pose', 'log', 'object_ann', 'sample', + 'sample_data', 'sensor', 'surface_ann'] + + assert osp.exists(self.table_root), 'Database version not found: {}'.format(self.table_root) + + start_time = time.time() + if verbose: + print("======\nLoading nuImages tables for version {}...".format(self.version)) + + # Init reverse indexing. + self._token2ind: Dict[str, Optional[dict]] = dict() + for table in self.table_names: + self._token2ind[table] = None + + # Load tables directly if requested. + if not self.lazy: + # Explicitly init tables to help the IDE determine valid class members. + self.attribute = self.__load_table__('attribute') + self.calibrated_sensor = self.__load_table__('calibrated_sensor') + self.category = self.__load_table__('category') + self.ego_pose = self.__load_table__('ego_pose') + self.log = self.__load_table__('log') + self.object_ann = self.__load_table__('object_ann') + self.sample = self.__load_table__('sample') + self.sample_data = self.__load_table__('sample_data') + self.sensor = self.__load_table__('sensor') + self.surface_ann = self.__load_table__('surface_ann') + + self.color_map = get_colormap() + + if verbose: + print("Done loading in {:.3f} seconds (lazy={}).\n======".format(time.time() - start_time, self.lazy)) + + # ### Internal methods. ### + + def __getattr__(self, attr_name: str) -> Any: + """ + Implement lazy loading for the database tables. Otherwise throw the default error. + :param attr_name: The name of the variable to look for. + :return: The dictionary that represents that table. + """ + if attr_name in self.table_names: + return self._load_lazy(attr_name, lambda tab_name: self.__load_table__(tab_name)) + else: + raise AttributeError("Error: %r object has no attribute %r" % (self.__class__.__name__, attr_name)) + + def get(self, table_name: str, token: str) -> dict: + """ + Returns a record from table in constant runtime. + :param table_name: Table name. + :param token: Token of the record. + :return: Table record. See README.md for record details for each table. 
+ """ + assert table_name in self.table_names, "Table {} not found".format(table_name) + + return getattr(self, table_name)[self.getind(table_name, token)] + + def getind(self, table_name: str, token: str) -> int: + """ + This returns the index of the record in a table in constant runtime. + :param table_name: Table name. + :param token: Token of the record. + :return: The index of the record in table, table is an array. + """ + # Lazy loading: Compute reverse indices. + if self._token2ind[table_name] is None: + self._token2ind[table_name] = dict() + for ind, member in enumerate(getattr(self, table_name)): + self._token2ind[table_name][member['token']] = ind + + return self._token2ind[table_name][token] + + @property + def table_root(self) -> str: + """ + Returns the folder where the tables are stored for the relevant version. + """ + return osp.join(self.dataroot, self.version) + + def load_tables(self, table_names: List[str]) -> None: + """ + Load tables and add them to self, if not already loaded. + :param table_names: The names of the nuImages tables to be loaded. + """ + for table_name in table_names: + self._load_lazy(table_name, lambda tab_name: self.__load_table__(tab_name)) + + def _load_lazy(self, attr_name: str, loading_func: Callable) -> Any: + """ + Load an attribute and add it to self, if it isn't already loaded. + :param attr_name: The name of the attribute to be loaded. + :param loading_func: The function used to load it if necessary. + :return: The loaded attribute. + """ + if attr_name in self.__dict__.keys(): + return self.__getattribute__(attr_name) + else: + attr = loading_func(attr_name) + self.__setattr__(attr_name, attr) + return attr + + def __load_table__(self, table_name) -> List[dict]: + """ + Load a table and return it. + :param table_name: The name of the table to load. + :return: The table dictionary. + """ + start_time = time.time() + table_path = osp.join(self.table_root, '{}.json'.format(table_name)) + assert osp.exists(table_path), 'Error: Table %s does not exist!' % table_name + with open(table_path) as f: + table = json.load(f) + end_time = time.time() + + # Print a message to stdout. + if self.verbose: + print("Loaded {} {}(s) in {:.3f}s,".format(len(table), table_name, end_time - start_time)) + + return table + + def shortcut(self, src_table: str, tgt_table: str, src_token: str) -> Dict[str, Any]: + """ + Convenience function to navigate between different tables that have one-to-one relations. + E.g. we can use this function to conveniently retrieve the sensor for a sample_data. + :param src_table: The name of the source table. + :param tgt_table: The name of the target table. + :param src_token: The source token. + :return: The entry of the destination table corresponding to the source token. + """ + if src_table == 'sample_data' and tgt_table == 'sensor': + sample_data = self.get('sample_data', src_token) + calibrated_sensor = self.get('calibrated_sensor', sample_data['calibrated_sensor_token']) + sensor = self.get('sensor', calibrated_sensor['sensor_token']) + + return sensor + elif (src_table == 'object_ann' or src_table == 'surface_ann') and tgt_table == 'sample': + src = self.get(src_table, src_token) + sample_data = self.get('sample_data', src['sample_data_token']) + sample = self.get('sample', sample_data['sample_token']) + + return sample + else: + raise Exception('Error: Shortcut from %s to %s not implemented!' 
% (src_table, tgt_table)) + + def check_sweeps(self, filename: str) -> None: + """ + Check that the sweeps folder was downloaded if required. + :param filename: The filename of the sample_data. + """ + assert filename.startswith('samples') or filename.startswith('sweeps'), \ + 'Error: You passed an incorrect filename to check_sweeps(). Please use sample_data[''filename''].' + + if 'sweeps' in filename: + sweeps_dir = osp.join(self.dataroot, 'sweeps') + if not osp.isdir(sweeps_dir): + raise Exception('Error: You are missing the "%s" directory! The devkit generally works without this ' + 'directory, but you cannot call methods that use non-keyframe sample_datas.' + % sweeps_dir) + + # ### List methods. ### + + def list_attributes(self, sort_by: str = 'freq') -> None: + """ + List all attributes and the number of annotations with each attribute. + :param sort_by: Sorting criteria, e.g. "name", "freq". + """ + # Preload data if in lazy load to avoid confusing outputs. + if self.lazy: + self.load_tables(['attribute', 'object_ann']) + + # Count attributes. + attribute_freqs = defaultdict(lambda: 0) + for object_ann in self.object_ann: + for attribute_token in object_ann['attribute_tokens']: + attribute_freqs[attribute_token] += 1 + + # Sort entries. + if sort_by == 'name': + sort_order = [i for (i, _) in sorted(enumerate(self.attribute), key=lambda x: x[1]['name'])] + elif sort_by == 'freq': + attribute_freqs_order = [attribute_freqs[c['token']] for c in self.attribute] + sort_order = [i for (i, _) in + sorted(enumerate(attribute_freqs_order), key=lambda x: x[1], reverse=True)] + else: + raise Exception('Error: Invalid sorting criterion %s!' % sort_by) + + # Print to stdout. + format_str = '{:11} {:24.24} {:48.48}' + print() + print(format_str.format('Annotations', 'Name', 'Description')) + for s in sort_order: + attribute = self.attribute[s] + print(format_str.format( + attribute_freqs[attribute['token']], attribute['name'], attribute['description'])) + + def list_cameras(self) -> None: + """ + List all cameras and the number of samples for each. + """ + # Preload data if in lazy load to avoid confusing outputs. + if self.lazy: + self.load_tables(['sample', 'sample_data', 'calibrated_sensor', 'sensor']) + + # Count cameras. + cs_freqs = defaultdict(lambda: 0) + channel_freqs = defaultdict(lambda: 0) + for calibrated_sensor in self.calibrated_sensor: + sensor = self.get('sensor', calibrated_sensor['sensor_token']) + cs_freqs[sensor['channel']] += 1 + for sample_data in self.sample_data: + if sample_data['is_key_frame']: # Only use keyframes (samples). + sensor = self.shortcut('sample_data', 'sensor', sample_data['token']) + channel_freqs[sensor['channel']] += 1 + + # Print to stdout. + format_str = '{:15} {:7} {:25}' + print() + print(format_str.format('Calibr. sensors', 'Samples', 'Channel')) + for channel in cs_freqs.keys(): + cs_freq = cs_freqs[channel] + channel_freq = channel_freqs[channel] + print(format_str.format( + cs_freq, channel_freq, channel)) + + def list_categories(self, sample_tokens: List[str] = None, sort_by: str = 'object_freq') -> None: + """ + List all categories and the number of object_anns and surface_anns for them. + :param sample_tokens: A list of sample tokens for which category stats will be shown. + :param sort_by: Sorting criteria, e.g. "name", "object_freq", "surface_freq". + """ + # Preload data if in lazy load to avoid confusing outputs. 
+ if self.lazy: + self.load_tables(['sample', 'object_ann', 'surface_ann', 'category']) + + # Count object_anns and surface_anns. + object_freqs = defaultdict(lambda: 0) + surface_freqs = defaultdict(lambda: 0) + if sample_tokens is not None: + sample_tokens = set(sample_tokens) + + for object_ann in self.object_ann: + sample = self.shortcut('object_ann', 'sample', object_ann['token']) + if sample_tokens is None or sample['token'] in sample_tokens: + object_freqs[object_ann['category_token']] += 1 + + for surface_ann in self.surface_ann: + sample = self.shortcut('surface_ann', 'sample', surface_ann['token']) + if sample_tokens is None or sample['token'] in sample_tokens: + surface_freqs[surface_ann['category_token']] += 1 + + # Sort entries. + if sort_by == 'name': + sort_order = [i for (i, _) in sorted(enumerate(self.category), key=lambda x: x[1]['name'])] + elif sort_by == 'object_freq': + object_freqs_order = [object_freqs[c['token']] for c in self.category] + sort_order = [i for (i, _) in sorted(enumerate(object_freqs_order), key=lambda x: x[1], reverse=True)] + elif sort_by == 'surface_freq': + surface_freqs_order = [surface_freqs[c['token']] for c in self.category] + sort_order = [i for (i, _) in sorted(enumerate(surface_freqs_order), key=lambda x: x[1], reverse=True)] + else: + raise Exception('Error: Invalid sorting criterion %s!' % sort_by) + + # Print to stdout. + format_str = '{:11} {:12} {:24.24} {:48.48}' + print() + print(format_str.format('Object_anns', 'Surface_anns', 'Name', 'Description')) + for s in sort_order: + category = self.category[s] + category_token = category['token'] + object_freq = object_freqs[category_token] + surface_freq = surface_freqs[category_token] + + # Skip empty categories. + if object_freq == 0 and surface_freq == 0: + continue + + name = category['name'] + description = category['description'] + print(format_str.format( + object_freq, surface_freq, name, description)) + + def list_anns(self, sample_token: str, verbose: bool = True) -> Tuple[List[str], List[str]]: + """ + List all the annotations of a sample. + :param sample_token: Sample token. + :param verbose: Whether to print to stdout. + :return: The object and surface annotation tokens in this sample. + """ + # Preload data if in lazy load to avoid confusing outputs. + if self.lazy: + self.load_tables(['sample', 'object_ann', 'surface_ann', 'category']) + + sample = self.get('sample', sample_token) + key_camera_token = sample['key_camera_token'] + object_anns = [o for o in self.object_ann if o['sample_data_token'] == key_camera_token] + surface_anns = [o for o in self.surface_ann if o['sample_data_token'] == key_camera_token] + + if verbose: + print('Printing object annotations:') + for object_ann in object_anns: + category = self.get('category', object_ann['category_token']) + attribute_names = [self.get('attribute', at)['name'] for at in object_ann['attribute_tokens']] + print('{} {} {}'.format(object_ann['token'], category['name'], attribute_names)) + + print('\nPrinting surface annotations:') + for surface_ann in surface_anns: + category = self.get('category', surface_ann['category_token']) + print(surface_ann['token'], category['name']) + + object_tokens = [o['token'] for o in object_anns] + surface_tokens = [s['token'] for s in surface_anns] + return object_tokens, surface_tokens + + def list_logs(self) -> None: + """ + List all logs and the number of samples per log. + """ + # Preload data if in lazy load to avoid confusing outputs. 
+ if self.lazy: + self.load_tables(['sample', 'log']) + + # Count samples. + sample_freqs = defaultdict(lambda: 0) + for sample in self.sample: + sample_freqs[sample['log_token']] += 1 + + # Print to stdout. + format_str = '{:6} {:29} {:24}' + print() + print(format_str.format('Samples', 'Log', 'Location')) + for log in self.log: + sample_freq = sample_freqs[log['token']] + logfile = log['logfile'] + location = log['location'] + print(format_str.format( + sample_freq, logfile, location)) + + def list_sample_content(self, sample_token: str) -> None: + """ + List the sample_datas for a given sample. + :param sample_token: Sample token. + """ + # Preload data if in lazy load to avoid confusing outputs. + if self.lazy: + self.load_tables(['sample', 'sample_data']) + + # Print content for each modality. + sample = self.get('sample', sample_token) + sample_data_tokens = self.get_sample_content(sample_token) + timestamps = np.array([self.get('sample_data', sd_token)['timestamp'] for sd_token in sample_data_tokens]) + rel_times = (timestamps - sample['timestamp']) / 1e6 + + print('\nListing sample content...') + print('Rel. time\tSample_data token') + for rel_time, sample_data_token in zip(rel_times, sample_data_tokens): + print('{:>9.1f}\t{}'.format(rel_time, sample_data_token)) + + def list_sample_data_histogram(self) -> None: + """ + Show a histogram of the number of sample_datas per sample. + """ + # Preload data if in lazy load to avoid confusing outputs. + if self.lazy: + self.load_tables(['sample_data']) + + # Count sample_datas for each sample. + sample_counts = defaultdict(lambda: 0) + for sample_data in self.sample_data: + sample_counts[sample_data['sample_token']] += 1 + + # Compute histogram. + sample_counts_list = np.array(list(sample_counts.values())) + bin_range = np.max(sample_counts_list) - np.min(sample_counts_list) + if bin_range == 0: + values = [len(sample_counts_list)] + freqs = [sample_counts_list[0]] + else: + values, bins = np.histogram(sample_counts_list, bin_range) + freqs = bins[1:] # To get the frequency we need to use the right side of the bin. + + # Print statistics. + print('\nListing sample_data frequencies..') + print('# images\t# samples') + for freq, val in zip(freqs, values): + print('{:>8d}\t{:d}'.format(int(freq), int(val))) + + # ### Getter methods. ### + + def get_sample_content(self, + sample_token: str) -> List[str]: + """ + For a given sample, return all the sample_datas in chronological order. + :param sample_token: Sample token. + :return: A list of sample_data tokens sorted by their timestamp. + """ + sample = self.get('sample', sample_token) + key_sd = self.get('sample_data', sample['key_camera_token']) + + # Go forward. + cur_sd = key_sd + forward = [] + while cur_sd['next'] != '': + cur_sd = self.get('sample_data', cur_sd['next']) + forward.append(cur_sd['token']) + + # Go backward. + cur_sd = key_sd + backward = [] + while cur_sd['prev'] != '': + cur_sd = self.get('sample_data', cur_sd['prev']) + backward.append(cur_sd['token']) + + # Combine. + result = backward[::-1] + [key_sd['token']] + forward + + return result + + def get_ego_pose_data(self, + sample_token: str, + attribute_name: str = 'translation') -> Tuple[np.ndarray, np.ndarray]: + """ + Return the ego pose data of the <= 13 sample_datas associated with this sample. + The method return translation, rotation, rotation_rate, acceleration and speed. + :param sample_token: Sample token. + :param attribute_name: The ego_pose field to extract, e.g. "translation", "acceleration" or "speed". 
+ :return: ( + timestamps: The timestamp of each ego_pose. + attributes: A matrix with sample_datas x len(attribute) number of fields. + ) + """ + assert attribute_name in ['translation', 'rotation', 'rotation_rate', 'acceleration', 'speed'], \ + 'Error: The attribute_name %s is not a valid option!' % attribute_name + + if attribute_name == 'speed': + attribute_len = 1 + elif attribute_name == 'rotation': + attribute_len = 4 + else: + attribute_len = 3 + + sd_tokens = self.get_sample_content(sample_token) + attributes = np.zeros((len(sd_tokens), attribute_len)) + timestamps = np.zeros((len(sd_tokens))) + for i, sd_token in enumerate(sd_tokens): + # Get attribute. + sample_data = self.get('sample_data', sd_token) + ego_pose = self.get('ego_pose', sample_data['ego_pose_token']) + attribute = ego_pose[attribute_name] + + # Store results. + attributes[i, :] = attribute + timestamps[i] = ego_pose['timestamp'] + + return timestamps, attributes + + def get_trajectory(self, + sample_token: str, + rotation_yaw: float = 0.0, + center_key_pose: bool = True) -> Tuple[np.ndarray, int]: + """ + Get the trajectory of the ego vehicle and optionally rotate and center it. + :param sample_token: Sample token. + :param rotation_yaw: Rotation of the ego vehicle in the plot. + Set to None to use lat/lon coordinates. + Set to 0 to point in the driving direction at the time of the keyframe. + Set to any other value to rotate relative to the driving direction (in radians). + :param center_key_pose: Whether to center the trajectory on the key pose. + :return: ( + translations: A matrix with sample_datas x 3 values of the translations at each timestamp. + key_index: The index of the translations corresponding to the keyframe (usually 6). + ) + """ + # Get trajectory data. + timestamps, translations = self.get_ego_pose_data(sample_token) + + # Find keyframe translation and rotation. + sample = self.get('sample', sample_token) + sample_data = self.get('sample_data', sample['key_camera_token']) + ego_pose = self.get('ego_pose', sample_data['ego_pose_token']) + key_rotation = Quaternion(ego_pose['rotation']) + key_timestamp = ego_pose['timestamp'] + key_index = [i for i, t in enumerate(timestamps) if t == key_timestamp][0] + + # Rotate points such that the initial driving direction points upwards. + if rotation_yaw is not None: + rotation = key_rotation.inverse * Quaternion(axis=[0, 0, 1], angle=np.pi / 2 - rotation_yaw) + translations = np.dot(rotation.rotation_matrix, translations.T).T + + # Subtract origin to have lower numbers on the axes. + if center_key_pose: + translations -= translations[key_index, :] + + return translations, key_index + + def get_segmentation(self, + sd_token: str) -> Tuple[np.ndarray, np.ndarray]: + """ + Produces two segmentation masks as numpy arrays of size H x W each, where H and W are the height and width + of the camera image respectively: + - semantic mask: A mask in which each pixel is an integer value between 0 to C (inclusive), + where C is the number of categories in nuImages. Each integer corresponds to + the index of the class in the category.json. + - instance mask: A mask in which each pixel is an integer value between 0 to N, where N is the + number of objects in a given camera sample_data. Each integer corresponds to + the order in which the object was drawn into the mask. + :param sd_token: The token of the sample_data to be rendered. + :return: Two 2D numpy arrays (one semantic mask , and one instance mask ). + """ + # Validate inputs. 
+ sample_data = self.get('sample_data', sd_token) + assert sample_data['is_key_frame'], 'Error: Cannot render annotations for non keyframes!' + + name_to_index = name_to_index_mapping(self.category) + + # Get image data. + self.check_sweeps(sample_data['filename']) + im_path = osp.join(self.dataroot, sample_data['filename']) + im = Image.open(im_path) + + (width, height) = im.size + semseg_mask = np.zeros((height, width)).astype('int32') + instanceseg_mask = np.zeros((height, width)).astype('int32') + + # Load stuff / surface regions. + surface_anns = [o for o in self.surface_ann if o['sample_data_token'] == sd_token] + + # Draw stuff / surface regions. + for ann in surface_anns: + # Get color and mask. + category_token = ann['category_token'] + category_name = self.get('category', category_token)['name'] + if ann['mask'] is None: + continue + mask = mask_decode(ann['mask']) + + # Draw mask for semantic segmentation. + semseg_mask[mask == 1] = name_to_index[category_name] + + # Load object instances. + object_anns = [o for o in self.object_ann if o['sample_data_token'] == sd_token] + + # Sort by token to ensure that objects always appear in the instance mask in the same order. + object_anns = sorted(object_anns, key=lambda k: k['token']) + + # Draw object instances. + # The 0 index is reserved for background; thus, the instances should start from index 1. + for i, ann in enumerate(object_anns, start=1): + # Get color, box, mask and name. + category_token = ann['category_token'] + category_name = self.get('category', category_token)['name'] + if ann['mask'] is None: + continue + mask = mask_decode(ann['mask']) + + # Draw masks for semantic segmentation and instance segmentation. + semseg_mask[mask == 1] = name_to_index[category_name] + instanceseg_mask[mask == 1] = i + + # Ensure that the number of instances in the instance segmentation mask is the same as the number of objects. + assert len(object_anns) == np.max(instanceseg_mask), \ + 'Error: There are {} objects but only {} instances ' \ + 'were drawn into the instance segmentation mask.'.format(len(object_anns), np.max(instanceseg_mask)) + + return semseg_mask, instanceseg_mask + + # ### Rendering methods. ### + + def render_image(self, + sd_token: str, + annotation_type: str = 'all', + with_category: bool = False, + with_attributes: bool = False, + object_tokens: List[str] = None, + surface_tokens: List[str] = None, + render_scale: float = 1.0, + box_line_width: int = -1, + font_size: int = None, + out_path: str = None) -> None: + """ + Renders an image (sample_data), optionally with annotations overlaid. + :param sd_token: The token of the sample_data to be rendered. + :param annotation_type: The types of annotations to draw on the image; there are four options: + 'all': Draw surfaces and objects, subject to any filtering done by object_tokens and surface_tokens. + 'surfaces': Draw only surfaces, subject to any filtering done by surface_tokens. + 'objects': Draw objects, subject to any filtering done by object_tokens. + 'none': Neither surfaces nor objects will be drawn. + :param with_category: Whether to include the category name at the top of a box. + :param with_attributes: Whether to include attributes in the label tags. Note that with_attributes=True + will only work if with_category=True. + :param object_tokens: List of object annotation tokens. If given, only these annotations are drawn. + :param surface_tokens: List of surface annotation tokens. If given, only these annotations are drawn. 
+ :param render_scale: The scale at which the image will be rendered. Use 1.0 for the original image size. + :param box_line_width: The box line width in pixels. The default is -1. + If set to -1, box_line_width equals render_scale (rounded) to be larger in larger images. + :param font_size: Size of the text in the rendered image. Use None for the default size. + :param out_path: The path where we save the rendered image, or otherwise None. + If a path is provided, the plot is not shown to the user. + """ + # Validate inputs. + sample_data = self.get('sample_data', sd_token) + if not sample_data['is_key_frame']: + assert annotation_type != 'none', 'Error: Cannot render annotations for non keyframes!' + assert not with_attributes, 'Error: Cannot render attributes for non keyframes!' + if with_attributes: + assert with_category, 'In order to set with_attributes=True, with_category must be True.' + assert type(box_line_width) == int, 'Error: box_line_width must be an integer!' + if box_line_width == -1: + box_line_width = int(round(render_scale)) + + # Get image data. + self.check_sweeps(sample_data['filename']) + im_path = osp.join(self.dataroot, sample_data['filename']) + im = Image.open(im_path) + + # Initialize drawing. + if with_category and font_size is not None: + font = get_font(font_size=font_size) + else: + font = None + im = im.convert('RGBA') + draw = ImageDraw.Draw(im, 'RGBA') + + annotations_types = ['all', 'surfaces', 'objects', 'none'] + assert annotation_type in annotations_types, \ + 'Error: {} is not a valid option for annotation_type. ' \ + 'Only {} are allowed.'.format(annotation_type, annotations_types) + if annotation_type is not 'none': + if annotation_type == 'all' or annotation_type == 'surfaces': + # Load stuff / surface regions. + surface_anns = [o for o in self.surface_ann if o['sample_data_token'] == sd_token] + if surface_tokens is not None: + sd_surface_tokens = set([s['token'] for s in surface_anns if s['token']]) + assert set(surface_tokens).issubset(sd_surface_tokens), \ + 'Error: The provided surface_tokens do not belong to the sd_token!' + surface_anns = [o for o in surface_anns if o['token'] in surface_tokens] + + # Draw stuff / surface regions. + for ann in surface_anns: + # Get color and mask. + category_token = ann['category_token'] + category_name = self.get('category', category_token)['name'] + color = self.color_map[category_name] + if ann['mask'] is None: + continue + mask = mask_decode(ann['mask']) + + # Draw mask. The label is obvious from the color. + draw.bitmap((0, 0), Image.fromarray(mask * 128), fill=tuple(color + (128,))) + + if annotation_type == 'all' or annotation_type == 'objects': + # Load object instances. + object_anns = [o for o in self.object_ann if o['sample_data_token'] == sd_token] + if object_tokens is not None: + sd_object_tokens = set([o['token'] for o in object_anns if o['token']]) + assert set(object_tokens).issubset(sd_object_tokens), \ + 'Error: The provided object_tokens do not belong to the sd_token!' + object_anns = [o for o in object_anns if o['token'] in object_tokens] + + # Draw object instances. + for ann in object_anns: + # Get color, box, mask and name. 
+ category_token = ann['category_token'] + category_name = self.get('category', category_token)['name'] + color = self.color_map[category_name] + bbox = ann['bbox'] + attr_tokens = ann['attribute_tokens'] + attributes = [self.get('attribute', at) for at in attr_tokens] + name = annotation_name(attributes, category_name, with_attributes=with_attributes) + if ann['mask'] is not None: + mask = mask_decode(ann['mask']) + + # Draw mask, rectangle and text. + draw.bitmap((0, 0), Image.fromarray(mask * 128), fill=tuple(color + (128,))) + draw.rectangle(bbox, outline=color, width=box_line_width) + if with_category: + draw.text((bbox[0], bbox[1]), name, font=font) + + # Plot the image. + (width, height) = im.size + pix_to_inch = 100 / render_scale + figsize = (height / pix_to_inch, width / pix_to_inch) + plt.figure(figsize=figsize) + plt.axis('off') + plt.imshow(im) + + # Save to disk. + if out_path is not None: + plt.savefig(out_path, bbox_inches='tight', dpi=2.295 * pix_to_inch, pad_inches=0) + plt.close() + + def render_trajectory(self, + sample_token: str, + rotation_yaw: float = 0.0, + center_key_pose: bool = True, + out_path: str = None) -> None: + """ + Render a plot of the trajectory for the clip surrounding the annotated keyframe. + A red cross indicates the starting point, a green dot the ego pose of the annotated keyframe. + :param sample_token: Sample token. + :param rotation_yaw: Rotation of the ego vehicle in the plot. + Set to None to use lat/lon coordinates. + Set to 0 to point in the driving direction at the time of the keyframe. + Set to any other value to rotate relative to the driving direction (in radians). + :param center_key_pose: Whether to center the trajectory on the key pose. + :param out_path: Optional path to save the rendered figure to disk. + If a path is provided, the plot is not shown to the user. + """ + # Get the translations or poses. + translations, key_index = self.get_trajectory(sample_token, rotation_yaw=rotation_yaw, + center_key_pose=center_key_pose) + + # Render translations. + plt.figure() + plt.plot(translations[:, 0], translations[:, 1]) + plt.plot(translations[key_index, 0], translations[key_index, 1], 'go', markersize=10) # Key image. + plt.plot(translations[0, 0], translations[0, 1], 'rx', markersize=10) # Start point. + max_dist = translations - translations[key_index, :] + max_dist = np.ceil(np.max(np.abs(max_dist)) * 1.05) # Leave some margin. + max_dist = np.maximum(10, max_dist) + plt.xlim([translations[key_index, 0] - max_dist, translations[key_index, 0] + max_dist]) + plt.ylim([translations[key_index, 1] - max_dist, translations[key_index, 1] + max_dist]) + plt.xlabel('x in meters') + plt.ylabel('y in meters') + + # Save to disk. + if out_path is not None: + plt.savefig(out_path, bbox_inches='tight', dpi=150, pad_inches=0) + plt.close() diff --git a/python-sdk/nuimages/scripts/render_images.py b/python-sdk/nuimages/scripts/render_images.py new file mode 100644 index 00000000..3f229136 --- /dev/null +++ b/python-sdk/nuimages/scripts/render_images.py @@ -0,0 +1,227 @@ +# nuScenes dev-kit. +# Code written by Holger Caesar, 2020. 
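+#
+# Example invocation (a hedged sketch; the flags map to the argparse options defined at the bottom of this file
+# and the paths shown are only the defaults):
+#
+#     python render_images.py --version v1.0-mini --dataroot /data/sets/nuimages \
+#         --mode annotated --out_type image --out_dir ~/Downloads/nuImages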
+
+import argparse
+import gc
+import os
+import random
+from typing import List
+from collections import defaultdict
+
+import cv2
+import tqdm
+
+from nuimages.nuimages import NuImages
+
+
+def render_images(nuim: NuImages,
+                  mode: str = 'all',
+                  cam_name: str = None,
+                  log_name: str = None,
+                  sample_limit: int = 50,
+                  filter_categories: List[str] = None,
+                  out_type: str = 'image',
+                  out_dir: str = '~/Downloads/nuImages',
+                  cleanup: bool = True) -> None:
+    """
+    Render a random selection of images and save them to disk.
+    Note: The images rendered here are keyframes only.
+    :param nuim: NuImages instance.
+    :param mode: What to render:
+        "image" for the image without annotations,
+        "annotated" for the image with annotations,
+        "trajectory" for a rendering of the trajectory of the vehicle,
+        "all" to render all of the above separately.
+    :param cam_name: Only render images from a particular camera, e.g. "CAM_BACK".
+    :param log_name: Only render images from a particular log, e.g. "n013-2018-09-04-13-30-50+0800".
+    :param sample_limit: Maximum number of samples (images) to render. Note that the mini split only includes 50 images.
+    :param filter_categories: Specify a list of object_ann category names. Every sample that is rendered must
+        contain annotations of any of those categories.
+    :param out_type: The output type as one of the following:
+        'image': Renders a single image for the image keyframe of each sample.
+        'video': Renders a video for all images/pcls in the clip associated with each sample.
+    :param out_dir: Folder to render the images to.
+    :param cleanup: Whether to delete images after rendering the video. Not relevant for out_type == 'image'.
+    """
+    # Check and convert inputs.
+    assert out_type in ['image', 'video'], 'Error: Unknown out_type %s!' % out_type
+    all_modes = ['image', 'annotated', 'trajectory']
+    assert mode in all_modes + ['all'], 'Error: Unknown mode %s!' % mode
+    assert not (out_type == 'video' and mode == 'trajectory'), 'Error: Cannot render "trajectory" for videos!'
+
+    if mode == 'all':
+        if out_type == 'image':
+            modes = all_modes
+        elif out_type == 'video':
+            modes = [m for m in all_modes if m not in ['annotated', 'trajectory']]
+        else:
+            raise Exception('Error: Unknown mode %s!' % mode)
+    else:
+        modes = [mode]
+
+    if filter_categories is not None:
+        category_names = [c['name'] for c in nuim.category]
+        for category_name in filter_categories:
+            assert category_name in category_names, 'Error: Invalid object_ann category %s!' % category_name
+
+    # Create output folder.
+    out_dir = os.path.expanduser(out_dir)
+    if not os.path.isdir(out_dir):
+        os.makedirs(out_dir)
+
+    # Filter by camera.
+    sample_tokens = [s['token'] for s in nuim.sample]
+    if cam_name is not None:
+        sample_tokens_cam = []
+        for sample_token in sample_tokens:
+            sample = nuim.get('sample', sample_token)
+            key_camera_token = sample['key_camera_token']
+            sensor = nuim.shortcut('sample_data', 'sensor', key_camera_token)
+            if sensor['channel'] == cam_name:
+                sample_tokens_cam.append(sample_token)
+        sample_tokens = sample_tokens_cam
+
+    # Filter by log.
+    if log_name is not None:
+        sample_tokens_cleaned = []
+        for sample_token in sample_tokens:
+            sample = nuim.get('sample', sample_token)
+            log = nuim.get('log', sample['log_token'])
+            if log['logfile'] == log_name:
+                sample_tokens_cleaned.append(sample_token)
+        sample_tokens = sample_tokens_cleaned
+
+    # Filter samples by category.
+    if filter_categories is not None:
+        # Get categories in each sample.
+ sd_to_object_cat_names = defaultdict(lambda: set()) + for object_ann in nuim.object_ann: + category = nuim.get('category', object_ann['category_token']) + sd_to_object_cat_names[object_ann['sample_data_token']].add(category['name']) + + # Filter samples. + sample_tokens_cleaned = [] + for sample_token in sample_tokens: + sample = nuim.get('sample', sample_token) + key_camera_token = sample['key_camera_token'] + category_names = sd_to_object_cat_names[key_camera_token] + if any([c in category_names for c in filter_categories]): + sample_tokens_cleaned.append(sample_token) + sample_tokens = sample_tokens_cleaned + + # Get a random selection of samples. + random.shuffle(sample_tokens) + + # Limit number of samples. + sample_tokens = sample_tokens[:sample_limit] + + print('Rendering %s for mode %s to folder %s...' % (out_type, mode, out_dir)) + for sample_token in tqdm.tqdm(sample_tokens): + sample = nuim.get('sample', sample_token) + log = nuim.get('log', sample['log_token']) + log_name = log['logfile'] + key_camera_token = sample['key_camera_token'] + sensor = nuim.shortcut('sample_data', 'sensor', key_camera_token) + sample_cam_name = sensor['channel'] + sd_tokens = nuim.get_sample_content(sample_token) + + # We cannot render a video if there are missing camera sample_datas. + if len(sd_tokens) < 13 and out_type == 'video': + print('Warning: Skipping video for sample token %s, as not all 13 frames exist!' % sample_token) + continue + + for mode in modes: + out_path_prefix = os.path.join(out_dir, '%s_%s_%s_%s' % (log_name, sample_token, sample_cam_name, mode)) + if out_type == 'image': + write_image(nuim, key_camera_token, mode, '%s.jpg' % out_path_prefix) + elif out_type == 'video': + write_video(nuim, sd_tokens, mode, out_path_prefix, cleanup=cleanup) + + +def write_video(nuim: NuImages, + sd_tokens: List[str], + mode: str, + out_path_prefix: str, + cleanup: bool = True) -> None: + """ + Render a video by combining all the images of type mode for each sample_data. + :param nuim: NuImages instance. + :param sd_tokens: All sample_data tokens in chronological order. + :param mode: The mode - see render_images(). + :param out_path_prefix: The file prefix used for the images and video. + :param cleanup: Whether to delete images after rendering the video. + """ + # Loop through each frame to create the video. + out_paths = [] + for i, sd_token in enumerate(sd_tokens): + out_path = '%s_%d.jpg' % (out_path_prefix, i) + out_paths.append(out_path) + write_image(nuim, sd_token, mode, out_path) + + # Create video. + first_im = cv2.imread(out_paths[0]) + freq = 2 # Display frequency (Hz). + fourcc = cv2.VideoWriter_fourcc(*'MJPG') + video_path = '%s.avi' % out_path_prefix + out = cv2.VideoWriter(video_path, fourcc, freq, first_im.shape[1::-1]) + + # Load each image and add to the video. + for out_path in out_paths: + im = cv2.imread(out_path) + out.write(im) + + # Delete temporary image if requested. + if cleanup: + os.remove(out_path) + + # Finalize video. + out.release() + + +def write_image(nuim: NuImages, sd_token: str, mode: str, out_path: str) -> None: + """ + Render a single image of type mode for the given sample_data. + :param nuim: NuImages instance. + :param sd_token: The sample_data token. + :param mode: The mode - see render_images(). + :param out_path: The file to write the image to. 
+    """
+    if mode == 'annotated':
+        nuim.render_image(sd_token, annotation_type='all', out_path=out_path)
+    elif mode == 'image':
+        nuim.render_image(sd_token, annotation_type='none', out_path=out_path)
+    elif mode == 'trajectory':
+        sample_data = nuim.get('sample_data', sd_token)
+        nuim.render_trajectory(sample_data['sample_token'], out_path=out_path)
+    else:
+        raise Exception('Error: Unknown mode %s!' % mode)
+
+    # Trigger garbage collection to avoid memory overflow from the render functions.
+    gc.collect()
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description='Render a random selection of images and save them to disk.')
+    parser.add_argument('--seed', type=int, default=42)  # Set to 0 to disable.
+    parser.add_argument('--version', type=str, default='v1.0-mini')
+    parser.add_argument('--dataroot', type=str, default='/data/sets/nuimages')
+    parser.add_argument('--verbose', type=int, default=1)
+    parser.add_argument('--mode', type=str, default='all')
+    parser.add_argument('--cam_name', type=str, default=None)
+    parser.add_argument('--log_name', type=str, default=None)
+    parser.add_argument('--sample_limit', type=int, default=50)
+    parser.add_argument('--filter_categories', action='append')
+    parser.add_argument('--out_type', type=str, default='image')
+    parser.add_argument('--out_dir', type=str, default='~/Downloads/nuImages')
+    args = parser.parse_args()
+
+    # Set random seed for reproducible image selection.
+    if args.seed != 0:
+        random.seed(args.seed)
+
+    # Initialize NuImages class.
+    nuim_ = NuImages(version=args.version, dataroot=args.dataroot, verbose=bool(args.verbose), lazy=False)
+
+    # Render images.
+    render_images(nuim_, mode=args.mode, cam_name=args.cam_name, log_name=args.log_name, sample_limit=args.sample_limit,
+                  filter_categories=args.filter_categories, out_type=args.out_type, out_dir=args.out_dir)
diff --git a/python-sdk/nuimages/scripts/render_rare_classes.py b/python-sdk/nuimages/scripts/render_rare_classes.py
new file mode 100644
index 00000000..c09dcf1d
--- /dev/null
+++ b/python-sdk/nuimages/scripts/render_rare_classes.py
@@ -0,0 +1,86 @@
+# nuScenes dev-kit.
+# Code written by Holger Caesar, 2020.
+
+import argparse
+import random
+from collections import defaultdict
+from typing import Dict, Any, List
+
+from nuimages.nuimages import NuImages
+from nuimages.scripts.render_images import render_images
+
+
+def render_rare_classes(nuim: NuImages,
+                        render_args: Dict[str, Any],
+                        filter_categories: List[str] = None,
+                        max_frequency: float = 0.1) -> None:
+    """
+    Wrapper around render_images() that renders images with rare classes.
+    :param nuim: NuImages instance.
+    :param render_args: The render arguments passed on to the render function. See render_images().
+    :param filter_categories: Specify a list of object_ann category names.
+        Every sample that is rendered must contain annotations of any of those categories.
+        The filter_categories argument is applied on top of the frequency filtering.
+    :param max_frequency: The maximum relative frequency of the categories, at least one of which is required to be
+        present in the image. E.g. 0.1 indicates that one of the classes that account for at most 10% of the annotations
+        is present.
+    """
+    # Checks.
+    assert 'filter_categories' not in render_args.keys(), \
+        'Error: filter_categories is a separate argument and should not be part of render_args!'
+    assert 0 <= max_frequency <= 1, 'Error: max_frequency must be a ratio between 0 and 1!'
+
+    # Compute object class frequencies.
+ object_freqs = defaultdict(lambda: 0) + for object_ann in nuim.object_ann: + category = nuim.get('category', object_ann['category_token']) + object_freqs[category['name']] += 1 + + # Find rare classes. + total_freqs = len(nuim.object_ann) + filter_categories_freq = sorted([k for (k, v) in object_freqs.items() if v / total_freqs <= max_frequency]) + assert len(filter_categories_freq) > 0, 'Error: No classes found with the specified max_frequency!' + print('The rare classes are: %s' % filter_categories_freq) + + # If specified, additionally filter these categories by what was requested. + if filter_categories is None: + filter_categories = filter_categories_freq + else: + filter_categories = list(set(filter_categories_freq).intersection(set(filter_categories))) + assert len(filter_categories) > 0, 'Error: No categories left after applying filter_categories!' + + # Call render function. + render_images(nuim, filter_categories=filter_categories, **render_args) + + +if __name__ == '__main__': + parser = argparse.ArgumentParser(description='Render a random selection of images and save them to disk.') + parser.add_argument('--seed', type=int, default=42) # Set to 0 to disable. + parser.add_argument('--version', type=str, default='v1.0-mini') + parser.add_argument('--dataroot', type=str, default='/data/sets/nuimages') + parser.add_argument('--verbose', type=int, default=1) + parser.add_argument('--mode', type=str, default='all') + parser.add_argument('--cam_name', type=str, default=None) + parser.add_argument('--sample_limit', type=int, default=100) + parser.add_argument('--max_frequency', type=float, default=0.1) + parser.add_argument('--filter_categories', action='append') + parser.add_argument('--out_type', type=str, default='image') + parser.add_argument('--out_dir', type=str, default='~/Downloads/nuImages') + args = parser.parse_args() + + # Set random seed for reproducible image selection. + if args.seed != 0: + random.seed(args.seed) + + # Initialize NuImages class. + nuim_ = NuImages(version=args.version, dataroot=args.dataroot, verbose=bool(args.verbose), lazy=False) + + # Render images. + _render_args = { + 'mode': args.mode, + 'cam_name': args.cam_name, + 'sample_limit': args.sample_limit, + 'out_type': args.out_type, + 'out_dir': args.out_dir + } + render_rare_classes(nuim_, _render_args, filter_categories=args.filter_categories, max_frequency=args.max_frequency) diff --git a/python-sdk/nuimages/tests/__init__.py b/python-sdk/nuimages/tests/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/python-sdk/nuimages/tests/assert_download.py b/python-sdk/nuimages/tests/assert_download.py new file mode 100644 index 00000000..7dbb3d84 --- /dev/null +++ b/python-sdk/nuimages/tests/assert_download.py @@ -0,0 +1,46 @@ +# nuScenes dev-kit. +# Code written by Holger Caesar, 2020. + +import argparse +import os + +from tqdm import tqdm + +from nuimages import NuImages + + +def verify_setup(nuim: NuImages): + """ + Script to verify that the nuImages installation is complete. + Note that this may take several minutes or hours. + """ + + # Check that each sample_data file exists. + print('Checking that sample_data files are complete...') + for sd in tqdm(nuim.sample_data): + file_path = os.path.join(nuim.dataroot, sd['filename']) + assert os.path.exists(file_path), 'Error: Missing sample_data at: %s' % file_path + + +if __name__ == "__main__": + + # Settings. 
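+    # Example invocation (a hedged sketch; the dataroot and version shown are only the argparse defaults below):
+    #     python assert_download.py --dataroot /data/sets/nuimages --version v1.0-train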
+ parser = argparse.ArgumentParser(description='Test that the installed dataset is complete.', + formatter_class=argparse.ArgumentDefaultsHelpFormatter) + parser.add_argument('--dataroot', type=str, default='/data/sets/nuimages', + help='Default nuImages data directory.') + parser.add_argument('--version', type=str, default='v1.0-train', + help='Which version of the nuImages dataset to evaluate on, e.g. v1.0-train.') + parser.add_argument('--verbose', type=int, default=1, + help='Whether to print to stdout.') + + args = parser.parse_args() + dataroot = args.dataroot + version = args.version + verbose = bool(args.verbose) + + # Init. + nuim_ = NuImages(version=version, verbose=verbose, dataroot=dataroot) + + # Verify data blobs. + verify_setup(nuim_) diff --git a/python-sdk/nuimages/tests/test_attributes.py b/python-sdk/nuimages/tests/test_attributes.py new file mode 100644 index 00000000..264c933e --- /dev/null +++ b/python-sdk/nuimages/tests/test_attributes.py @@ -0,0 +1,115 @@ +# nuScenes dev-kit. +# Code written by Holger Caesar, 2020. + +import os +import unittest +from typing import Any + +from nuimages.nuimages import NuImages + + +class TestAttributes(unittest.TestCase): + + def __init__(self, _: Any = None, version: str = 'v1.0-mini', dataroot: str = None): + """ + Initialize TestAttributes. + Note: The second parameter is a dummy parameter required by the TestCase class. + :param version: The NuImages version. + :param dataroot: The root folder where the dataset is installed. + """ + super().__init__() + + self.version = version + if dataroot is None: + self.dataroot = os.environ['NUIMAGES'] + else: + self.dataroot = dataroot + self.nuim = NuImages(version=self.version, dataroot=self.dataroot, verbose=False) + self.valid_attributes = { + 'animal': ['pedestrian', 'vertical_position'], + 'human.pedestrian.adult': ['pedestrian'], + 'human.pedestrian.child': ['pedestrian'], + 'human.pedestrian.construction_worker': ['pedestrian'], + 'human.pedestrian.personal_mobility': ['cycle'], + 'human.pedestrian.police_officer': ['pedestrian'], + 'human.pedestrian.stroller': [], + 'human.pedestrian.wheelchair': [], + 'movable_object.barrier': [], + 'movable_object.debris': [], + 'movable_object.pushable_pullable': [], + 'movable_object.trafficcone': [], + 'static_object.bicycle_rack': [], + 'vehicle.bicycle': ['cycle'], + 'vehicle.bus.bendy': ['vehicle'], + 'vehicle.bus.rigid': ['vehicle'], + 'vehicle.car': ['vehicle'], + 'vehicle.construction': ['vehicle'], + 'vehicle.ego': [], + 'vehicle.emergency.ambulance': ['vehicle', 'vehicle_light.emergency'], + 'vehicle.emergency.police': ['vehicle', 'vehicle_light.emergency'], + 'vehicle.motorcycle': ['cycle'], + 'vehicle.trailer': ['vehicle'], + 'vehicle.truck': ['vehicle'] + } + + def runTest(self) -> None: + """ + Dummy function required by the TestCase class. + """ + pass + + def test_object_anns(self, print_only: bool = False) -> None: + """ + For every object_ann, check that all the required attributes for that class are present. + :param print_only: Whether to throw assertion errors or just print a warning message. + """ + att_token_to_name = {att['token']: att['name'] for att in self.nuim.attribute} + cat_token_to_name = {cat['token']: cat['name'] for cat in self.nuim.category} + for object_ann in self.nuim.object_ann: + # Collect the attribute names used here. 
+ category_name = cat_token_to_name[object_ann['category_token']] + sample_token = self.nuim.get('sample_data', object_ann['sample_data_token'])['sample_token'] + + cur_att_names = [] + for attribute_token in object_ann['attribute_tokens']: + attribute_name = att_token_to_name[attribute_token] + cur_att_names.append(attribute_name) + + # Compare to the required attribute name prefixes. + # Check that the length is correct. + required_att_names = self.valid_attributes[category_name] + condition = len(cur_att_names) == len(required_att_names) + if not condition: + debug_output = { + 'sample_token': sample_token, + 'category_name': category_name, + 'cur_att_names': cur_att_names, + 'required_att_names': required_att_names + } + error_msg = 'Error: ' + str(debug_output) + if print_only: + print(error_msg) + else: + self.assertTrue(condition, error_msg) + + # Skip next check if we already saw an error. + continue + + # Check that they are really the same. + for required in required_att_names: + condition = any([cur.startswith(required + '.') for cur in cur_att_names]) + if not condition: + error_msg = 'Errors: Required attribute ''%s'' not in %s for class %s! (sample %s)' \ + % (required, cur_att_names, category_name, sample_token) + if print_only: + print(error_msg) + else: + self.assertTrue(condition, error_msg) + + +if __name__ == '__main__': + # Runs the tests without aborting on error. + for nuim_version in ['v1.0-train', 'v1.0-val', 'v1.0-test', 'v1.0-mini']: + print('Running TestAttributes for version %s...' % nuim_version) + test = TestAttributes(version=nuim_version) + test.test_object_anns(print_only=True) diff --git a/python-sdk/nuimages/tests/test_foreign_keys.py b/python-sdk/nuimages/tests/test_foreign_keys.py new file mode 100644 index 00000000..df9729a6 --- /dev/null +++ b/python-sdk/nuimages/tests/test_foreign_keys.py @@ -0,0 +1,147 @@ +# nuScenes dev-kit. +# Code written by Holger Caesar, 2020. + +import itertools +import os +import unittest +from collections import defaultdict +from typing import List, Dict, Any + +from nuimages.nuimages import NuImages + + +class TestForeignKeys(unittest.TestCase): + def __init__(self, _: Any = None, version: str = 'v1.0-mini', dataroot: str = None): + """ + Initialize TestForeignKeys. + Note: The second parameter is a dummy parameter required by the TestCase class. + :param version: The NuImages version. + :param dataroot: The root folder where the dataset is installed. + """ + super().__init__() + + self.version = version + if dataroot is None: + self.dataroot = os.environ['NUIMAGES'] + else: + self.dataroot = dataroot + self.nuim = NuImages(version=self.version, dataroot=self.dataroot, verbose=False) + + def runTest(self) -> None: + """ + Dummy function required by the TestCase class. + """ + pass + + def test_foreign_keys(self) -> None: + """ + Test that every foreign key points to a valid token. + """ + # Index the tokens of all tables. + index = dict() + for table_name in self.nuim.table_names: + print('Indexing table %s...' % table_name) + table: list = self.nuim.__getattr__(table_name) + tokens = [row['token'] for row in table] + index[table_name] = set(tokens) + + # Go through each table and check the foreign_keys. + for table_name in self.nuim.table_names: + table: List[Dict[str, Any]] = self.nuim.__getattr__(table_name) + if self.version.endswith('-test') and len(table) == 0: # Skip test annotations. + continue + keys = table[0].keys() + + # Check 1-to-1 link. 
+ one_to_one_names = [k for k in keys if k.endswith('_token') and not k.startswith('key_')] + for foreign_key_name in one_to_one_names: + print('Checking one-to-one key %s in table %s...' % (foreign_key_name, table_name)) + foreign_table_name = foreign_key_name.replace('_token', '') + foreign_tokens = set([row[foreign_key_name] for row in table]) + + # Check all tokens are valid. + if self.version.endswith('-mini') and foreign_table_name == 'category': + continue # Mini does not cover all categories. + foreign_index = index[foreign_table_name] + self.assertTrue(foreign_tokens.issubset(foreign_index)) + + # Check all tokens are covered. + # By default we check that all tokens are covered. Exceptions are listed below. + if table_name == 'object_ann': + if foreign_table_name == 'category': + remove = set([cat['token'] for cat in self.nuim.category if cat['name'] + in ['vehicle.ego', 'flat.driveable_surface']]) + foreign_index = foreign_index.difference(remove) + elif foreign_table_name == 'sample_data': + foreign_index = None # Skip as sample_datas may have no object_ann. + elif table_name == 'surface_ann': + if foreign_table_name == 'category': + remove = set([cat['token'] for cat in self.nuim.category if cat['name'] + not in ['vehicle.ego', 'flat.driveable_surface']]) + foreign_index = foreign_index.difference(remove) + elif foreign_table_name == 'sample_data': + foreign_index = None # Skip as sample_datas may have no surface_ann. + if foreign_index is not None: + self.assertEqual(foreign_tokens, foreign_index) + + # Check 1-to-many link. + one_to_many_names = [k for k in keys if k.endswith('_tokens')] + for foreign_key_name in one_to_many_names: + print('Checking one-to-many key %s in table %s...' % (foreign_key_name, table_name)) + foreign_table_name = foreign_key_name.replace('_tokens', '') + foreign_tokens_nested = [row[foreign_key_name] for row in table] + foreign_tokens = set(itertools.chain(*foreign_tokens_nested)) + + # Check that all tokens are valid. + foreign_index = index[foreign_table_name] + self.assertTrue(foreign_tokens.issubset(foreign_index)) + + # Check all tokens are covered. + if self.version.endswith('-mini') and foreign_table_name == 'attribute': + continue # Mini does not cover all categories. + if foreign_index is not None: + self.assertEqual(foreign_tokens, foreign_index) + + # Check prev and next. + prev_next_names = [k for k in keys if k in ['previous', 'next']] + for foreign_key_name in prev_next_names: + print('Checking prev-next key %s in table %s...' % (foreign_key_name, table_name)) + foreign_table_name = table_name + foreign_tokens = set([row[foreign_key_name] for row in table if len(row[foreign_key_name]) > 0]) + + # Check that all tokens are valid. + foreign_index = index[foreign_table_name] + self.assertTrue(foreign_tokens.issubset(foreign_index)) + + def test_prev_next(self) -> None: + """ + Test that the prev and next points in sample_data cover all entries and have the correct ordering. + """ + # Register all sample_datas. + sample_to_sample_datas = defaultdict(lambda: []) + for sample_data in self.nuim.sample_data: + sample_to_sample_datas[sample_data['sample_token']].append(sample_data['token']) + + print('Checking prev-next pointers for completeness and correct ordering...') + for sample in self.nuim.sample: + # Compare the above sample_datas against those retrieved by using prev and next pointers. 
+ sd_tokens_pointers = self.nuim.get_sample_content(sample['token']) + sd_tokens_all = sample_to_sample_datas[sample['token']] + self.assertTrue(set(sd_tokens_pointers) == set(sd_tokens_all), + 'Error: Inconsistency in prev/next pointers!') + + timestamps = [] + for sd_token in sd_tokens_pointers: + sample_data = self.nuim.get('sample_data', sd_token) + timestamps.append(sample_data['timestamp']) + self.assertTrue(sorted(timestamps) == timestamps, 'Error: Timestamps not properly sorted!') + + +if __name__ == '__main__': + # Runs the tests without aborting on error. + for nuim_version in ['v1.0-train', 'v1.0-val', 'v1.0-test', 'v1.0-mini']: + print('Running TestForeignKeys for version %s...' % nuim_version) + test = TestForeignKeys(version=nuim_version) + test.test_foreign_keys() + test.test_prev_next() + print() diff --git a/python-sdk/nuimages/utils/__init__.py b/python-sdk/nuimages/utils/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/python-sdk/nuimages/utils/test_nuimages.py b/python-sdk/nuimages/utils/test_nuimages.py new file mode 100644 index 00000000..7e5e2ded --- /dev/null +++ b/python-sdk/nuimages/utils/test_nuimages.py @@ -0,0 +1,26 @@ +# nuScenes dev-kit. +# Code written by Holger Caesar, 2020. + +import os +import unittest + +from nuimages import NuImages + + +class TestNuImages(unittest.TestCase): + + def test_load(self): + """ + Loads up NuImages. + This is intended to simply run the NuImages class to check for import errors, typos, etc. + """ + + assert 'NUIMAGES' in os.environ, 'Set NUIMAGES env. variable to enable tests.' + nuim = NuImages(version='v1.0-mini', dataroot=os.environ['NUIMAGES'], verbose=False) + + # Trivial assert statement + self.assertEqual(nuim.table_root, os.path.join(os.environ['NUIMAGES'], 'v1.0-mini')) + + +if __name__ == '__main__': + unittest.main() diff --git a/python-sdk/nuimages/utils/utils.py b/python-sdk/nuimages/utils/utils.py new file mode 100644 index 00000000..6ce3135c --- /dev/null +++ b/python-sdk/nuimages/utils/utils.py @@ -0,0 +1,106 @@ +# nuScenes dev-kit. +# Code written by Asha Asvathaman & Holger Caesar, 2020. + +import base64 +import os +from typing import List, Dict +import warnings + +import matplotlib.font_manager +from PIL import ImageFont +import numpy as np +from pycocotools import mask as cocomask + + +def annotation_name(attributes: List[dict], + category_name: str, + with_attributes: bool = False) -> str: + """ + Returns the "name" of an annotation, optionally including the attributes. + :param attributes: The attribute dictionary. + :param category_name: Name of the object category. + :param with_attributes: Whether to print the attributes alongside the category name. + :return: A human readable string describing the annotation. + """ + outstr = category_name + + if with_attributes: + atts = [attribute['name'] for attribute in attributes] + if len(atts) > 0: + outstr = outstr + "--" + '.'.join(atts) + + return outstr + + +def mask_decode(mask: dict) -> np.ndarray: + """ + Decode the mask from base64 string to binary string, then feed it to the external pycocotools library to get a mask. + :param mask: The mask dictionary with fields `size` and `counts`. + :return: A numpy array representing the binary mask for this class. + """ + # Note that it is essential to copy the mask here. If we use the same variable we will overwrite the NuImage class + # and cause the Jupyter Notebook to crash on some systems. 
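+    # A shallow copy suffices here, since only the 'counts' entry is replaced below.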
+ new_mask = mask.copy() + new_mask['counts'] = base64.b64decode(mask['counts']) + return cocomask.decode(new_mask) + + +def get_font(fonts_valid: List[str] = None, font_size: int = 15) -> ImageFont: + """ + Check if there is a desired font present in the user's system. If there is, use that font; otherwise, use a default + font. + :param fonts_valid: A list of fonts which are desirable. + :param font_size: The size of the font to set. Note that if the default font is used, then the font size + cannot be set. + :return: An ImageFont object to use as the font in a PIL image. + """ + # If there are no desired fonts supplied, use a hardcoded list of fonts which are desirable. + if fonts_valid is None: + fonts_valid = ['FreeSerif.ttf', 'FreeSans.ttf', 'Century.ttf', 'Calibri.ttf', 'arial.ttf'] + + # Find a list of fonts within the user's system. + fonts_in_sys = matplotlib.font_manager.findSystemFonts(fontpaths=None, fontext='ttf') + # Sort the list of fonts to ensure that the desired fonts are always found in the same order. + fonts_in_sys = sorted(fonts_in_sys) + # Of all the fonts found in the user's system, check if any of them are desired. + for font_in_sys in fonts_in_sys: + if any(os.path.basename(font_in_sys) in s for s in fonts_valid): + return ImageFont.truetype(font_in_sys, font_size) + + # If none of the fonts in the user's system are desirable, then use the default font. + warnings.warn('No suitable fonts were found in your system. ' + 'A default font will be used instead (the font size will not be adjustable).') + return ImageFont.load_default() + + +def name_to_index_mapping(category: List[dict]) -> Dict[str, int]: + """ + Build a mapping from name to index to look up index in O(1) time. + :param category: The nuImages category table. + :return: The mapping from category name to category index. + """ + # The 0 index is reserved for non-labelled background; thus, the categories should start from index 1. + # Also, sort the categories before looping so that the order is always the same (alphabetical). + name_to_index = dict() + i = 1 + sorted_category: List = sorted(category.copy(), key=lambda k: k['name']) + for c in sorted_category: + # Ignore the vehicle.ego and flat.driveable_surface classes first; they will be mapped later. + if c['name'] != 'vehicle.ego' and c['name'] != 'flat.driveable_surface': + name_to_index[c['name']] = i + i += 1 + + assert max(name_to_index.values()) < 24, \ + 'Error: There are {} classes (excluding vehicle.ego and flat.driveable_surface), ' \ + 'but there should be 23. Please check your category.json'.format(max(name_to_index.values())) + + # Now map the vehicle.ego and flat.driveable_surface classes. + name_to_index['flat.driveable_surface'] = 24 + name_to_index['vehicle.ego'] = 31 + + # Ensure that each class name is uniquely paired with a class index, and vice versa. + assert len(name_to_index) == len(set(name_to_index.values())), \ + 'Error: There are {} class names but {} class indices'.format(len(name_to_index), + len(set(name_to_index.values()))) + + return name_to_index diff --git a/python-sdk/nuscenes/can_bus/README.md b/python-sdk/nuscenes/can_bus/README.md index bd1a10ff..a3fe106b 100644 --- a/python-sdk/nuscenes/can_bus/README.md +++ b/python-sdk/nuscenes/can_bus/README.md @@ -80,8 +80,8 @@ Format: `scene_0001_pose.json` The current pose of the ego vehicle, sampled at 50Hz. - accel: \[3\] Acceleration vector in the ego vehicle frame in m/s/s. - orientation: \[4\] The rotation vector in the ego vehicle frame. 
-- pos: \[3\] The position (x, y, z) in meters in the global frame. This is identical to the [nuScenes ego pose](https://github.com/nutonomy/nuscenes-devkit/blob/master/schema.md#ego_pose), but sampled at a higher frequency. -- rotation_rate: \[3\] The angular velocity vector of the vehicle in rad/s. This is expressed in the ego vehicle frame. +- pos: \[3\] The position (x, y, z) in meters in the global frame. This is identical to the [nuScenes ego pose](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/schema_nuscenes.md#ego_pose), but sampled at a higher frequency. +- rotation_rate: \[3\] The angular velocity vector of the vehicle in rad/s. This is expressed in the ego vehicle frame. - vel: \[3\] The velocity in m/s, expressed in the ego vehicle frame. ### Steer Angle Feedback diff --git a/python-sdk/nuscenes/can_bus/can_bus_api.py b/python-sdk/nuscenes/can_bus/can_bus_api.py index 55c58230..5d5e0f6c 100644 --- a/python-sdk/nuscenes/can_bus/can_bus_api.py +++ b/python-sdk/nuscenes/can_bus/can_bus_api.py @@ -101,7 +101,7 @@ def plot_baseline_route(self, plt.figure() plt.plot(route[:, 0], route[:, 1]) plt.plot(pose[:, 0], pose[:, 1]) - plt.plot(pose[0, 0], pose[0, 1], 'rx', MarkerSize=10) + plt.plot(pose[0, 0], pose[0, 1], 'rx', markersize=10) plt.legend(('Route', 'Pose', 'Start')) plt.xlabel('Map coordinate x in m') plt.ylabel('Map coordinate y in m') diff --git a/python-sdk/nuscenes/eval/detection/README.md b/python-sdk/nuscenes/eval/detection/README.md index cd2dcfe9..92f4ca74 100644 --- a/python-sdk/nuscenes/eval/detection/README.md +++ b/python-sdk/nuscenes/eval/detection/README.md @@ -47,7 +47,7 @@ Note that the [evaluation server](http://evalai.cloudcv.org/web/challenges/chall ## Submission rules ### Detection-specific rules -* The maximum time window of past sensor data and ego poses that may be used at inference time is approximately 0.5s (at most 6 camera images, 6 radar sweeps and 10 lidar sweeps). At training time there are no restrictions. +* The maximum time window of past sensor data and ego poses that may be used at inference time is approximately 0.5s (at most 6 *past* camera images, 6 *past* radar sweeps and 10 *past* lidar sweeps). At training time there are no restrictions. ### General rules * We release annotations for the train and val set, but not for the test set. @@ -112,7 +112,7 @@ Some of these only have a handful of samples. Hence we merge similar classes and remove rare classes. This results in 10 classes for the detection challenge. Below we show the table of detection classes and their counterparts in the nuScenes dataset. -For more information on the classes and their frequencies, see [this page](https://www.nuscenes.org/data-annotation). +For more information on the classes and their frequencies, see [this page](https://www.nuscenes.org/nuscenes#data-annotation). 
| nuScenes detection class| nuScenes general class | | --- | --- | diff --git a/python-sdk/nuscenes/eval/detection/tests/test_evaluate.py b/python-sdk/nuscenes/eval/detection/tests/test_evaluate.py index 02accd91..0808b030 100644 --- a/python-sdk/nuscenes/eval/detection/tests/test_evaluate.py +++ b/python-sdk/nuscenes/eval/detection/tests/test_evaluate.py @@ -74,7 +74,7 @@ def random_attr(name: str) -> str: if nusc.get('scene', sample['scene_token'])['name'] in splits[split]: val_samples.append(sample) - for sample in tqdm(val_samples): + for sample in tqdm(val_samples, leave=False): sample_res = [] for ann_token in sample['anns']: ann = nusc.get('sample_annotation', ann_token) diff --git a/python-sdk/nuscenes/eval/prediction/tests/test_metrics.py b/python-sdk/nuscenes/eval/prediction/tests/test_metrics.py index caadac9a..26e37b9e 100644 --- a/python-sdk/nuscenes/eval/prediction/tests/test_metrics.py +++ b/python-sdk/nuscenes/eval/prediction/tests/test_metrics.py @@ -275,7 +275,7 @@ class TestOffRoadRate(unittest.TestCase): def _do_test(self, map_name, predictions, answer): with patch.object(PredictHelper, 'get_map_name_from_sample_token') as get_map_name: get_map_name.return_value = map_name - nusc = NuScenes('v1.0-mini', dataroot=os.environ['NUSCENES']) + nusc = NuScenes('v1.0-mini', dataroot=os.environ['NUSCENES'], verbose=False) helper = PredictHelper(nusc) off_road_rate = metrics.OffRoadRate(helper, [metrics.RowMean()]) diff --git a/python-sdk/nuscenes/eval/tracking/README.md b/python-sdk/nuscenes/eval/tracking/README.md index 578727eb..6494f402 100644 --- a/python-sdk/nuscenes/eval/tracking/README.md +++ b/python-sdk/nuscenes/eval/tracking/README.md @@ -28,9 +28,9 @@ They are based upon the [nuScenes dataset](http://www.nuScenes.org) \[1\] and th # Getting started To participate in the tracking challenge you should first [get familiar with the nuScenes dataset and install it](https://github.com/nutonomy/nuscenes-devkit/blob/master/README.md). -In particular, the [tutorial](https://www.nuscenes.org/tutorial) explains how to use the various database tables. +In particular, the [tutorial](https://www.nuscenes.org/nuscenes#tutorials) explains how to use the various database tables. The tutorial also shows how to retrieve the images, lidar pointclouds and annotations for each sample (timestamp). -To retrieve the instance/track of an object, take a look at the [instance table](https://github.com/nutonomy/nuscenes-devkit/blob/master/schema.md#instance). +To retrieve the instance/track of an object, take a look at the [instance table](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/schema_nuscenes.md#instance). Now you are ready to train your tracking algorithm on the dataset. If you are only interested in tracking (as opposed to detection), you can use the provided detections for several state-of-the-art methods [below](#baselines). To evaluate the tracking results, use `evaluate.py` in the [eval folder](https://github.com/nutonomy/nuscenes-devkit/tree/master/python-sdk/nuscenes/eval/tracking). @@ -101,7 +101,7 @@ submission { } ``` For the predictions we create a new database table called `sample_result`. -The `sample_result` table is designed to mirror the [`sample_annotation`](https://github.com/nutonomy/nuscenes-devkit/blob/master/schema.md#sample_annotation) table. +The `sample_result` table is designed to mirror the [`sample_annotation`](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/schema_nuscenes.md#sample_annotation) table. 
This allows for processing of results and annotations using the same tools. A `sample_result` is a dictionary defined as follows: ``` @@ -122,12 +122,12 @@ sample_result { Note that except for the `tracking_*` fields the result format is identical to the [detection challenge](https://www.nuscenes.org/object-detection). ## Classes -The nuScenes dataset comes with annotations for 23 classes ([details](https://www.nuscenes.org/data-annotation)). +The nuScenes dataset comes with annotations for 23 classes ([details](https://www.nuscenes.org/nuscenes#data-annotation)). Some of these only have a handful of samples. Hence we merge similar classes and remove rare classes. From these *detection challenge classes* we further remove the classes *barrier*, *trafficcone* and *construction_vehicle*, as these are typically static. Below we show the table of the 7 tracking classes and their counterparts in the nuScenes dataset. -For more information on the classes and their frequencies, see [this page](https://www.nuscenes.org/data-annotation). +For more information on the classes and their frequencies, see [this page](https://www.nuscenes.org/nuscenes#data-annotation). | nuScenes general class | nuScenes tracking class | | --- | --- | diff --git a/python-sdk/nuscenes/lidarseg/__init__.py b/python-sdk/nuscenes/lidarseg/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/python-sdk/nuscenes/lidarseg/class_histogram.py b/python-sdk/nuscenes/lidarseg/class_histogram.py new file mode 100644 index 00000000..d513fa6a --- /dev/null +++ b/python-sdk/nuscenes/lidarseg/class_histogram.py @@ -0,0 +1,199 @@ +import os +import time +from typing import List, Tuple + +import matplotlib.pyplot as plt +from matplotlib.ticker import FuncFormatter, ScalarFormatter +import matplotlib.transforms as mtrans +import numpy as np + +from nuscenes import NuScenes +from nuscenes.utils.color_map import get_colormap + + +def truncate_class_name(class_name: str) -> str: + """ + Truncate a given class name according to a pre-defined map. + :param class_name: The long form (i.e. original form) of the class name. + :return: The truncated form of the class name. + """ + + string_mapper = { + "noise": 'noise', + "human.pedestrian.adult": 'adult', + "human.pedestrian.child": 'child', + "human.pedestrian.wheelchair": 'wheelchair', + "human.pedestrian.stroller": 'stroller', + "human.pedestrian.personal_mobility": 'p.mobility', + "human.pedestrian.police_officer": 'police', + "human.pedestrian.construction_worker": 'worker', + "animal": 'animal', + "vehicle.car": 'car', + "vehicle.motorcycle": 'motorcycle', + "vehicle.bicycle": 'bicycle', + "vehicle.bus.bendy": 'bus.bendy', + "vehicle.bus.rigid": 'bus.rigid', + "vehicle.truck": 'truck', + "vehicle.construction": 'constr. 
veh', + "vehicle.emergency.ambulance": 'ambulance', + "vehicle.emergency.police": 'police car', + "vehicle.trailer": 'trailer', + "movable_object.barrier": 'barrier', + "movable_object.trafficcone": 'trafficcone', + "movable_object.pushable_pullable": 'push/pullable', + "movable_object.debris": 'debris', + "static_object.bicycle_rack": 'bicycle racks', + "flat.driveable_surface": 'driveable', + "flat.sidewalk": 'sidewalk', + "flat.terrain": 'terrain', + "flat.other": 'flat.other', + "static.manmade": 'manmade', + "static.vegetation": 'vegetation', + "static.other": 'static.other', + "vehicle.ego": "ego" + } + + return string_mapper[class_name] + + +def render_lidarseg_histogram(nusc: NuScenes, + sort_by: str = 'count_desc', + chart_title: str = None, + x_label: str = None, + y_label: str = "Lidar points (logarithmic)", + y_log_scale: bool = True, + verbose: bool = True, + font_size: int = 20, + save_as_img_name: str = None) -> None: + """ + Render a histogram for the given nuScenes split. + :param nusc: A nuScenes object. + :param sort_by: How to sort the classes: + - count_desc: Sort the classes by the number of points belonging to each class, in descending order. + - count_asc: Sort the classes by the number of points belonging to each class, in ascending order. + - name: Sort the classes by alphabetical order. + - index: Sort the classes by their indices. + :param chart_title: Title to display on the histogram. + :param x_label: Title to display on the x-axis of the histogram. + :param y_label: Title to display on the y-axis of the histogram. + :param y_log_scale: Whether to use log scale on the y-axis. + :param verbose: Whether to display plot in a window after rendering. + :param font_size: Size of the font to use for the histogram. + :param save_as_img_name: Path (including image name and extension) to save the histogram as. + """ + + print('Calculating stats for nuScenes-lidarseg...') + start_time = time.time() + + # Get the statistics for the given nuScenes split. + class_names, counts = get_lidarseg_stats(nusc, sort_by=sort_by) + + print('Calculated stats for {} point clouds in {:.1f} seconds.\n====='.format( + len(nusc.lidarseg), time.time() - start_time)) + + # Create an array with the colors to use. + cmap = get_colormap() + colors = ['#%02x%02x%02x' % tuple(cmap[cn]) for cn in class_names] # Convert from RGB to hex. + + # Make the class names shorter so that they do not take up much space in the plot. + class_names = [truncate_class_name(cn) for cn in class_names] + + # Start a plot. + fig, ax = plt.subplots(figsize=(16, 9)) + plt.margins(x=0.005) # Add some padding to the left and right limits of the x-axis for aesthetics. + ax.set_axisbelow(True) # Ensure that axis ticks and gridlines will be below all other ploy elements. + ax.yaxis.grid(color='white', linewidth=2) # Show horizontal gridlines. + ax.set_facecolor('#eaeaf2') # Set background of plot. + ax.spines['top'].set_visible(False) # Remove top border of plot. + ax.spines['right'].set_visible(False) # Remove right border of plot. + ax.spines['bottom'].set_visible(False) # Remove bottom border of plot. + ax.spines['left'].set_visible(False) # Remove left border of plot. + + # Plot the histogram. + ax.bar(class_names, counts, color=colors) + assert len(class_names) == len(ax.get_xticks()), \ + 'There are {} classes, but {} are shown on the x-axis'.format(len(class_names), len(ax.get_xticks())) + + # Format the x-axis. 
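+    # The tick labels are rotated below so that the truncated class names do not overlap.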
+ ax.set_xlabel(x_label, fontsize=font_size) + ax.set_xticklabels(class_names, rotation=45, horizontalalignment='right', + fontweight='light', fontsize=font_size) + + # Shift the class names on the x-axis slightly to the right for aesthetics. + trans = mtrans.Affine2D().translate(10, 0) + for t in ax.get_xticklabels(): + t.set_transform(t.get_transform() + trans) + + # Format the y-axis. + ax.set_ylabel(y_label, fontsize=font_size) + ax.set_yticklabels(counts, size=font_size) + + # Transform the y-axis to log scale. + if y_log_scale: + ax.set_yscale("log") + + # Display the y-axis using nice scientific notation. + formatter = ScalarFormatter(useOffset=False, useMathText=True) + ax.yaxis.set_major_formatter( + FuncFormatter(lambda x, pos: "${}$".format(formatter._formatSciNotation('%1.10e' % x)))) + + if chart_title: + ax.set_title(chart_title, fontsize=font_size) + + if save_as_img_name: + fig = ax.get_figure() + plt.tight_layout() + fig.savefig(save_as_img_name) + + if verbose: + plt.show() + + +def get_lidarseg_stats(nusc: NuScenes, sort_by: str = 'count_desc') -> Tuple[List[str], List[int]]: + """ + Get the number of points belonging to each class for the given nuScenes split. + :param nusc: A NuScenes object. + :param sort_by: How to sort the classes: + - count_desc: Sort the classes by the number of points belonging to each class, in descending order. + - count_asc: Sort the classes by the number of points belonging to each class, in ascending order. + - name: Sort the classes by alphabetical order. + - index: Sort the classes by their indices. + :return: A list of class names and a list of the corresponding number of points for each class. + """ + + # Initialize an array of zeroes, one for each class name. + lidarseg_counts = [0] * len(nusc.lidarseg_idx2name_mapping) + + for record_lidarseg in nusc.lidarseg: + lidarseg_labels_filename = os.path.join(nusc.dataroot, record_lidarseg['filename']) + + points_label = np.fromfile(lidarseg_labels_filename, dtype=np.uint8) + indices = np.bincount(points_label) + ii = np.nonzero(indices)[0] + for class_idx, class_count in zip(ii, indices[ii]): + lidarseg_counts[class_idx] += class_count + + lidarseg_counts_dict = dict() + for i in range(len(lidarseg_counts)): + lidarseg_counts_dict[nusc.lidarseg_idx2name_mapping[i]] = lidarseg_counts[i] + + if sort_by == 'count_desc': + out = sorted(lidarseg_counts_dict.items(), key=lambda item: item[1], reverse=True) + elif sort_by == 'count_asc': + out = sorted(lidarseg_counts_dict.items(), key=lambda item: item[1]) + elif sort_by == 'name': + out = sorted(lidarseg_counts_dict.items()) + elif sort_by == 'index': + out = lidarseg_counts_dict.items() + else: + raise Exception('Error: Invalid sorting mode {}. ' + 'Only `count_desc`, `count_asc`, `name` or `index` are valid.'.format(sort_by)) + + # Get frequency counts of each class in the lidarseg dataset. + class_names = [] + counts = [] + for class_name, count in out: + class_names.append(class_name) + counts.append(count) + + return class_names, counts diff --git a/python-sdk/nuscenes/lidarseg/lidarseg_utils.py b/python-sdk/nuscenes/lidarseg/lidarseg_utils.py new file mode 100644 index 00000000..84520491 --- /dev/null +++ b/python-sdk/nuscenes/lidarseg/lidarseg_utils.py @@ -0,0 +1,218 @@ +# nuScenes dev-kit. +# Code written by Fong Whye Kit, 2020. 
+ +from typing import Dict, Iterable, List, Tuple + +import cv2 +import matplotlib.patches as mpatches +import matplotlib.pyplot as plt +import numpy as np +from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas + + +def get_stats(points_label: np.array, num_classes: int) -> List[int]: + """ + Get frequency of each label in a point cloud. + :param num_classes: The number of classes. + :param points_label: A numPy array which contains the labels of the point cloud; e.g. np.array([2, 1, 34, ..., 38]) + :return: An array which contains the counts of each label in the point cloud. The index of the point cloud + corresponds to the index of the class label. E.g. [0, 2345, 12, 451] means that there are no points in + class 0, there are 2345 points in class 1, there are 12 points in class 2 etc. + """ + + lidarseg_counts = [0] * num_classes # Create as many bins as there are classes, and initialize all bins as 0. + + indices: np.ndarray = np.bincount(points_label) + ii = np.nonzero(indices)[0] + + for class_idx, class_count in zip(ii, indices[ii]): + lidarseg_counts[class_idx] += class_count # Increment the count for the particular class name. + + return lidarseg_counts + + +def plt_to_cv2(points: np.array, coloring: np.array, im, imsize: Tuple[int, int] = (640, 360), dpi: int = 100): + """ + Converts a scatter plot in matplotlib to an image in cv2. This is useful as cv2 is unable to do + scatter plots. + :param points: A numPy array (of size [2 x num_points] and type float) representing the pointcloud. + :param coloring: A numPy array (of size [num_points] containing the color (in RGB, normalized + between 0 and 1) for each point. + :param im: An image (e.g. a camera view) to put the scatter plot on. + :param imsize: Size of image to render. The larger the slower this will run. + :param dpi: Resolution of the output figure. + :return: cv2 image with the scatter plot. + """ + # Render lidarseg labels in image. + fig = plt.figure(figsize=(imsize[0] / dpi, imsize[1] / dpi), dpi=dpi) + ax = plt.Axes(fig, [0., 0., 1., 1.]) + fig.add_axes(ax) + + ax.axis('off') + ax.margins(0, 0) + + ax.imshow(im) + ax.scatter(points[0, :], points[1, :], c=coloring, s=5) + + # Convert from pyplot to cv2. + canvas = FigureCanvas(fig) + canvas.draw() + mat = np.array(canvas.renderer.buffer_rgba()).astype('uint8') # Put pixel buffer in numpy array. + mat = cv2.cvtColor(mat, cv2.COLOR_RGB2BGR) + mat = cv2.resize(mat, imsize) + + # Clear off the current figure to prevent an accumulation of figures in memory. + plt.close('all') + + return mat + + +def colormap_to_colors(colormap: Dict[str, Iterable[int]], name2idx: Dict[str, int]) -> np.ndarray: + """ + Create an array of RGB values from a colormap. Note that the RGB values are normalized + between 0 and 1, not 0 and 255. + :param colormap: A dictionary containing the mapping from class names to RGB values. + :param name2idx: A dictionary containing the mapping form class names to class index. + :return: An array of colors. + """ + colors = [] + for i, (k, v) in enumerate(colormap.items()): + # Ensure that the indices from the colormap is same as the class indices. + assert i == name2idx[k], 'Error: {} is of index {}, ' \ + 'but it is of index {} in the colormap.'.format(k, name2idx[k], i) + colors.append(v) + + colors = np.array(colors) / 255 # Normalize RGB values to be between 0 and 1 for each channel. 
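+    # colors[i] now holds the normalized RGB value of the class whose index is i according to name2idx.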
+ + return colors + + +def filter_colors(colors: np.array, classes_to_display: np.array) -> np.ndarray: + """ + Given an array of RGB colors and a list of classes to display, return a colormap (in RGBA) with the opacity + of the labels to be display set to 1.0 and those to be hidden set to 0.0 + :param colors: [n x 3] array where each row consist of the RGB values for the corresponding class index + :param classes_to_display: An array of classes to display (e.g. [1, 8, 32]). The array need not be ordered. + :return: (colormap ). + + colormap = np.array([[R1, G1, B1], colormap = np.array([[1.0, 1.0, 1.0, 0.0], + [R2, G2, B2], ------> [R2, G2, B2, 1.0], + ..., ..., + Rn, Gn, Bn]]) [1.0, 1.0, 1.0, 0.0]]) + """ + for i in range(len(colors)): + if i not in classes_to_display: + colors[i] = [1.0, 1.0, 1.0] # Mask labels to be hidden with 1.0 in all channels. + + # Convert the RGB colormap to an RGBA array, with the alpha channel set to zero whenever the R, G and B channels + # are all equal to 1.0. + alpha = np.array([~np.all(colors == 1.0, axis=1) * 1.0]) + colors = np.concatenate((colors, alpha.T), axis=1) + + return colors + + +def get_labels_in_coloring(color_legend: np.ndarray, coloring: np.ndarray) -> List[int]: + """ + Find the class labels which are present in a pointcloud which has been projected onto an image. + :param color_legend: A list of arrays in which each array corresponds to the RGB values of a class. + :param coloring: A list of arrays in which each array corresponds to the RGB values of a point in the portion of + the pointcloud projected onto the image. + :return: List of class indices which are present in the image. + """ + + def _array_in_list(arr: List, list_arrays: List) -> bool: + """ + Check if an array is in a list of arrays. + :param: arr: An array. + :param: list_arrays: A list of arrays. + :return: Whether the given array is in the list of arrays. + """ + # Credits: https://stackoverflow.com/questions/23979146/check-if-numpy-array-is-in-list-of-numpy-arrays + return next((True for elem in list_arrays if np.array_equal(elem, arr)), False) + + filter_lidarseg_labels = [] + + # Get only the distinct colors present in the pointcloud so that we will not need to compare each color in + # the color legend with every single point in the pointcloud later. + distinct_colors = list(set(tuple(c) for c in coloring)) + + for i, color in enumerate(color_legend): + if _array_in_list(color, distinct_colors): + filter_lidarseg_labels.append(i) + + return filter_lidarseg_labels + + +def create_lidarseg_legend(labels_to_include_in_legend: List[int], + idx2name: Dict[int, str], name2color: Dict[str, Tuple[int, int, int]], + loc: str = 'upper center', ncol: int = 3, bbox_to_anchor: Tuple = None): + """ + Given a list of class indices, the mapping from class index to class name, and the mapping from class name + to class color, produce a legend which shows the color and the corresponding class name. + :param labels_to_include_in_legend: Labels to show in the legend. + :param idx2name: The mapping from class index to class name. + :param name2color: The mapping from class name to class color. + :param loc: The location of the legend. + :param ncol: The number of columns that the legend has. + :param bbox_to_anchor: A 2-tuple (x, y) which places the top-left corner of the legend specified by loc + at x, y. The origin is at the bottom-left corner and x and y are normalized between + 0 and 1 (i.e. x > 1 and / or y > 1 will place the legend outside the plot. 
+ """ + + recs = [] + classes_final = [] + classes = [name for idx, name in sorted(idx2name.items())] + + for i in range(len(classes)): + if labels_to_include_in_legend is None or i in labels_to_include_in_legend: + name = classes[i] + recs.append(mpatches.Rectangle((0, 0), 1, 1, fc=np.array(name2color[name]) / 255)) + + # Truncate class names to only first 25 chars so that legend is not excessively long. + classes_final.append(classes[i][:25]) + + plt.legend(recs, classes_final, loc=loc, ncol=ncol, bbox_to_anchor=bbox_to_anchor) + + +def paint_points_label(lidarseg_labels_filename: str, filter_lidarseg_labels: List[int], + name2idx: Dict[str, int], colormap: Dict[str, Tuple[int, int, int]]) -> np.ndarray: + """ + Paint each label in a pointcloud with the corresponding RGB value, and set the opacity of the labels to + be shown to 1 (the opacity of the rest will be set to 0); e.g.: + [30, 5, 12, 34, ...] ------> [[R30, G30, B30, 0], [R5, G5, B5, 1], [R34, G34, B34, 1], ...] + :param lidarseg_labels_filename: Path to the .bin file containing the labels. + :param filter_lidarseg_labels: The labels for which to set opacity to zero; this is to hide those points + thereby preventing them from being displayed. + :param name2idx: A dictionary containing the mapping from class names to class indices. + :param colormap: A dictionary containing the mapping from class names to RGB values. + :return: A numpy array which has length equal to the number of points in the pointcloud, and each value is + a RGBA array. + """ + + # Load labels from .bin file. + points_label = np.fromfile(lidarseg_labels_filename, dtype=np.uint8) # [num_points] + + # Given a colormap (class name -> RGB color) and a mapping from class name to class index, + # get an array of RGB values where each color sits at the index in the array corresponding + # to the class index. + colors = colormap_to_colors(colormap, name2idx) # Shape: [num_class, 3] + + if filter_lidarseg_labels is not None: + # Ensure that filter_lidarseg_labels is an iterable. + assert isinstance(filter_lidarseg_labels, (list, np.ndarray)), \ + 'Error: filter_lidarseg_labels should be a list of class indices, eg. [9], [10, 21].' + + # Check that class indices in filter_lidarseg_labels are valid. + assert all([0 <= x < len(name2idx) for x in filter_lidarseg_labels]), \ + 'All class indices in filter_lidarseg_labels should ' \ + 'be between 0 and {}'.format(len(name2idx) - 1) + + # Filter to get only the colors of the desired classes; this is done by setting the + # alpha channel of the classes to be viewed to 1, and the rest to 0. + colors = filter_colors(colors, filter_lidarseg_labels) # Shape: [num_class, 4] + + # Paint each label with its respective RGBA value. + coloring = colors[points_label] # Shape: [num_points, 4] + + return coloring diff --git a/python-sdk/nuscenes/map_expansion/map_api.py b/python-sdk/nuscenes/map_expansion/map_api.py index e69bb77a..5bf887a0 100644 --- a/python-sdk/nuscenes/map_expansion/map_api.py +++ b/python-sdk/nuscenes/map_expansion/map_api.py @@ -912,7 +912,7 @@ def render_record(self, global_ax.legend() # Adds the zoomed in effect to the plot. - mark_inset(global_ax, local_ax, loc1=2, loc2=4, color='black') + mark_inset(global_ax, local_ax, loc1=2, loc2=4) return fig, (global_ax, local_ax) diff --git a/python-sdk/nuscenes/nuscenes.py b/python-sdk/nuscenes/nuscenes.py index dd5425c6..e40eefeb 100644 --- a/python-sdk/nuscenes/nuscenes.py +++ b/python-sdk/nuscenes/nuscenes.py @@ -1,13 +1,14 @@ # nuScenes dev-kit. 
-# Code written by Oscar Beijbom, 2018. +# Code written by Oscar Beijbom, Holger Caesar & Fong Whye Kit, 2020. import json import math +import os import os.path as osp import sys import time from datetime import datetime -from typing import Tuple, List +from typing import Tuple, List, Iterable import cv2 import matplotlib.pyplot as plt @@ -19,9 +20,12 @@ from pyquaternion import Quaternion from tqdm import tqdm +from nuscenes.lidarseg.lidarseg_utils import colormap_to_colors, plt_to_cv2, get_stats, \ + get_labels_in_coloring, create_lidarseg_legend, paint_points_label from nuscenes.utils.data_classes import LidarPointCloud, RadarPointCloud, Box from nuscenes.utils.geometry_utils import view_points, box_in_image, BoxVisibility, transform_matrix from nuscenes.utils.map_mask import MapMask +from nuscenes.utils.color_map import get_colormap PYTHON_VERSION = sys.version_info[0] @@ -48,6 +52,7 @@ def __init__(self, """ self.version = version self.dataroot = dataroot + self.verbose = verbose self.table_names = ['category', 'attribute', 'visibility', 'instance', 'sensor', 'calibrated_sensor', 'ego_pose', 'log', 'scene', 'sample', 'sample_data', 'sample_annotation', 'map'] @@ -72,6 +77,37 @@ def __init__(self, self.sample_annotation = self.__load_table__('sample_annotation') self.map = self.__load_table__('map') + # Initialize the colormap which maps from class names to RGB values. + self.colormap = get_colormap() + + # If available, also load the lidarseg annotations. + if osp.exists(osp.join(self.table_root, 'lidarseg.json')): + if self.verbose: + print('Loading nuScenes-lidarseg...') + + self.lidarseg = self.__load_table__('lidarseg') + num_lidarseg_recs = len(self.lidarseg) + num_bin_files = len([name for name in os.listdir(os.path.join(self.dataroot, 'lidarseg', self.version)) + if name.endswith('.bin')]) + assert num_lidarseg_recs == num_bin_files, \ + 'Error: There are {} .bin files but {} lidarseg records.'.format(num_bin_files, num_lidarseg_recs) + self.table_names.append('lidarseg') + + # Create mapping from class index to class name, and vice versa, for easy lookup later on. + self.lidarseg_idx2name_mapping = dict() + self.lidarseg_name2idx_mapping = dict() + for lidarseg_category in self.category: + # Check that the category records contain both the keys 'name' and 'index'. + assert 'index' in lidarseg_category.keys(), \ + 'Please use the category.json that comes with nuScenes-lidarseg, and not the old category.json.' + + self.lidarseg_idx2name_mapping[lidarseg_category['index']] = lidarseg_category['name'] + self.lidarseg_name2idx_mapping[lidarseg_category['name']] = lidarseg_category['index'] + + # Sort the colormap to ensure that it is ordered according to the indices in self.category. + self.colormap = dict({c['name']: self.colormap[c['name']] + for c in sorted(self.category, key=lambda k: k['index'])}) + # If available, also load the image_annotations table created by export_2d_annotations_as_json(). if osp.exists(osp.join(self.table_root, 'image_annotations.json')): self.image_annotations = self.__load_table__('image_annotations') @@ -83,12 +119,12 @@ def __init__(self, if verbose: for table in self.table_names: print("{} {},".format(len(getattr(self, table)), table)) - print("Done loading in {:.1f} seconds.\n======".format(time.time() - start_time)) + print("Done loading in {:.3f} seconds.\n======".format(time.time() - start_time)) # Make reverse indexes for common lookups. 
        self.__make_reverse_index__(verbose)

-        # Initialize NuScenesExplorer class
+        # Initialize NuScenesExplorer class.
        self.explorer = NuScenesExplorer(self)

    @property
@@ -376,9 +412,78 @@ def box_velocity(self, sample_annotation_token: str, max_time_diff: float = 1.5)
        else:
            return pos_diff / time_diff

+    def get_sample_lidarseg_stats(self, sample_token: str, sort_by: str = 'count',
+                                  lidarseg_preds_bin_path: str = None) -> None:
+        """
+        Print the number of points for each class in the lidar pointcloud of a sample. Classes which have no
+        points in the pointcloud will not be printed.
+        :param sample_token: Sample token.
+        :param sort_by: One of three options: count / name / index. If `count`, the stats will be printed in
+                        ascending order of frequency; if `name`, the stats will be printed alphabetically
+                        according to class name; if `index`, the stats will be printed in ascending order of
+                        class index.
+        :param lidarseg_preds_bin_path: A path to the .bin file which contains the user's lidar segmentation
+                                        predictions for the sample.
+        """
+        assert hasattr(self, 'lidarseg'), 'Error: You have no lidarseg data; unable to get ' \
+                                          'statistics for segmentation of the point cloud.'
+        assert sort_by in ['count', 'name', 'index'], 'Error: sort_by can only be one of the following: ' \
+                                                      'count / name / index.'
+
+        sample_rec = self.get('sample', sample_token)
+        ref_sd_token = sample_rec['data']['LIDAR_TOP']
+        ref_sd_record = self.get('sample_data', ref_sd_token)
+
+        # Ensure that lidar pointcloud is from a keyframe.
+        assert ref_sd_record['is_key_frame'], 'Error: Only pointclouds which are keyframes have ' \
+                                              'lidar segmentation labels. Rendering aborted.'
+
+        if lidarseg_preds_bin_path:
+            lidarseg_labels_filename = lidarseg_preds_bin_path
+            assert os.path.exists(lidarseg_labels_filename), \
+                'Error: Unable to find {} to load the predictions for sample token {} ' \
+                '(lidar sample data token {}) from.'.format(lidarseg_labels_filename, sample_token, ref_sd_token)
+
+            header = '===== Statistics for ' + sample_token + ' (predictions) ====='
+        else:
+            assert len(self.lidarseg) > 0, 'Error: There are no ground truth labels found for nuScenes-lidarseg ' \
+                                           'for {}. Are you loading the test set? 
\nIf you want to see the sample ' \ + 'statistics for your predictions, pass a path to the appropriate .bin ' \ + 'file using the lidarseg_preds_bin_path argument.'.format(self.version) + lidar_sd_token = self.get('sample', sample_token)['data']['LIDAR_TOP'] + lidarseg_labels_filename = os.path.join(self.dataroot, + self.get('lidarseg', lidar_sd_token)['filename']) + + header = '===== Statistics for ' + sample_token + ' =====' + print(header) + + points_label = np.fromfile(lidarseg_labels_filename, dtype=np.uint8) + lidarseg_counts = get_stats(points_label, len(self.lidarseg_idx2name_mapping)) + + lidarseg_counts_dict = dict() + for i in range(len(lidarseg_counts)): + lidarseg_counts_dict[self.lidarseg_idx2name_mapping[i]] = lidarseg_counts[i] + + if sort_by == 'count': + out = sorted(lidarseg_counts_dict.items(), key=lambda item: item[1]) + elif sort_by == 'name': + out = sorted(lidarseg_counts_dict.items()) + else: + out = lidarseg_counts_dict.items() + + for class_name, count in out: + if count > 0: + idx = self.lidarseg_name2idx_mapping[class_name] + print('{:3} {:40} n={:12,}'.format(idx, class_name, count)) + + print('=' * len(header)) + def list_categories(self) -> None: self.explorer.list_categories() + def list_lidarseg_categories(self, sort_by: str = 'count') -> None: + self.explorer.list_lidarseg_categories(sort_by=sort_by) + def list_attributes(self) -> None: self.explorer.list_attributes() @@ -390,22 +495,45 @@ def list_sample(self, sample_token: str) -> None: def render_pointcloud_in_image(self, sample_token: str, dot_size: int = 5, pointsensor_channel: str = 'LIDAR_TOP', camera_channel: str = 'CAM_FRONT', out_path: str = None, - render_intensity: bool = False) -> None: + render_intensity: bool = False, + show_lidarseg: bool = False, + filter_lidarseg_labels: List = None, + show_lidarseg_legend: bool = False, + verbose: bool = True, + lidarseg_preds_bin_path: str = None) -> None: self.explorer.render_pointcloud_in_image(sample_token, dot_size, pointsensor_channel=pointsensor_channel, camera_channel=camera_channel, out_path=out_path, - render_intensity=render_intensity) + render_intensity=render_intensity, + show_lidarseg=show_lidarseg, + filter_lidarseg_labels=filter_lidarseg_labels, + show_lidarseg_legend=show_lidarseg_legend, + verbose=verbose, + lidarseg_preds_bin_path=lidarseg_preds_bin_path) def render_sample(self, sample_token: str, box_vis_level: BoxVisibility = BoxVisibility.ANY, nsweeps: int = 1, - out_path: str = None) -> None: - self.explorer.render_sample(sample_token, box_vis_level, nsweeps=nsweeps, out_path=out_path) + out_path: str = None, show_lidarseg: bool = False, + filter_lidarseg_labels: List = None, + lidarseg_preds_bin_path: str = None, verbose: bool = True) -> None: + self.explorer.render_sample(sample_token, box_vis_level, nsweeps=nsweeps, + out_path=out_path, show_lidarseg=show_lidarseg, + filter_lidarseg_labels=filter_lidarseg_labels, + lidarseg_preds_bin_path=lidarseg_preds_bin_path, verbose=verbose) def render_sample_data(self, sample_data_token: str, with_anns: bool = True, box_vis_level: BoxVisibility = BoxVisibility.ANY, axes_limit: float = 40, ax: Axes = None, nsweeps: int = 1, out_path: str = None, underlay_map: bool = True, - use_flat_vehicle_coordinates: bool = True) -> None: + use_flat_vehicle_coordinates: bool = True, + show_lidarseg: bool = False, + show_lidarseg_legend: bool = False, + filter_lidarseg_labels: List = None, + lidarseg_preds_bin_path: str = None, verbose: bool = True) -> None: 
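+        # Thin delegation: the actual drawing is implemented in NuScenesExplorer.render_sample_data() below.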
self.explorer.render_sample_data(sample_data_token, with_anns, box_vis_level, axes_limit, ax, nsweeps=nsweeps, out_path=out_path, underlay_map=underlay_map, - use_flat_vehicle_coordinates=use_flat_vehicle_coordinates) + use_flat_vehicle_coordinates=use_flat_vehicle_coordinates, + show_lidarseg=show_lidarseg, + show_lidarseg_legend=show_lidarseg_legend, + filter_lidarseg_labels=filter_lidarseg_labels, + lidarseg_preds_bin_path=lidarseg_preds_bin_path, verbose=verbose) def render_annotation(self, sample_annotation_token: str, margin: float = 10, view: np.ndarray = np.eye(4), box_vis_level: BoxVisibility = BoxVisibility.ANY, out_path: str = None, @@ -428,6 +556,48 @@ def render_scene_channel(self, scene_token: str, channel: str = 'CAM_FRONT', fre def render_egoposes_on_map(self, log_location: str, scene_tokens: List = None, out_path: str = None) -> None: self.explorer.render_egoposes_on_map(log_location, scene_tokens, out_path=out_path) + def render_scene_channel_lidarseg(self, scene_token: str, + channel: str, + out_folder: str = None, + filter_lidarseg_labels: Iterable[int] = None, + with_anns: bool = False, + render_mode: str = None, + verbose: bool = True, + imsize: Tuple[int, int] = (640, 360), + freq: float = 2, + dpi: int = 150, + lidarseg_preds_folder: str = None) -> None: + self.explorer.render_scene_channel_lidarseg(scene_token, + channel, + out_folder=out_folder, + filter_lidarseg_labels=filter_lidarseg_labels, + with_anns=with_anns, + render_mode=render_mode, + verbose=verbose, + imsize=imsize, + freq=freq, + dpi=dpi, + lidarseg_preds_folder=lidarseg_preds_folder) + + def render_scene_lidarseg(self, scene_token: str, + out_path: str = None, + filter_lidarseg_labels: Iterable[int] = None, + with_anns: bool = False, + imsize: Tuple[int, int] = (640, 360), + freq: float = 2, + verbose: bool = True, + dpi: int = 200, + lidarseg_preds_folder: str = None) -> None: + self.explorer.render_scene_lidarseg(scene_token, + out_path=out_path, + filter_lidarseg_labels=filter_lidarseg_labels, + with_anns=with_anns, + imsize=imsize, + freq=freq, + verbose=verbose, + dpi=dpi, + lidarseg_preds_folder=lidarseg_preds_folder) + class NuScenesExplorer: """ Helper class to list and visualize NuScenes data. These are meant to serve as tutorials and templates for @@ -436,35 +606,26 @@ class NuScenesExplorer: def __init__(self, nusc: NuScenes): self.nusc = nusc - @staticmethod - def get_color(category_name: str) -> Tuple[int, int, int]: + def get_color(self, category_name: str) -> Tuple[int, int, int]: """ Provides the default colors based on the category names. This method works for the general nuScenes categories, as well as the nuScenes detection categories. """ - if 'bicycle' in category_name or 'motorcycle' in category_name: - return 255, 61, 99 # Red - elif 'vehicle' in category_name or category_name in ['bus', 'car', 'construction_vehicle', 'trailer', 'truck']: - return 255, 158, 0 # Orange - elif 'pedestrian' in category_name: - return 0, 0, 230 # Blue - elif 'cone' in category_name or 'barrier' in category_name: - return 0, 0, 0 # Black - else: - return 255, 0, 255 # Magenta + + return self.nusc.colormap[category_name] def list_categories(self) -> None: """ Print categories, counts and stats. These stats only cover the split specified in nusc.version. """ print('Category stats for split %s:' % self.nusc.version) - # Add all annotations + # Add all annotations. 
categories = dict() for record in self.nusc.sample_annotation: if record['category_name'] not in categories: categories[record['category_name']] = [] categories[record['category_name']].append(record['size'] + [record['size'][1] / record['size'][0]]) - # Print stats + # Print stats. for name, stats in sorted(categories.items()): stats = np.array(stats) print('{:27} n={:5}, width={:5.2f}\u00B1{:.2f}, len={:5.2f}\u00B1{:.2f}, height={:5.2f}\u00B1{:.2f}, ' @@ -474,6 +635,53 @@ def list_categories(self) -> None: np.mean(stats[:, 2]), np.std(stats[:, 2]), np.mean(stats[:, 3]), np.std(stats[:, 3]))) + def list_lidarseg_categories(self, sort_by: str = 'count') -> None: + """ + Print categories and counts of the lidarseg data. These stats only cover + the split specified in nusc.version. + :param sort_by: One of three options: count / name / index. If 'count`, the stats will be printed in + ascending order of frequency; if `name`, the stats will be printed alphabetically + according to class name; if `index`, the stats will be printed in ascending order of + class index. + """ + assert hasattr(self.nusc, 'lidarseg'), 'Error: nuScenes-lidarseg not installed!' + assert sort_by in ['count', 'name', 'index'], 'Error: sort_by can only be one of the following: ' \ + 'count / name / index.' + + print('Calculating stats for nuScenes-lidarseg...') + start_time = time.time() + + # Initialize an array of zeroes, one for each class name. + lidarseg_counts = [0] * len(self.nusc.lidarseg_idx2name_mapping) + + for record_lidarseg in self.nusc.lidarseg: + lidarseg_labels_filename = osp.join(self.nusc.dataroot, record_lidarseg['filename']) + + points_label = np.fromfile(lidarseg_labels_filename, dtype=np.uint8) + indices = np.bincount(points_label) + ii = np.nonzero(indices)[0] + for class_idx, class_count in zip(ii, indices[ii]): + lidarseg_counts[class_idx] += class_count + + lidarseg_counts_dict = dict() + for i in range(len(lidarseg_counts)): + lidarseg_counts_dict[self.nusc.lidarseg_idx2name_mapping[i]] = lidarseg_counts[i] + + if sort_by == 'count': + out = sorted(lidarseg_counts_dict.items(), key=lambda item: item[1]) + elif sort_by == 'name': + out = sorted(lidarseg_counts_dict.items()) + else: + out = lidarseg_counts_dict.items() + + # Print frequency counts of each class in the lidarseg dataset. + for class_name, count in out: + idx = self.nusc.lidarseg_name2idx_mapping[class_name] + print('{:3} {:40} nbr_points={:12,}'.format(idx, class_name, count)) + + print('Calculated stats for {} point clouds in {:.1f} seconds.\n====='.format( + len(self.nusc.lidarseg), time.time() - start_time)) + def list_attributes(self) -> None: """ Prints attributes and counts. """ attribute_counts = dict() @@ -533,42 +741,61 @@ def map_pointcloud_to_image(self, pointsensor_token: str, camera_token: str, min_dist: float = 1.0, - render_intensity: bool = False) -> Tuple: + render_intensity: bool = False, + show_lidarseg: bool = False, + filter_lidarseg_labels: List = None, + lidarseg_preds_bin_path: str = None) -> Tuple: """ - Given a point sensor (lidar/radar) token and camera sample_data token, load point-cloud and map it to the image + Given a point sensor (lidar/radar) token and camera sample_data token, load pointcloud and map it to the image plane. :param pointsensor_token: Lidar/radar sample_data token. :param camera_token: Camera sample_data token. :param min_dist: Distance from the camera below which points are discarded. :param render_intensity: Whether to render lidar intensity instead of point depth. 
+        :param show_lidarseg: Whether to render lidarseg labels instead of point depth.
+        :param filter_lidarseg_labels: Only show lidar points which belong to the given list of classes. If None
+            or the list is empty, all classes will be displayed.
+        :param lidarseg_preds_bin_path: A path to the .bin file which contains the user's lidar segmentation
+                                        predictions for the sample.
        :return (pointcloud <np.float: 2, n)>, coloring <np.float: n>, image <Image>).
        """
+
        cam = self.nusc.get('sample_data', camera_token)
        pointsensor = self.nusc.get('sample_data', pointsensor_token)
        pcl_path = osp.join(self.nusc.dataroot, pointsensor['filename'])
        if pointsensor['sensor_modality'] == 'lidar':
+            if show_lidarseg:
+                assert hasattr(self.nusc, 'lidarseg'), 'Error: nuScenes-lidarseg not installed!'
+
+                # Ensure that lidar pointcloud is from a keyframe.
+                assert pointsensor['is_key_frame'], \
+                    'Error: Only pointclouds which are keyframes have lidar segmentation labels. Rendering aborted.'
+
+                assert not render_intensity, 'Error: Invalid options selected. You can only select either ' \
+                                             'render_intensity or show_lidarseg, not both.'
+
            pc = LidarPointCloud.from_file(pcl_path)
        else:
            pc = RadarPointCloud.from_file(pcl_path)
        im = Image.open(osp.join(self.nusc.dataroot, cam['filename']))

        # Points live in the point sensor frame. So they need to be transformed via global to the image plane.
-        # First step: transform the point-cloud to the ego vehicle frame for the timestamp of the sweep.
+        # First step: transform the pointcloud to the ego vehicle frame for the timestamp of the sweep.
        cs_record = self.nusc.get('calibrated_sensor', pointsensor['calibrated_sensor_token'])
        pc.rotate(Quaternion(cs_record['rotation']).rotation_matrix)
        pc.translate(np.array(cs_record['translation']))

-        # Second step: transform to the global frame.
+        # Second step: transform from ego to the global frame.
        poserecord = self.nusc.get('ego_pose', pointsensor['ego_pose_token'])
        pc.rotate(Quaternion(poserecord['rotation']).rotation_matrix)
        pc.translate(np.array(poserecord['translation']))

-        # Third step: transform into the ego vehicle frame for the timestamp of the image.
+        # Third step: transform from global into the ego vehicle frame for the timestamp of the image.
        poserecord = self.nusc.get('ego_pose', cam['ego_pose_token'])
        pc.translate(-np.array(poserecord['translation']))
        pc.rotate(Quaternion(poserecord['rotation']).rotation_matrix.T)

-        # Fourth step: transform into the camera.
+        # Fourth step: transform from ego into the camera.
        cs_record = self.nusc.get('calibrated_sensor', cam['calibrated_sensor_token'])
        pc.translate(-np.array(cs_record['translation']))
        pc.rotate(Quaternion(cs_record['rotation']).rotation_matrix.T)
@@ -578,7 +805,8 @@ def map_pointcloud_to_image(self,
        depths = pc.points[2, :]

        if render_intensity:
-            assert pointsensor['sensor_modality'] == 'lidar', 'Error: Can only render intensity for lidar!'
+            assert pointsensor['sensor_modality'] == 'lidar', 'Error: Can only render intensity for lidar, ' \
+                                                              'not %s!' % pointsensor['sensor_modality']
            # Retrieve the color from the intensities.
            # Performs arbitary scaling to achieve more visually pleasing results.
            intensities = pc.points[3, :]
@@ -586,6 +814,31 @@ def map_pointcloud_to_image(self,
            intensities = intensities ** 0.1
            intensities = np.maximum(0, intensities - 0.5)
            coloring = intensities
+        elif show_lidarseg:
+            assert pointsensor['sensor_modality'] == 'lidar', 'Error: Can only render lidarseg labels for lidar, ' \
+                                                              'not %s!' 
% pointsensor['sensor_modality'] + + if lidarseg_preds_bin_path: + sample_token = self.nusc.get('sample_data', pointsensor_token)['sample_token'] + lidarseg_labels_filename = lidarseg_preds_bin_path + assert os.path.exists(lidarseg_labels_filename), \ + 'Error: Unable to find {} to load the predictions for sample token {} (lidar ' \ + 'sample data token {}) from.'.format(lidarseg_labels_filename, sample_token, pointsensor_token) + else: + if len(self.nusc.lidarseg) > 0: # Ensure lidarseg.json is not empty (e.g. in case of v1.0-test). + lidarseg_labels_filename = osp.join(self.nusc.dataroot, + self.nusc.get('lidarseg', pointsensor_token)['filename']) + else: + lidarseg_labels_filename = None + + if lidarseg_labels_filename: + # Paint each label in the pointcloud with a RGBA value. + coloring = paint_points_label(lidarseg_labels_filename, filter_lidarseg_labels, + self.nusc.lidarseg_name2idx_mapping, self.nusc.colormap) + else: + coloring = depths + print('Warning: There are no lidarseg labels in {}. Points will be colored according to distance ' + 'from the ego vehicle instead.'.format(self.nusc.version)) else: # Retrieve the color from the depth. coloring = depths @@ -613,15 +866,29 @@ def render_pointcloud_in_image(self, pointsensor_channel: str = 'LIDAR_TOP', camera_channel: str = 'CAM_FRONT', out_path: str = None, - render_intensity: bool = False) -> None: + render_intensity: bool = False, + show_lidarseg: bool = False, + filter_lidarseg_labels: List = None, + ax: Axes = None, + show_lidarseg_legend: bool = False, + verbose: bool = True, + lidarseg_preds_bin_path: str = None): """ - Scatter-plots a point-cloud on top of image. + Scatter-plots a pointcloud on top of image. :param sample_token: Sample token. :param dot_size: Scatter plot dot size. :param pointsensor_channel: RADAR or LIDAR channel name, e.g. 'LIDAR_TOP'. :param camera_channel: Camera channel name, e.g. 'CAM_FRONT'. :param out_path: Optional path to save the rendered figure to disk. :param render_intensity: Whether to render lidar intensity instead of point depth. + :param show_lidarseg: Whether to render lidarseg labels instead of point depth. + :param filter_lidarseg_labels: Only show lidar points which belong to the given list of classes. + :param ax: Axes onto which to render. + :param show_lidarseg_legend: Whether to display the legend for the lidarseg labels in the frame. + :param verbose: Whether to display the image in a window. + :param lidarseg_preds_bin_path: A path to the .bin file which contains the user's lidar segmentation + predictions for the sample. + """ sample_record = self.nusc.get('sample', sample_token) @@ -630,64 +897,137 @@ def render_pointcloud_in_image(self, camera_token = sample_record['data'][camera_channel] points, coloring, im = self.map_pointcloud_to_image(pointsensor_token, camera_token, - render_intensity=render_intensity) - plt.figure(figsize=(9, 16)) - plt.imshow(im) - plt.scatter(points[0, :], points[1, :], c=coloring, s=dot_size) - plt.axis('off') + render_intensity=render_intensity, + show_lidarseg=show_lidarseg, + filter_lidarseg_labels=filter_lidarseg_labels, + lidarseg_preds_bin_path=lidarseg_preds_bin_path) + + # Init axes. + if ax is None: + fig, ax = plt.subplots(1, 1, figsize=(9, 16)) + if lidarseg_preds_bin_path: + fig.canvas.set_window_title(sample_token + '(predictions)') + else: + fig.canvas.set_window_title(sample_token) + else: # Set title on if rendering as part of render_sample. 
+ ax.set_title(camera_channel) + ax.imshow(im) + ax.scatter(points[0, :], points[1, :], c=coloring, s=dot_size) + ax.axis('off') + + # Produce a legend with the unique colors from the scatter. + if pointsensor_channel == 'LIDAR_TOP' and show_lidarseg and show_lidarseg_legend: + # Since the labels are stored as class indices, we get the RGB colors from the colormap in an array where + # the position of the RGB color corresponds to the index of the class it represents. + color_legend = colormap_to_colors(self.nusc.colormap, self.nusc.lidarseg_name2idx_mapping) + + # If user does not specify a filter, then set the filter to contain the classes present in the pointcloud + # after it has been projected onto the image; this will allow displaying the legend only for classes which + # are present in the image (instead of all the classes). + if filter_lidarseg_labels is None: + filter_lidarseg_labels = get_labels_in_coloring(color_legend, coloring) + + create_lidarseg_legend(filter_lidarseg_labels, + self.nusc.lidarseg_idx2name_mapping, self.nusc.colormap) if out_path is not None: - plt.savefig(out_path) + plt.savefig(out_path, bbox_inches='tight', pad_inches=0, dpi=200) + if verbose: + plt.show() def render_sample(self, token: str, box_vis_level: BoxVisibility = BoxVisibility.ANY, nsweeps: int = 1, - out_path: str = None) -> None: + out_path: str = None, + show_lidarseg: bool = False, + filter_lidarseg_labels: List = None, + lidarseg_preds_bin_path: str = None, + verbose: bool = True) -> None: """ Render all LIDAR and camera sample_data in sample along with annotations. :param token: Sample token. :param box_vis_level: If sample_data is an image, this sets required visibility for boxes. :param nsweeps: Number of sweeps for lidar and radar. :param out_path: Optional path to save the rendered figure to disk. + :param show_lidarseg: Whether to show lidar segmentations labels or not. + :param filter_lidarseg_labels: Only show lidar points which belong to the given list of classes. + :param lidarseg_preds_bin_path: A path to the .bin file which contains the user's lidar segmentation + predictions for the sample. + :param verbose: Whether to show the rendered sample in a window or not. """ record = self.nusc.get('sample', token) # Separate RADAR from LIDAR and vision. radar_data = {} - nonradar_data = {} + camera_data = {} + lidar_data = {} for channel, token in record['data'].items(): sd_record = self.nusc.get('sample_data', token) sensor_modality = sd_record['sensor_modality'] - if sensor_modality in ['lidar', 'camera']: - nonradar_data[channel] = token + + if sensor_modality == 'camera': + camera_data[channel] = token + elif sensor_modality == 'lidar': + lidar_data[channel] = token else: radar_data[channel] = token # Create plots. num_radar_plots = 1 if len(radar_data) > 0 else 0 - n = num_radar_plots + len(nonradar_data) + num_lidar_plots = 1 if len(lidar_data) > 0 else 0 + n = num_radar_plots + len(camera_data) + num_lidar_plots cols = 2 - fig, axes = plt.subplots(int(np.ceil(n/cols)), cols, figsize=(16, 24)) + fig, axes = plt.subplots(int(np.ceil(n / cols)), cols, figsize=(16, 24)) # Plot radars into a single subplot. 
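+        # All radar channels are drawn into one shared axis; boxes are only drawn for the first channel
+        # to avoid duplicate annotations.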
if len(radar_data) > 0: ax = axes[0, 0] for i, (_, sd_token) in enumerate(radar_data.items()): - self.render_sample_data(sd_token, with_anns=i == 0, box_vis_level=box_vis_level, ax=ax, nsweeps=nsweeps) + self.render_sample_data(sd_token, with_anns=i == 0, box_vis_level=box_vis_level, ax=ax, nsweeps=nsweeps, + verbose=False) ax.set_title('Fused RADARs') - # Plot camera and lidar in separate subplots. - for (_, sd_token), ax in zip(nonradar_data.items(), axes.flatten()[num_radar_plots:]): - self.render_sample_data(sd_token, box_vis_level=box_vis_level, ax=ax, nsweeps=nsweeps) + # Plot lidar into a single subplot. + if len(lidar_data) > 0: + for (_, sd_token), ax in zip(lidar_data.items(), axes.flatten()[num_radar_plots:]): + self.render_sample_data(sd_token, box_vis_level=box_vis_level, ax=ax, nsweeps=nsweeps, + show_lidarseg=show_lidarseg, + filter_lidarseg_labels=filter_lidarseg_labels, + lidarseg_preds_bin_path=lidarseg_preds_bin_path, + verbose=False) + + # Plot cameras in separate subplots. + for (_, sd_token), ax in zip(camera_data.items(), axes.flatten()[num_radar_plots + num_lidar_plots:]): + if not show_lidarseg: + self.render_sample_data(sd_token, box_vis_level=box_vis_level, ax=ax, nsweeps=nsweeps, + show_lidarseg=False, verbose=False) + else: + sd_record = self.nusc.get('sample_data', sd_token) + sensor_channel = sd_record['channel'] + valid_channels = ['CAM_FRONT_LEFT', 'CAM_FRONT', 'CAM_FRONT_RIGHT', + 'CAM_BACK_LEFT', 'CAM_BACK', 'CAM_BACK_RIGHT'] + assert sensor_channel in valid_channels, 'Input camera channel {} not valid.'.format(sensor_channel) + + self.render_pointcloud_in_image(record['token'], + pointsensor_channel='LIDAR_TOP', + camera_channel=sensor_channel, + show_lidarseg=show_lidarseg, + filter_lidarseg_labels=filter_lidarseg_labels, + ax=ax, verbose=False, + lidarseg_preds_bin_path=lidarseg_preds_bin_path) # Change plot settings and write to disk. axes.flatten()[-1].axis('off') plt.tight_layout() fig.subplots_adjust(wspace=0, hspace=0) + if out_path is not None: plt.savefig(out_path) + if verbose: + plt.show() + def render_ego_centric_map(self, sample_data_token: str, axes_limit: float = 40, @@ -732,7 +1072,7 @@ def crop_image(image: np.array, yaw_deg = -math.degrees(ypr_rad[0]) rotated_cropped = np.array(Image.fromarray(cropped).rotate(yaw_deg)) - # Cop image. + # Crop image. ego_centric_map = crop_image(rotated_cropped, rotated_cropped.shape[1] / 2, rotated_cropped.shape[0] / 2, scaled_limit_px) @@ -755,21 +1095,34 @@ def render_sample_data(self, nsweeps: int = 1, out_path: str = None, underlay_map: bool = True, - use_flat_vehicle_coordinates: bool = True) -> None: + use_flat_vehicle_coordinates: bool = True, + show_lidarseg: bool = False, + show_lidarseg_legend: bool = False, + filter_lidarseg_labels: List = None, + lidarseg_preds_bin_path: str = None, + verbose: bool = True) -> None: """ Render sample data onto axis. :param sample_data_token: Sample_data token. - :param with_anns: Whether to draw annotations. + :param with_anns: Whether to draw box annotations. :param box_vis_level: If sample_data is an image, this sets required visibility for boxes. :param axes_limit: Axes limit for lidar and radar (measured in meters). :param ax: Axes onto which to render. :param nsweeps: Number of sweeps for lidar and radar. :param out_path: Optional path to save the rendered figure to disk. - :param underlay_map: When set to true, LIDAR data is plotted onto the map. This can be slow. + :param underlay_map: When set to true, lidar data is plotted onto the map. 
This can be slow. :param use_flat_vehicle_coordinates: Instead of the current sensor's coordinate frame, use ego frame which is aligned to z-plane in the world. Note: Previously this method did not use flat vehicle coordinates, which can lead to small errors when the vertical axis of the global frame and lidar are not aligned. The new setting is more correct and rotates the plot by ~90 degrees. + :param show_lidarseg: When set to True, the lidar data is colored with the segmentation labels. When set + to False, the colors of the lidar data represent the distance from the center of the ego vehicle. + :param show_lidarseg_legend: Whether to display the legend for the lidarseg labels in the frame. + :param filter_lidarseg_labels: Only show lidar points which belong to the given list of classes. If None + or the list is empty, all classes will be displayed. + :param lidarseg_preds_bin_path: A path to the .bin file which contains the user's lidar segmentation + predictions for the sample. + :param verbose: Whether to display the image after it is rendered. """ # Get sensor modality. sd_record = self.nusc.get('sample_data', sample_data_token) @@ -783,8 +1136,24 @@ def render_sample_data(self, ref_sd_record = self.nusc.get('sample_data', ref_sd_token) if sensor_modality == 'lidar': - # Get aggregated lidar point cloud in lidar frame. - pc, times = LidarPointCloud.from_file_multisweep(self.nusc, sample_rec, chan, ref_chan, nsweeps=nsweeps) + if show_lidarseg: + assert hasattr(self.nusc, 'lidarseg'), 'Error: nuScenes-lidarseg not installed!' + + # Ensure that lidar pointcloud is from a keyframe. + assert sd_record['is_key_frame'], \ + 'Error: Only pointclouds which are keyframes have lidar segmentation labels. Rendering aborted.' + + assert nsweeps == 1, \ + 'Error: Only pointclouds which are keyframes have lidar segmentation labels; nsweeps should ' \ + 'be set to 1.' + + # Load a single lidar point cloud. + pcl_path = osp.join(self.nusc.dataroot, ref_sd_record['filename']) + pc = LidarPointCloud.from_file(pcl_path) + else: + # Get aggregated lidar point cloud in lidar frame. + pc, times = LidarPointCloud.from_file_multisweep(self.nusc, sample_rec, chan, ref_chan, + nsweeps=nsweeps) velocities = None else: # Get aggregated radar point cloud in reference frame. @@ -835,8 +1204,49 @@ def render_sample_data(self, # Show point cloud. points = view_points(pc.points[:3, :], viewpoint, normalize=False) dists = np.sqrt(np.sum(pc.points[:2, :] ** 2, axis=0)) - colors = np.minimum(1, dists / axes_limit / np.sqrt(2)) + if sensor_modality == 'lidar' and show_lidarseg: + # Load labels for pointcloud. + if lidarseg_preds_bin_path: + sample_token = self.nusc.get('sample_data', sample_data_token)['sample_token'] + lidarseg_labels_filename = lidarseg_preds_bin_path + assert os.path.exists(lidarseg_labels_filename), \ + 'Error: Unable to find {} to load the predictions for sample token {} (lidar ' \ + 'sample data token {}) from.'.format(lidarseg_labels_filename, sample_token, sample_data_token) + else: + if len(self.nusc.lidarseg) > 0: # Ensure lidarseg.json is not empty (e.g. in case of v1.0-test). + lidarseg_labels_filename = osp.join(self.nusc.dataroot, + self.nusc.get('lidarseg', sample_data_token)['filename']) + else: + lidarseg_labels_filename = None + + if lidarseg_labels_filename: + # Paint each label in the pointcloud with a RGBA value. 
+ colors = paint_points_label(lidarseg_labels_filename, filter_lidarseg_labels, + self.nusc.lidarseg_name2idx_mapping, self.nusc.colormap) + + if show_lidarseg_legend: + # Since the labels are stored as class indices, we get the RGB colors from the colormap + # in an array where the position of the RGB color corresponds to the index of the class + # it represents. + color_legend = colormap_to_colors(self.nusc.colormap, self.nusc.lidarseg_name2idx_mapping) + + # If user does not specify a filter, then set the filter to contain the classes present in + # the pointcloud after it has been projected onto the image; this will allow displaying the + # legend only for classes which are present in the image (instead of all the classes). + if filter_lidarseg_labels is None: + filter_lidarseg_labels = get_labels_in_coloring(color_legend, colors) + + create_lidarseg_legend(filter_lidarseg_labels, + self.nusc.lidarseg_idx2name_mapping, self.nusc.colormap, + loc='upper left', ncol=1, bbox_to_anchor=(1.05, 1.0)) + else: + colors = np.minimum(1, dists / axes_limit / np.sqrt(2)) + print('Warning: There are no lidarseg labels in {}. Points will be colored according to distance ' + 'from the ego vehicle instead.'.format(self.nusc.version)) + else: + colors = np.minimum(1, dists / axes_limit / np.sqrt(2)) point_scale = 0.2 if sensor_modality == 'lidar' else 3.0 + scatter = ax.scatter(points[0, :], points[1, :], c=colors, s=point_scale) # Show velocities. @@ -866,7 +1276,6 @@ def render_sample_data(self, # Limit visible range. ax.set_xlim(-axes_limit, axes_limit) ax.set_ylim(-axes_limit, axes_limit) - elif sensor_modality == 'camera': # Load boxes and image. data_path, boxes, camera_intrinsic = self.nusc.get_sample_data(sample_data_token, @@ -894,11 +1303,15 @@ def render_sample_data(self, raise ValueError("Error: Unknown sensor modality!") ax.axis('off') - ax.set_title(sd_record['channel']) + ax.set_title('{} {labels_type}'.format( + sd_record['channel'], labels_type='(predictions)' if lidarseg_preds_bin_path else '')) ax.set_aspect('equal') if out_path is not None: - plt.savefig(out_path) + plt.savefig(out_path, bbox_inches='tight', pad_inches=0, dpi=200) + + if verbose: + plt.show() def render_annotation(self, anntoken: str, @@ -918,11 +1331,11 @@ def render_annotation(self, """ ann_record = self.nusc.get('sample_annotation', anntoken) sample_record = self.nusc.get('sample', ann_record['sample_token']) - assert 'LIDAR_TOP' in sample_record['data'].keys(), 'No LIDAR_TOP in data, cant render' + assert 'LIDAR_TOP' in sample_record['data'].keys(), 'Error: No LIDAR_TOP in data, unable to render.' fig, axes = plt.subplots(1, 2, figsize=(18, 9)) - # Figure out which camera the object is fully visible in (this may return nothing) + # Figure out which camera the object is fully visible in (this may return nothing). boxes, cam = [], [] cams = [key for key in sample_record['data'].keys() if 'CAM' in key] for cam in cams: @@ -930,12 +1343,13 @@ def render_annotation(self, selected_anntokens=[anntoken]) if len(boxes) > 0: break # We found an image that matches. Let's abort. - assert len(boxes) > 0, "Could not find image where annotation is visible. Try using e.g. BoxVisibility.ANY." - assert len(boxes) < 2, "Found multiple annotations. Something is wrong!" + assert len(boxes) > 0, 'Error: Could not find image where annotation is visible. ' \ + 'Try using e.g. BoxVisibility.ANY.' + assert len(boxes) < 2, 'Error: Found multiple annotations. Something is wrong!' 
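Tying the `render_sample_data` changes above together, here is a minimal sketch of rendering the lidarseg labels in the bird's eye view. As the assertions above require, the sample_data must be a lidar keyframe and `nsweeps` must stay at 1; the split, dataroot and output filename are illustrative.
```
from nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=False)

# Lidar tokens taken from a sample record are keyframes, which is required for lidarseg rendering.
lidar_token = nusc.sample[0]['data']['LIDAR_TOP']

nusc.render_sample_data(lidar_token,
                        with_anns=False,
                        show_lidarseg=True,
                        show_lidarseg_legend=True,
                        nsweeps=1,  # lidarseg labels exist for single keyframe sweeps only
                        verbose=False,
                        out_path='lidarseg_bev.png')  # illustrative output path
```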
cam = sample_record['data'][cam] - # Plot LIDAR view + # Plot LIDAR view. lidar = sample_record['data']['LIDAR_TOP'] data_path, boxes, camera_intrinsic = self.nusc.get_sample_data(lidar, selected_anntokens=[anntoken]) LidarPointCloud.from_file(data_path).render_height(axes[0], view=view) @@ -948,7 +1362,7 @@ def render_annotation(self, axes[0].axis('off') axes[0].set_aspect('equal') - # Plot CAMERA view + # Plot CAMERA view. data_path, boxes, camera_intrinsic = self.nusc.get_sample_data(cam, selected_anntokens=[anntoken]) im = Image.open(data_path) axes[1].imshow(im) @@ -1041,7 +1455,7 @@ def render_scene(self, first_sample_rec = self.nusc.get('sample', scene_rec['first_sample_token']) last_sample_rec = self.nusc.get('sample', scene_rec['last_sample_token']) - # Set some display parameters + # Set some display parameters. layout = { 'CAM_FRONT_LEFT': (0, 0), 'CAM_FRONT': (imsize[0], 0), @@ -1066,7 +1480,7 @@ def render_scene(self, else: out = None - # Load first sample_data record for each channel + # Load first sample_data record for each channel. current_recs = {} # Holds the current record to be displayed by channel. prev_recs = {} # Hold the previous displayed record by channel. for channel in layout: @@ -1095,7 +1509,7 @@ def render_scene(self, impath, boxes, camera_intrinsic = self.nusc.get_sample_data(sd_rec['token'], box_vis_level=BoxVisibility.ANY) - # Load and render + # Load and render. if not osp.exists(impath): raise Exception('Error: Missing image %s' % impath) im = cv2.imread(impath) @@ -1107,8 +1521,10 @@ def render_scene(self, if channel in horizontal_flip: im = im[:, ::-1, :] - canvas[layout[channel][1]: layout[channel][1] + imsize[1], - layout[channel][0]:layout[channel][0] + imsize[0], :] = im + canvas[ + layout[channel][1]: layout[channel][1] + imsize[1], + layout[channel][0]:layout[channel][0] + imsize[0], : + ] = im prev_recs[channel] = sd_rec # Store here so we don't render the same image twice. @@ -1144,22 +1560,21 @@ def render_scene_channel(self, :param imsize: Size of image to render. The larger the slower this will run. :param out_path: Optional path to write a video file of the rendered frames. """ - valid_channels = ['CAM_FRONT_LEFT', 'CAM_FRONT', 'CAM_FRONT_RIGHT', 'CAM_BACK_LEFT', 'CAM_BACK', 'CAM_BACK_RIGHT'] - assert imsize[0] / imsize[1] == 16 / 9, "Aspect ratio should be 16/9." - assert channel in valid_channels, 'Input channel {} not valid.'.format(channel) + assert imsize[0] / imsize[1] == 16 / 9, "Error: Aspect ratio should be 16/9." + assert channel in valid_channels, 'Error: Input channel {} not valid.'.format(channel) if out_path is not None: assert osp.splitext(out_path)[-1] == '.avi' - # Get records from DB + # Get records from DB. scene_rec = self.nusc.get('scene', scene_token) sample_rec = self.nusc.get('sample', scene_rec['first_sample_token']) sd_rec = self.nusc.get('sample_data', sample_rec['data'][channel]) - # Open CV init + # Open CV init. name = '{}: {} (Space to pause, ESC to exit)'.format(scene_rec['name'], channel) cv2.namedWindow(name) cv2.moveWindow(name, 0, 0) @@ -1173,11 +1588,11 @@ def render_scene_channel(self, has_more_frames = True while has_more_frames: - # Get data from DB + # Get data from DB. impath, boxes, camera_intrinsic = self.nusc.get_sample_data(sd_rec['token'], box_vis_level=BoxVisibility.ANY) - # Load and render + # Load and render. 
if not osp.exists(impath): raise Exception('Error: Missing image %s' % impath) im = cv2.imread(impath) @@ -1185,7 +1600,7 @@ def render_scene_channel(self, c = self.get_color(box.name) box.render_cv2(im, view=camera_intrinsic, normalize=True, colors=(c, c, c)) - # Render + # Render. im = cv2.resize(im, imsize) cv2.imshow(name, im) if out_path is not None: @@ -1195,7 +1610,7 @@ def render_scene_channel(self, if key == 32: # If space is pressed, pause. key = cv2.waitKey() - if key == 27: # if ESC is pressed, exit + if key == 27: # If ESC is pressed, exit. cv2.destroyAllWindows() break @@ -1225,11 +1640,11 @@ def render_egoposes_on_map(self, :param color_bg: Color of the non-semantic prior in RGB format (ignored if map is RGB). :param out_path: Optional path to save the rendered figure to disk. """ - # Get logs by location - log_tokens = [l['token'] for l in self.nusc.log if l['location'] == log_location] + # Get logs by location. + log_tokens = [log['token'] for log in self.nusc.log if log['location'] == log_location] assert len(log_tokens) > 0, 'Error: This split has 0 scenes for location %s!' % log_location - # Filter scenes + # Filter scenes. scene_tokens_location = [e['token'] for e in self.nusc.scene if e['log_token'] in log_tokens] if scene_tokens is not None: scene_tokens_location = [t for t in scene_tokens_location if t in scene_tokens] @@ -1257,7 +1672,7 @@ def render_egoposes_on_map(self, sample_data_record = self.nusc.get('sample_data', sample_record['data']['LIDAR_TOP']) pose_record = self.nusc.get('ego_pose', sample_data_record['ego_pose_token']) - # Calculate the pose on the map and append + # Calculate the pose on the map and append. map_poses.append(np.concatenate( map_mask.to_pixel_coords(pose_record['translation'][0], pose_record['translation'][1]))) @@ -1300,3 +1715,357 @@ def render_egoposes_on_map(self, if out_path is not None: plt.savefig(out_path) + + def _plot_points_and_bboxes(self, + pointsensor_token: str, + camera_token: str, + filter_lidarseg_labels: Iterable[int] = None, + lidarseg_preds_bin_path: str = None, + with_anns: bool = False, + imsize: Tuple[int, int] = (640, 360), + dpi: int = 100, + line_width: int = 5) -> Tuple[np.ndarray, bool]: + """ + Projects a pointcloud into a camera image along with the lidarseg labels. There is an option to plot the + bounding boxes as well. + :param pointsensor_token: Token of lidar sensor to render points from and lidarseg labels. + :param camera_token: Token of camera to render image from. + :param filter_lidarseg_labels: Only show lidar points which belong to the given list of classes. If None + or the list is empty, all classes will be displayed. + :param lidarseg_preds_bin_path: A path to the .bin file which contains the user's lidar segmentation + predictions for the sample. + :param with_anns: Whether to draw box annotations. + :param imsize: Size of image to render. The larger the slower this will run. + :param dpi: Resolution of the output figure. + :param line_width: Line width of bounding boxes. + :return: An image with the projected pointcloud, lidarseg labels and (if applicable) the bounding boxes. Also, + whether there are any lidarseg points (after the filter has been applied) in the image. + """ + points, coloring, im = self.map_pointcloud_to_image(pointsensor_token, camera_token, + render_intensity=False, + show_lidarseg=True, + filter_lidarseg_labels=filter_lidarseg_labels, + lidarseg_preds_bin_path=lidarseg_preds_bin_path) + + # Prevent rendering images which have no lidarseg labels in them (e.g. 
the classes in the filter chosen by + # the users do not appear within the image). To check if there are no lidarseg labels belonging to the desired + # classes in an image, we check if any column in the coloring is all zeros (the alpha column will be all + # zeroes if so). + if (~coloring.any(axis=0)).any(): + no_points_in_im = True + else: + no_points_in_im = False + + if with_anns: + # Get annotations and params from DB. + impath, boxes, camera_intrinsic = self.nusc.get_sample_data(camera_token, box_vis_level=BoxVisibility.ANY) + + # We need to get the image's original height and width as the boxes returned by get_sample_data + # are scaled wrt to that. + h, w, c = cv2.imread(impath).shape + + # Place the projected pointcloud and lidarseg labels onto the image. + mat = plt_to_cv2(points, coloring, im, (w, h), dpi=dpi) + + # Plot each box onto the image. + for box in boxes: + # If a filter is set, and the class of the box is not among the classes that the user wants to see, + # then we skip plotting the box. + if filter_lidarseg_labels is not None and \ + self.nusc.lidarseg_name2idx_mapping[box.name] not in filter_lidarseg_labels: + continue + c = self.get_color(box.name) + box.render_cv2(mat, view=camera_intrinsic, normalize=True, colors=(c, c, c), linewidth=line_width) + + # Only after points and boxes have been placed in the image, then we resize (this is to prevent + # weird scaling issues where the dots and boxes are not of the same scale). + mat = cv2.resize(mat, imsize) + else: + mat = plt_to_cv2(points, coloring, im, imsize, dpi=dpi) + + return mat, no_points_in_im + + def render_scene_channel_lidarseg(self, + scene_token: str, + channel: str, + out_folder: str = None, + filter_lidarseg_labels: Iterable[int] = None, + render_mode: str = None, + verbose: bool = True, + imsize: Tuple[int, int] = (640, 360), + with_anns: bool = False, + freq: float = 2, + dpi: int = 150, + lidarseg_preds_folder: str = None) -> None: + """ + Renders a full scene with labelled lidar pointclouds for a particular camera channel. + The scene can be rendered either to a video or to a set of images. + :param scene_token: Unique identifier of scene to render. + :param channel: Camera channel to render. + :param out_folder: Optional path to save the rendered frames to disk, either as a video or as individual images. + :param filter_lidarseg_labels: Only show lidar points which belong to the given list of classes. If None + or the list is empty, all classes will be displayed. + :param render_mode: Either 'video' or 'image'. 'video' will render the frames into a video (the name of the + video will follow this format: _.avi) while 'image' will + render the frames into individual images (each image name wil follow this format: + __.jpg). 'out_folder' must be specified + to save the video / images. + :param verbose: Whether to show the frames as they are being rendered. + :param imsize: Size of image to render. The larger the slower this will run. + :param with_anns: Whether to draw box annotations. + :param freq: Display frequency (Hz). + :param dpi: Resolution of the output dots. + :param lidarseg_preds_folder: A path to the folder which contains the user's lidar segmentation predictions for + the scene. The naming convention of each .bin file in the folder should be + named in this format: _lidarseg.bin. + """ + + assert hasattr(self.nusc, 'lidarseg'), 'Error: nuScenes-lidarseg not installed!' 
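The `lidarseg_preds_folder` argument described above expects one `.bin` file per keyframe, named by joining the lidar sample_data token with the `_lidarseg.bin` suffix (as done further down in this method). The sketch below shows one way to produce such a folder and render it; it assumes, as for the ground-truth files, one `uint8` class index per lidar point, and uses placeholder predictions that label every point as `vehicle.car`.
```
import os

import numpy as np

from nuscenes import NuScenes
from nuscenes.utils.data_classes import LidarPointCloud

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=False)
my_scene = nusc.scene[0]

preds_folder = os.path.abspath('lidarseg_preds')     # illustrative folder for the .bin files
out_folder = os.path.abspath('lidarseg_renderings')  # illustrative folder for the rendered frames
os.makedirs(preds_folder, exist_ok=True)
os.makedirs(out_folder, exist_ok=True)

# Write one prediction file per keyframe: <lidar_sample_data_token>_lidarseg.bin.
car_idx = nusc.lidarseg_name2idx_mapping['vehicle.car']
sample_token = my_scene['first_sample_token']
while sample_token:
    sample = nusc.get('sample', sample_token)
    lidar_token = sample['data']['LIDAR_TOP']
    lidar_path = os.path.join(nusc.dataroot, nusc.get('sample_data', lidar_token)['filename'])
    num_points = LidarPointCloud.from_file(lidar_path).points.shape[1]
    labels = np.full(num_points, car_idx, dtype=np.uint8)  # placeholder predictions
    labels.tofile(os.path.join(preds_folder, lidar_token + '_lidarseg.bin'))
    sample_token = sample['next']

# Render the scene for one camera using the predictions instead of the ground truth.
nusc.render_scene_channel_lidarseg(my_scene['token'],
                                   'CAM_FRONT',
                                   lidarseg_preds_folder=preds_folder,
                                   render_mode='image',
                                   out_folder=out_folder,
                                   verbose=False)
```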
+ + valid_channels = ['CAM_FRONT_LEFT', 'CAM_FRONT', 'CAM_FRONT_RIGHT', + 'CAM_BACK_LEFT', 'CAM_BACK', 'CAM_BACK_RIGHT'] + assert channel in valid_channels, 'Error: Input camera channel {} not valid.'.format(channel) + assert imsize[0] / imsize[1] == 16 / 9, 'Error: Aspect ratio should be 16/9.' + + if lidarseg_preds_folder: + assert(os.path.isdir(lidarseg_preds_folder)), \ + 'Error: The lidarseg predictions folder ({}) does not exist.'.format(lidarseg_preds_folder) + + save_as_vid = False + if out_folder: + assert render_mode in ['video', 'image'], 'Error: For the renderings to be saved to {}, either `video` ' \ + 'or `image` must be specified for render_mode. {} is ' \ + 'not a valid mode.'.format(out_folder, render_mode) + assert os.path.isdir(out_folder), 'Error: {} does not exist.'.format(out_folder) + if render_mode == 'video': + save_as_vid = True + + scene_record = self.nusc.get('scene', scene_token) + + total_num_samples = scene_record['nbr_samples'] + first_sample_token = scene_record['first_sample_token'] + last_sample_token = scene_record['last_sample_token'] + + current_token = first_sample_token + keep_looping = True + i = 0 + + # Open CV init. + if verbose: + name = '{}: {} {labels_type} (Space to pause, ESC to exit)'.format( + scene_record['name'], channel, labels_type="(predictions)" if lidarseg_preds_folder else "") + cv2.namedWindow(name) + cv2.moveWindow(name, 0, 0) + else: + name = None + + if save_as_vid: + out_path = os.path.join(out_folder, scene_record['name'] + '_' + channel + '.avi') + fourcc = cv2.VideoWriter_fourcc(*'MJPG') + out = cv2.VideoWriter(out_path, fourcc, freq, imsize) + else: + out = None + + while keep_looping: + if current_token == last_sample_token: + keep_looping = False + + sample_record = self.nusc.get('sample', current_token) + + # Set filename of the image. + camera_token = sample_record['data'][channel] + cam = self.nusc.get('sample_data', camera_token) + filename = scene_record['name'] + '_' + channel + '_' + os.path.basename(cam['filename']) + + # Determine whether to render lidarseg points from ground truth or predictions. + pointsensor_token = sample_record['data']['LIDAR_TOP'] + if lidarseg_preds_folder: + lidarseg_preds_bin_path = osp.join(lidarseg_preds_folder, pointsensor_token + '_lidarseg.bin') + else: + lidarseg_preds_bin_path = None + + mat, no_points_in_mat = self._plot_points_and_bboxes(pointsensor_token, camera_token, + filter_lidarseg_labels=filter_lidarseg_labels, + lidarseg_preds_bin_path=lidarseg_preds_bin_path, + with_anns=with_anns, imsize=imsize, + dpi=dpi, line_width=2) + + if verbose: + cv2.imshow(name, mat) + + key = cv2.waitKey(1) + if key == 32: # If space is pressed, pause. + key = cv2.waitKey() + + if key == 27: # if ESC is pressed, exit. + plt.close('all') # To prevent figures from accumulating in memory. + # If rendering is stopped halfway, save whatever has been rendered so far into a video + # (if save_as_vid = True). + if save_as_vid: + out.write(mat) + out.release() + cv2.destroyAllWindows() + break + + plt.close('all') # To prevent figures from accumulating in memory. 
+ + if save_as_vid: + out.write(mat) + elif not no_points_in_mat and out_folder: + cv2.imwrite(os.path.join(out_folder, filename), mat) + else: + pass + + next_token = sample_record['next'] + current_token = next_token + i += 1 + + cv2.destroyAllWindows() + + if save_as_vid: + assert total_num_samples == i, 'Error: There were supposed to be {} keyframes, ' \ + 'but only {} keyframes were processed'.format(total_num_samples, i) + out.release() + + def render_scene_lidarseg(self, + scene_token: str, + out_path: str = None, + filter_lidarseg_labels: Iterable[int] = None, + with_anns: bool = False, + imsize: Tuple[int, int] = (640, 360), + freq: float = 2, + verbose: bool = True, + dpi: int = 200, + lidarseg_preds_folder: str = None) -> None: + """ + Renders a full scene with all camera channels and the lidar segmentation labels for each camera. + The scene can be rendered either to a video or to a set of images. + :param scene_token: Unique identifier of scene to render. + :param out_path: Optional path to write a video file (must be .avi) of the rendered frames + (e.g. '~/Desktop/my_rendered_scene.avi), + :param filter_lidarseg_labels: Only show lidar points which belong to the given list of classes. If None + or the list is empty, all classes will be displayed. + :param with_anns: Whether to draw box annotations. + :param freq: Display frequency (Hz). + :param imsize: Size of image to render. The larger the slower this will run. + :param verbose: Whether to show the frames as they are being rendered. + :param dpi: Resolution of the output dots. + :param lidarseg_preds_folder: A path to the folder which contains the user's lidar segmentation predictions for + the scene. The naming convention of each .bin file in the folder should be + named in this format: _lidarseg.bin. + """ + assert hasattr(self.nusc, 'lidarseg'), 'Error: nuScenes-lidarseg not installed!' + + assert imsize[0] / imsize[1] == 16 / 9, "Aspect ratio should be 16/9." + + if lidarseg_preds_folder: + assert(os.path.isdir(lidarseg_preds_folder)), \ + 'Error: The lidarseg predictions folder ({}) does not exist.'.format(lidarseg_preds_folder) + + # Get records from DB. + scene_record = self.nusc.get('scene', scene_token) + + total_num_samples = scene_record['nbr_samples'] + first_sample_token = scene_record['first_sample_token'] + last_sample_token = scene_record['last_sample_token'] + + current_token = first_sample_token + + # Set some display parameters. + layout = { + 'CAM_FRONT_LEFT': (0, 0), + 'CAM_FRONT': (imsize[0], 0), + 'CAM_FRONT_RIGHT': (2 * imsize[0], 0), + 'CAM_BACK_LEFT': (0, imsize[1]), + 'CAM_BACK': (imsize[0], imsize[1]), + 'CAM_BACK_RIGHT': (2 * imsize[0], imsize[1]), + } + + horizontal_flip = ['CAM_BACK_LEFT', 'CAM_BACK', 'CAM_BACK_RIGHT'] # Flip these for aesthetic reasons. + + if verbose: + window_name = '{} {labels_type} (Space to pause, ESC to exit)'.format( + scene_record['name'], labels_type="(predictions)" if lidarseg_preds_folder else "") + cv2.namedWindow(window_name) + cv2.moveWindow(window_name, 0, 0) + else: + window_name = None + + slate = np.ones((2 * imsize[1], 3 * imsize[0], 3), np.uint8) + + if out_path: + path_to_file, filename = os.path.split(out_path) + assert os.path.isdir(path_to_file), 'Error: {} does not exist.'.format(path_to_file) + assert os.path.splitext(filename)[-1] == '.avi', 'Error: Video can only be saved in .avi format.' 
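For completeness, a minimal sketch of calling `render_scene_lidarseg`, which tiles all six cameras into one canvas. As the checks above require, the directory of `out_path` must already exist and the file extension must be `.avi`; the dataroot, split and output location are illustrative.
```
import os

from nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=False)
my_scene = nusc.scene[0]

# Write an .avi into the current working directory (which is guaranteed to exist).
out_path = os.path.join(os.getcwd(), 'scene_lidarseg.avi')
nusc.render_scene_lidarseg(my_scene['token'],
                           out_path=out_path,
                           filter_lidarseg_labels=None,  # None (or an empty list) shows all classes
                           with_anns=True,
                           imsize=(640, 360),
                           verbose=False)
```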
+ fourcc = cv2.VideoWriter_fourcc(*'MJPG') + out = cv2.VideoWriter(out_path, fourcc, freq, slate.shape[1::-1]) + else: + out = None + + keep_looping = True + i = 0 + while keep_looping: + if current_token == last_sample_token: + keep_looping = False + + sample_record = self.nusc.get('sample', current_token) + + for camera_channel in layout: + pointsensor_token = sample_record['data']['LIDAR_TOP'] + camera_token = sample_record['data'][camera_channel] + + # Determine whether to render lidarseg points from ground truth or predictions. + if lidarseg_preds_folder: + lidarseg_preds_bin_path = osp.join(lidarseg_preds_folder, pointsensor_token + '_lidarseg.bin') + else: + lidarseg_preds_bin_path = None + + mat, _ = self._plot_points_and_bboxes(pointsensor_token, camera_token, + filter_lidarseg_labels=filter_lidarseg_labels, + lidarseg_preds_bin_path=lidarseg_preds_bin_path, + with_anns=with_anns, imsize=imsize, dpi=dpi, line_width=3) + + if camera_channel in horizontal_flip: + # Flip image horizontally. + mat = cv2.flip(mat, 1) + + slate[ + layout[camera_channel][1]: layout[camera_channel][1] + imsize[1], + layout[camera_channel][0]:layout[camera_channel][0] + imsize[0], : + ] = mat + + if verbose: + cv2.imshow(window_name, slate) + + key = cv2.waitKey(1) + if key == 32: # If space is pressed, pause. + key = cv2.waitKey() + + if key == 27: # if ESC is pressed, exit. + plt.close('all') # To prevent figures from accumulating in memory. + # If rendering is stopped halfway, save whatever has been rendered so far into a video + # (if save_as_vid = True). + if out_path: + out.write(slate) + out.release() + cv2.destroyAllWindows() + break + + plt.close('all') # To prevent figures from accumulating in memory. + + if out_path: + out.write(slate) + else: + pass + + next_token = sample_record['next'] + current_token = next_token + + i += 1 + + cv2.destroyAllWindows() + + if out_path: + assert total_num_samples == i, 'Error: There were supposed to be {} keyframes, ' \ + 'but only {} keyframes were processed'.format(total_num_samples, i) + out.release() diff --git a/python-sdk/nuscenes/prediction/input_representation/static_layers.py b/python-sdk/nuscenes/prediction/input_representation/static_layers.py index 881a38ca..83c8b330 100644 --- a/python-sdk/nuscenes/prediction/input_representation/static_layers.py +++ b/python-sdk/nuscenes/prediction/input_representation/static_layers.py @@ -20,10 +20,11 @@ Color = Tuple[float, float, float] -def load_all_maps(helper: PredictHelper) -> Dict[str, NuScenesMap]: +def load_all_maps(helper: PredictHelper, verbose: bool = False) -> Dict[str, NuScenesMap]: """ Loads all NuScenesMap instances for all available maps. :param helper: Instance of PredictHelper. + :param verbose: Whether to print to stdout. :return: Mapping from map-name to the NuScenesMap api instance. 
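A small sketch of the quieter `load_all_maps` behaviour; it assumes the nuScenes map expansion is installed under the same dataroot and that `PredictHelper` is importable from `nuscenes.prediction` (an import path assumed here, not taken from this diff).
```
from nuscenes import NuScenes
from nuscenes.prediction import PredictHelper  # assumed import path for the helper class
from nuscenes.prediction.input_representation.static_layers import load_all_maps

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=False)
helper = PredictHelper(nusc)

maps = load_all_maps(helper)                # verbose defaults to False, so nothing is printed
maps = load_all_maps(helper, verbose=True)  # opt back in to the per-map progress messages
print(sorted(maps.keys()))                  # one NuScenesMap instance per map name
```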
""" dataroot = helper.data.dataroot @@ -35,8 +36,8 @@ def load_all_maps(helper: PredictHelper) -> Dict[str, NuScenesMap]: for map_file in json_files: map_name = str(map_file.split(".")[0]) - - print(f'static_layers.py - Loading Map: {map_name}') + if verbose: + print(f'static_layers.py - Loading Map: {map_name}') maps[map_name] = NuScenesMap(dataroot, map_name=map_name) diff --git a/python-sdk/nuscenes/prediction/tests/test_predict_helper.py b/python-sdk/nuscenes/prediction/tests/test_predict_helper.py index 827fab68..d151be26 100644 --- a/python-sdk/nuscenes/prediction/tests/test_predict_helper.py +++ b/python-sdk/nuscenes/prediction/tests/test_predict_helper.py @@ -12,12 +12,16 @@ class MockNuScenes(NuScenes): - """ Mocks the NuScenes API needed to test PredictHelper. """ def __init__(self, sample_annotations: List[Dict[str, Any]], samples: List[Dict[str, Any]]): - + """ + Mocks the NuScenes API needed to test PredictHelper. + Note that we are skipping the call to the super class constructor on purpose to avoid loading the tables. + :param sample_annotations: The sample_annotations table used in this fake version of nuScenes. + :param samples: The sample table used in this fake version of nuScenes. + """ self._sample_annotation = {r['token']: r for r in sample_annotations} self._sample = {r['token']: r for r in samples} diff --git a/python-sdk/nuscenes/scripts/export_pointclouds_as_obj.py b/python-sdk/nuscenes/scripts/export_pointclouds_as_obj.py index 0c7156e7..ef85697e 100644 --- a/python-sdk/nuscenes/scripts/export_pointclouds_as_obj.py +++ b/python-sdk/nuscenes/scripts/export_pointclouds_as_obj.py @@ -2,8 +2,8 @@ # Code written by Holger Caesar, 2018. """ -Export fused point clouds of a scene to a Wavefront OBJ file. -This point-cloud can be viewed in your favorite 3D rendering tool, e.g. Meshlab or Maya. +Export fused pointclouds of a scene to a Wavefront OBJ file. +This pointcloud can be viewed in your favorite 3D rendering tool, e.g. Meshlab or Maya. """ import argparse @@ -30,9 +30,9 @@ def export_scene_pointcloud(nusc: NuScenes, verbose: bool = True) -> None: """ Export fused point clouds of a scene to a Wavefront OBJ file. - This point-cloud can be viewed in your favorite 3D rendering tool, e.g. Meshlab or Maya. + This pointcloud can be viewed in your favorite 3D rendering tool, e.g. Meshlab or Maya. :param nusc: NuScenes instance. - :param out_path: Output path to write the point-cloud to. + :param out_path: Output path to write the pointcloud to. :param scene_token: Unique identifier of scene to render. :param channel: Channel to render. :param min_dist: Minimum distance to ego vehicle below which points are dropped. @@ -58,7 +58,7 @@ def export_scene_pointcloud(nusc: NuScenes, cur_sd_rec = nusc.get('sample_data', cur_sd_rec['next']) sd_tokens.append(cur_sd_rec['token']) - # Write point-cloud. + # Write pointcloud. with open(out_path, 'w') as f: f.write("OBJ File:\n") @@ -114,7 +114,7 @@ def pointcloud_color_from_image(nusc: NuScenes, pointsensor_token: str, camera_token: str) -> Tuple[np.array, np.array]: """ - Given a point sensor (lidar/radar) token and camera sample_data token, load point-cloud and map it to the image + Given a point sensor (lidar/radar) token and camera sample_data token, load pointcloud and map it to the image plane, then retrieve the colors of the closest image pixels. :param nusc: NuScenes instance. :param pointsensor_token: Lidar/radar sample_data token. 
@@ -130,7 +130,7 @@ def pointcloud_color_from_image(nusc: NuScenes, im = Image.open(osp.join(nusc.dataroot, cam['filename'])) # Points live in the point sensor frame. So they need to be transformed via global to the image plane. - # First step: transform the point-cloud to the ego vehicle frame for the timestamp of the sweep. + # First step: transform the pointcloud to the ego vehicle frame for the timestamp of the sweep. cs_record = nusc.get('calibrated_sensor', pointsensor['calibrated_sensor_token']) pc.rotate(Quaternion(cs_record['rotation']).rotation_matrix) pc.translate(np.array(cs_record['translation'])) @@ -200,7 +200,7 @@ def pointcloud_color_from_image(nusc: NuScenes, if not out_dir == '' and not osp.isdir(out_dir): os.makedirs(out_dir) - # Extract point-cloud for the specified scene + # Extract pointcloud for the specified scene nusc = NuScenes() scene_tokens = [s['token'] for s in nusc.scene if s['name'] == scene_name] assert len(scene_tokens) == 1, 'Error: Invalid scene %s' % scene_name diff --git a/python-sdk/nuscenes/scripts/assert_download.py b/python-sdk/nuscenes/tests/assert_download.py similarity index 100% rename from python-sdk/nuscenes/scripts/assert_download.py rename to python-sdk/nuscenes/tests/assert_download.py diff --git a/python-sdk/nuscenes/tests/test_lidarseg.py b/python-sdk/nuscenes/tests/test_lidarseg.py new file mode 100644 index 00000000..8b636d7f --- /dev/null +++ b/python-sdk/nuscenes/tests/test_lidarseg.py @@ -0,0 +1,41 @@ +import unittest +import os + +from nuscenes import NuScenes + + +class TestNuScenesLidarseg(unittest.TestCase): + def setUp(self): + assert 'NUSCENES' in os.environ, 'Set NUSCENES env. variable to enable tests.' + self.nusc = NuScenes(version='v1.0-mini', dataroot=os.environ['NUSCENES'], verbose=False) + + def test_num_classes(self) -> None: + """ + Check that the correct number of classes (32 classes) are loaded. + """ + self.assertEqual(len(self.nusc.lidarseg_idx2name_mapping), 32) + + def test_num_colors(self) -> None: + """ + Check that the number of colors in the colormap matches the number of classes. + """ + num_classes = len(self.nusc.lidarseg_idx2name_mapping) + num_colors = len(self.nusc.colormap) + self.assertEqual(num_colors, num_classes) + + def test_classes(self) -> None: + """ + Check that the class names match the ones in the colormap, and are in the same order. + """ + classes_in_colormap = list(self.nusc.colormap.keys()) + for name, idx in self.nusc.lidarseg_name2idx_mapping.items(): + self.assertEqual(name, classes_in_colormap[idx]) + + +if __name__ == '__main__': + # Runs the tests without throwing errors. + test = TestNuScenesLidarseg() + test.setUp() + test.test_num_classes() + test.test_num_colors() + test.test_classes() diff --git a/python-sdk/nuscenes/utils/color_map.py b/python-sdk/nuscenes/utils/color_map.py new file mode 100644 index 00000000..ce4f2613 --- /dev/null +++ b/python-sdk/nuscenes/utils/color_map.py @@ -0,0 +1,45 @@ +from typing import Dict, Tuple + + +def get_colormap() -> Dict[str, Tuple[int, int, int]]: + """ + Get the defined colormap. + :return: A mapping from the class names to the respective RGB values. + """ + + classname_to_color = { # RGB. + "noise": (0, 0, 0), # Black. 
+ "animal": (70, 130, 180), # Steelblue + "human.pedestrian.adult": (0, 0, 230), # Blue + "human.pedestrian.child": (135, 206, 235), # Skyblue, + "human.pedestrian.construction_worker": (100, 149, 237), # Cornflowerblue + "human.pedestrian.personal_mobility": (219, 112, 147), # Palevioletred + "human.pedestrian.police_officer": (0, 0, 128), # Navy, + "human.pedestrian.stroller": (240, 128, 128), # Lightcoral + "human.pedestrian.wheelchair": (138, 43, 226), # Blueviolet + "movable_object.barrier": (112, 128, 144), # Slategrey + "movable_object.debris": (210, 105, 30), # Chocolate + "movable_object.pushable_pullable": (105, 105, 105), # Dimgrey + "movable_object.trafficcone": (47, 79, 79), # Darkslategrey + "static_object.bicycle_rack": (188, 143, 143), # Rosybrown + "vehicle.bicycle": (220, 20, 60), # Crimson + "vehicle.bus.bendy": (255, 127, 80), # Coral + "vehicle.bus.rigid": (255, 69, 0), # Orangered + "vehicle.car": (255, 158, 0), # Orange + "vehicle.construction": (233, 150, 70), # Darksalmon + "vehicle.emergency.ambulance": (255, 83, 0), + "vehicle.emergency.police": (255, 215, 0), # Gold + "vehicle.motorcycle": (255, 61, 99), # Red + "vehicle.trailer": (255, 140, 0), # Darkorange + "vehicle.truck": (255, 99, 71), # Tomato + "flat.driveable_surface": (0, 207, 191), # nuTonomy green + "flat.other": (175, 0, 75), + "flat.sidewalk": (75, 0, 75), + "flat.terrain": (112, 180, 60), + "static.manmade": (222, 184, 135), # Burlywood + "static.other": (255, 228, 196), # Bisque + "static.vegetation": (0, 175, 0), # Green + "vehicle.ego": (255, 240, 245) + } + + return classname_to_color diff --git a/python-sdk/nuscenes/utils/map_mask.py b/python-sdk/nuscenes/utils/map_mask.py index 655d1ad4..0042e73d 100644 --- a/python-sdk/nuscenes/utils/map_mask.py +++ b/python-sdk/nuscenes/utils/map_mask.py @@ -16,7 +16,7 @@ class MapMask: def __init__(self, img_file: str, resolution: float = 0.1): """ - Init a map mask object that contains the semantic prior (drivable surface and sidewalks) mask. + Init a map mask object that contains the semantic prior (driveable surface and sidewalks) mask. :param img_file: File path to map png file. :param resolution: Map resolution in meters. """ diff --git a/python-sdk/tutorials/nuimages_tutorial.ipynb b/python-sdk/tutorials/nuimages_tutorial.ipynb new file mode 100644 index 00000000..96f39647 --- /dev/null +++ b/python-sdk/tutorials/nuimages_tutorial.ipynb @@ -0,0 +1,496 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# nuImages devkit tutorial\n", + "\n", + "Welcome to the nuImages tutorial.\n", + "This demo assumes the database itself is available at `/data/sets/nuimages`, and loads a mini version of the dataset." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## A Gentle Introduction to nuImages\n", + "\n", + "In this part of the tutorial, let us go through a top-down introduction of our database. Our dataset is structured as a relational database with tables, tokens and foreign keys. The tables are the following:\n", + "\n", + "1. `log` - Log from which the sample was extracted.\n", + "2. `sample` - An annotated camera image with an associated timestamp and past and future images and pointclouds.\n", + "3. `sample_data` - An image or pointcloud associated with a sample.\n", + "4. `ego_pose` - The vehicle ego pose and timestamp associated with a sample_data.\n", + "5. `sensor` - General information about a sensor, e.g. `CAM_BACK_LEFT`.\n", + "6. 
`calibrated_sensor` - Calibration information of a sensor in a log.\n", + "7. `category` - Taxonomy of object and surface categories (e.g. `vehicle.car`, `flat.driveable_surface`). \n", + "8. `attribute` - Property of an object that can change while the category remains the same.\n", + "9. `object_ann` - Bounding box and mask annotation of an object (e.g. car, adult).\n", + "10. `surface_ann` - Mask annotation of a surface (e.g. `flat.driveable surface` and `vehicle.ego`).\n", + "\n", + "The database schema is visualized below. For more information see the [schema page](https://github.com/nutonomy/nuscenes-devkit/blob/master/schema-nuimages.md).\n", + "![](https://www.nuscenes.org/public/images/nuimages-schema.svg)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Initialization\n", + "To initialize the dataset class, we run the code below. We can change the dataroot parameter if the dataset is installed in a different folder. We can also omit it to use the default setup. These will be useful further below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%matplotlib inline\n", + "%load_ext autoreload\n", + "%autoreload 2\n", + "from nuimages import NuImages\n", + "\n", + "nuim = NuImages(dataroot='/data/sets/nuimages', version='v1.0-mini', verbose=True, lazy=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Tables\n", + "\n", + "As described above, the NuImages class holds several tables. Each table is a list of records, and each record is a dictionary. For example the first record of the category table is stored at:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nuim.category[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To see the list of all tables, simply refer to the `table_names` variable:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nuim.table_names" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Indexing\n", + "\n", + "Since all tables are lists of dictionaries, we can use standard Python operations on them. A very common operation is to retrieve a particular record by its token. Since this operation takes linear time, we precompute an index that helps to access a record in constant time.\n", + "\n", + "Let us select the first image in this dataset version and split:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sample_idx = 0\n", + "sample = nuim.sample[sample_idx]\n", + "sample" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also get the sample record from a sample token:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sample = nuim.get('sample', sample['token'])\n", + "sample" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "What this does is actually to lookup the index. We see that this is the same index as we used in the first place." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sample_idx_check = nuim.getind('sample', sample['token'])\n", + "assert sample_idx == sample_idx_check" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "From the sample, we can directly access the corresponding keyframe sample data. This will be useful further below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "key_camera_token = sample['key_camera_token']\n", + "print(key_camera_token)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Lazy loading\n", + "\n", + "Initializing the NuImages instance above was very fast, as we did not actually load the tables. Rather, the class implements lazy loading that overwrites the internal `__getattr__()` function to load a table if it is not already stored in memory. The moment we accessed `category`, we could see the table being loaded from disk. To disable such notifications, just set `verbose=False` when initializing the NuImages object. Furthermore lazy loading can be disabled with `lazy=False`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Rendering\n", + "\n", + "To render an image we use the `render_image()` function. We can see the boxes and masks for each object category, as well as the surface masks for ego vehicle and driveable surface. We use the following colors:\n", + "- vehicles: orange\n", + "- bicycles and motorcycles: red\n", + "- pedestrians: blue\n", + "- cones and barriers: gray\n", + "- driveable surface: teal / green\n", + "\n", + "At the top left corner of each box, we see the name of the object category (if `with_category=True`). We can also set `with_attributes=True` to print the attributes of each object (note that we can only set `with_attributes=True` to print the attributes of each object when `with_category=True`). In addition, we can specify if we want to see surfaces and objects, or only surfaces, or only objects, or neither by setting `with_annotations` to `all`, `surfaces`, `objects` and `none` respectively.\n", + "\n", + "Let us make the image bigger for better visibility by setting `render_scale=2`. We can also change the line width of the boxes using `box_line_width`. By setting it to -1, the line width adapts to the `render_scale`. Finally, we can render the image to disk using `out_path`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nuim.render_image(key_camera_token, annotation_type='all',\n", + " with_category=True, with_attributes=True, box_line_width=-1, render_scale=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let us find out which annotations are in that image." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "object_tokens, surface_tokens = nuim.list_anns(sample['token'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see the object_ann and surface_ann tokens. Let's again render the image, but only focus on the first object and the first surface annotation. We can use the `object_tokens` and `surface_tokens` arguments as shown below. We see that only one car and the driveable surface are rendered." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "nuim.render_image(key_camera_token, with_category=True, object_tokens=[object_tokens[0]], surface_tokens=[surface_tokens[0]])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To get the raw data (i.e. the segmentation masks, both semantic and instance) of the above, we can use `get_segmentation()`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "semantic_mask, instance_mask = nuim.get_segmentation(key_camera_token)\n", + "\n", + "plt.figure(figsize=(32, 9))\n", + "\n", + "plt.subplot(1, 2, 1)\n", + "plt.imshow(semantic_mask)\n", + "plt.subplot(1, 2, 2)\n", + "plt.imshow(instance_mask)\n", + "\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Every annotated image (keyframe) comes with up to 6 past and 6 future images, spaced evenly at 500ms +- 250ms. However, a small percentage of the samples has less sample_datas, either because they were at the beginning or end of a log, or due to delays or dropped data packages.\n", + "`list_sample_content()` shows for each sample all the associated sample_datas." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nuim.list_sample_content(sample['token'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Besides the annotated images, we can also render the 6 previous and 6 future images, which are not annotated. Let's select the next image, which is taken around 0.5s after the annotated image. We can either manually copy the token from the list above or use the `next` pointer of the `sample_data`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "next_camera_token = nuim.get('sample_data', key_camera_token)['next']\n", + "next_camera_token" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we have the next token, let's render it. Note that we cannot render the annotations, as they don't exist.\n", + "\n", + "*Note: If you did not download the non-keyframes (sweeps), this will throw an error! We make sure to catch it here.*" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "try:\n", + " nuim.render_image(next_camera_token, annotation_type='none')\n", + "except Exception as e:\n", + " print('As expected, we encountered this error:', e)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this section we have presented a number of rendering functions. For convenience we also provide a script `render_images.py` that runs one or all of these rendering functions on a random subset of the 93k samples in nuImages. To run it, simply execute the following line in your command line. This will save image, depth, pointcloud and trajectory renderings of the front camera to the specified folder." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`>> python nuimages/scripts/render_images.py --mode all --cam_name CAM_FRONT --out_dir ~/Downloads/nuImages --out_type image`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Instead of rendering the annotated keyframe, we can also render a video of the 13 individual images, spaced at 2 Hz." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`>> python nuimages/scripts/render_images.py --mode all --cam_name CAM_FRONT --out_dir ~/Downloads/nuImages --out_type video`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Poses and CAN bus data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `ego_pose` provides the translation, rotation, rotation_rate, acceleration and speed measurements closest to each sample_data. We can visualize the trajectories of the ego vehicle throughout the 6s clip of each annotated keyframe. Here the red **x** indicates the start of the trajectory and the green **o** the position at the annotated keyframe.\n", + "We can set `rotation_yaw` to have the driving direction at the time of the annotated keyframe point \"upwards\" in the plot. We can also set `rotation_yaw` to None to use the default orientation (upwards pointing North). To get the raw data of this plot, use `get_ego_pose_data()` or `get_trajectory()`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nuim.render_trajectory(sample['token'], rotation_yaw=0, center_key_pose=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Statistics\n", + "\n", + "The `list_*()` methods are useful to get an overview of the dataset dimensions. Note that these statistics are always *for the current split* that we initialized the `NuImages` instance with, rather than the entire dataset." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "nuim.list_logs()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`list_categories()` lists the category frequencies, as well as the category name and description. Each category is either an object or a surface, but not both." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "nuim.list_categories(sort_by='object_freq')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also specify a `sample_tokens` parameter for `list_categories()` to get the category statistics for a particular set of samples." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sample_tokens = [nuim.sample[9]['token']]\n", + "nuim.list_categories(sample_tokens=sample_tokens)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`list_attributes()` shows the frequency, name and description of all attributes:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nuim.list_attributes(sort_by='freq')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`list_cameras()` shows us how many camera entries and samples there are for each channel, such as the front camera.\n", + "Each camera uses slightly different intrinsic parameters, which will be provided in a future release." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nuim.list_cameras()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`list_sample_data_histogram()` shows a histogram of the number of images per annotated keyframe. Note that there are at most 13 images per keyframe. For the mini split shown here, all keyframes have 13 images." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nuim.list_sample_data_histogram()" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.7" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/python-sdk/tutorials/nuscenes_lidarseg_tutorial.ipynb b/python-sdk/tutorials/nuscenes_lidarseg_tutorial.ipynb new file mode 100644 index 00000000..84233be7 --- /dev/null +++ b/python-sdk/tutorials/nuscenes_lidarseg_tutorial.ipynb @@ -0,0 +1,474 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# nuScenes-lidarseg tutorial\n", + "\n", + "Welcome to the nuScenes-lidarseg tutorial.\n", + "\n", + "This demo assumes that nuScenes is installed at `/data/sets/nuscenes`. The mini version (i.e. v1.0-mini) of the full dataset will be used for this demo." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup\n", + "To install the nuScenes-lidarseg expansion, download the dataset from https://www.nuscenes.org/download. Unpack the compressed file(s) into `/data/sets/nuscenes` and your folder structure should end up looking like this:\n", + "```\n", + "└── nuscenes \n", + " ├── Usual nuscenes folders (i.e. samples, sweep)\n", + " │\n", + " ├── lidarseg\n", + " │ └── v1.0-{mini, test, trainval} <- Contains the .bin files; a .bin file \n", + " │ contains the labels of the points in a \n", + " │ point cloud (note that v1.0-test does not \n", + " │ have any .bin files associated with it) \n", + " └── v1.0-{mini, test, trainval}\n", + " ├── Usual files (e.g. 
attribute.json, calibrated_sensor.json etc.)\n", + " ├── lidarseg.json <- contains the mapping of each .bin file to the token \n", + " └── category.json <- contains the categories of the labels (note that the \n", + " category.json from nuScenes v1.0 is overwritten)\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Initialization\n", + "Let's start by importing the necessary libraries:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%matplotlib inline\n", + "\n", + "from nuscenes import NuScenes\n", + "\n", + "nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you can see, you do not need any extra libraries to use nuScenes-lidarseg. The original nuScenes devkit which you are familiar with has been extended so that you can use it seamlessly with nuScenes-lidarseg." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Statistics of lidarseg dataset for the v1.0-mini split\n", + "Let's get a quick feel of the lidarseg dataset by looking at what classes are in it and the number of points belonging to each class. The classes will be sorted in ascending order based on the number of points (since `sort_by='count'` below); you can also sort the classes by class name or class index by setting `sort_by='name'` or `sort_by='index'` respectively." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nusc.list_lidarseg_categories(sort_by='count')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With `list_lidarseg_categories`, you can get the index which each class name belongs to by looking at the leftmost column. You can also get a mapping of the indices to the class names from the `lidarseg_idx2name_mapping` attribute of the NuScenes class." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nusc.lidarseg_idx2name_mapping" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Conversely, you can get the mapping of the class names to the indices from the `lidarseg_name2idx_mapping` attribute of the NuScenes class." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nusc.lidarseg_name2idx_mapping" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Pick a sample token\n", + "Let's pick a sample to use for this tutorial." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "my_sample = nusc.sample[87]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Get statistics of a lidarseg sample token\n", + "Now let's take a look at what classes are present in the pointcloud of this particular sample." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nusc.get_sample_lidarseg_stats(my_sample['token'], sort_by='count')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "By doing `sort_by='count'`, the classes and their respective frequency counts are printed in ascending order; you can also do `sort_by='name'` and `sort_by='index'` here as well." 
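The `filter_lidarseg_labels` arguments used in the rendering calls below expect class indices rather than names. A small sketch, using the `lidarseg_name2idx_mapping` attribute introduced above, shows how to look the indices up (the dataroot and split mirror the initialization cell of this tutorial).
```
from nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=False)

# Translate human-readable class names into the indices expected by filter_lidarseg_labels.
wanted_classes = ['vehicle.truck', 'vehicle.trailer']
wanted_indices = [nusc.lidarseg_name2idx_mapping[name] for name in wanted_classes]
print(wanted_indices)  # indices for truck and trailer, usable as filter_lidarseg_labels=wanted_indices
```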
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Render the lidarseg labels in the bird's eye view of a pointcloud\n", + "In the original nuScenes devkit, you would pass a sample data token into ```render_sample_data``` to render a bird's eye view of the pointcloud. However, the points would be colored according to the distance from the ego vehicle. Now with the extended nuScenes devkit, all you need to do is set ```show_lidarseg=True``` to visualize the class labels of the pointcloud." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sample_data_token = my_sample['data']['LIDAR_TOP']\n", + "nusc.render_sample_data(sample_data_token,\n", + " with_anns=False,\n", + " show_lidarseg=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "But what if you wanted to focus on only certain classes? Given the statistics of the pointcloud printed out previously, let's say you are only interested in trucks and trailers. You could see the class indices belonging to those classes from the statistics and then pass an array of those indices into ```filter_lidarseg_labels``` like so:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nusc.render_sample_data(sample_data_token,\n", + " with_anns=False,\n", + " show_lidarseg=True,\n", + " filter_lidarseg_labels=[22, 23])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now only points in the pointcloud belonging to trucks and trailers are filtered out for your viewing pleasure. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In addition, you can display a legend which indicates the color for each class by using `show_lidarseg_legend`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nusc.render_sample_data(sample_data_token,\n", + " with_anns=False,\n", + " show_lidarseg=True,\n", + " show_lidarseg_legend=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Render lidarseg labels in image\n", + "If you wanted to superimpose the pointcloud into the corresponding image from a camera, you can use ```render_pointcloud_in_image``` like what you would do with the original nuScenes devkit, but set ```show_lidarseg=True``` (remember to set ```render_intensity=False```). Similar to ```render_sample_data```, you can filter to see only certain classes using ```filter_lidarseg_labels```. And you can use ```show_lidarseg_legend``` to display a legend in the rendering." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nusc.render_pointcloud_in_image(my_sample['token'],\n", + " pointsensor_channel='LIDAR_TOP',\n", + " camera_channel='CAM_BACK',\n", + " render_intensity=False,\n", + " show_lidarseg=True,\n", + " filter_lidarseg_labels=[22, 23],\n", + " show_lidarseg_legend=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Render sample (i.e. lidar, radar and all cameras)\n", + "Of course, like in the original nuScenes devkit, you can render all the sensors at once with ```render_sample```. In this extended nuScenes devkit, you can set ```show_lidarseg=True``` to see the lidarseg labels. Similar to the above methods, you can use ```filter_lidarseg_labels``` to display only the classes you wish to see." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nusc.render_sample(my_sample['token'],\n", + " show_lidarseg=True,\n", + " filter_lidarseg_labels=[22, 23])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Render a scene for a given camera sensor with lidarseg labels\n", + "You can also render an entire scene with the lidarseg labels for a camera of your choosing (the ```filter_lidarseg_labels``` argument can be used here as well)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's pick a scene first:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "my_scene = nusc.scene[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We then pass the scene token into ```render_scene_channel_lidarseg``` indicating that we are only interested in construction vehicles and man-made objects (here, we set `verbose=True` to produce a window which will allows us to see the frames as they are being random). \n", + "\n", + "In addition, you can use `dpi` (to adjust the size of the lidar points) and `imsize` (to adjust the size of the rendered image) to tune the aesthetics of the renderings to your liking.\n", + "\n", + "(Note: the following code is commented out as it crashes in Jupyter notebooks.)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# import os\n", + "# nusc.render_scene_channel_lidarseg(my_scene['token'], \n", + "# 'CAM_BACK', \n", + "# filter_lidarseg_labels=[18, 28],\n", + "# verbose=True, \n", + "# dpi=100,\n", + "# imsize=(1280, 720))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To save the renderings, you can pass a path to a folder you want to save the images to via the ```out_folder``` argument, and either `video` or `image` to `render_mode`.\n", + "\n", + "(Note: the following code is commented out as it crashes in Jupyter notebooks.)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# nusc.render_scene_channel_lidarseg(my_scene['token'],\n", + "# 'CAM_BACK',\n", + "# filter_lidarseg_labels=[18, 28],\n", + "# verbose=True,\n", + "# dpi=100,\n", + "# imsize=(1280, 720),\n", + "# render_mode='video',\n", + "# out_folder=os.path.expanduser('~/Desktop/my_folder'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When `render_mode='image'`, only frames which contain points (after the filter has been applied) will be saved as images." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Render a scene for all cameras with lidarseg labels\n", + "You can also render the entire scene for all cameras at once with the lidarseg labels as a video. 
Let's say in this case, we are interested in points belonging to driveable surfaces and cars.\n", + "\n", + "(Note: the following code is commented out as it crashes in Jupyter notebooks.)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "# nusc.render_scene_lidarseg(my_scene['token'], \n", + "# filter_lidarseg_labels=[17, 24],\n", + "# verbose=True,\n", + "# dpi=100,\n", + "# out_path=os.path.expanduser('~/Desktop/my_rendered_scene.avi'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Visualizing LIDAR segmentation predictions\n", + "In all the above functions, the labels of the LIDAR pointcloud which have been rendered are the ground truth. If you have trained a model to segment LIDAR pointclouds and have run it on the nuScenes-lidarseg dataset, you can visualize your model's predictions with nuScenes-lidarseg as well!\n", + "\n", + "Each of your .bin files should be a `numpy.uint8` array; as a tip, you can save your predictions as follows:\n", + "```\n", + "np.array(predictions).astype(np.uint8).tofile(bin_file_out)\n", + "```\n", + "- `predictions`: The predictions from your model (e.g. `[30, 5, 18, ..., 30]`)\n", + "- `bin_file_out`: The path to write your .bin file to (e.g. `/some/folder/_lidarseg.bin`)\n", + "\n", + "Then you simply need to pass the path to the .bin file where your predictions for the given sample are to `lidarseg_preds_bin_path` for these functions:\n", + "- `list_lidarseg_categories`\n", + "- `render_sample_data`\n", + "- `render_pointcloud_in_image`\n", + "- `render_sample` \n", + "\n", + "For example, let's assume the predictions for `my_sample` is stored at `/data/sets/nuscenes/lidarseg/v1.0-mini` with the format `_lidarseg.bin`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "\n", + "my_sample = nusc.sample[87]\n", + "sample_data_token = my_sample['data']['LIDAR_TOP']\n", + "my_predictions_bin_file = os.path.join('/data/sets/nuscenes/lidarseg/v1.0-mini', sample_data_token + '_lidarseg.bin')\n", + "\n", + "nusc.render_pointcloud_in_image(my_sample['token'],\n", + " pointsensor_channel='LIDAR_TOP',\n", + " camera_channel='CAM_BACK',\n", + " render_intensity=False,\n", + " show_lidarseg=True,\n", + " filter_lidarseg_labels=[22, 23],\n", + " show_lidarseg_legend=True,\n", + " lidarseg_preds_bin_path=my_predictions_bin_file)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For these functions that render an entire scene, you will need to pass the path to the folder which contains the .bin files for each sample in a scene to `lidarseg_preds_folder`:\n", + "- `render_scene_channel_lidarseg`\n", + "- `render_scene_lidarseg`\n", + "\n", + "Pay special attention that **each set of predictions in the folder _must_ be a `.bin` file and named as `_lidarseg.bin`**.\n", + "\n", + "(Note: the following code is commented out as it crashes in Jupyter notebooks.)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# my_scene = nusc.scene[0]\n", + "# my_folder_of_predictions = '/data/sets/nuscenes/lidarseg/v1.0-mini'\n", + "\n", + "# nusc.render_scene_channel_lidarseg(my_scene['token'], \n", + "# 'CAM_BACK', \n", + "# filter_lidarseg_labels=[17, 24],\n", + "# verbose=True, \n", + "# imsize=(1280, 720),\n", + "# lidarseg_preds_folder=my_folder_of_predictions)" + ] + }, + { + "cell_type": 
"markdown", + "metadata": {}, + "source": [ + "## Conclusion\n", + "And this brings us to the end of the tutorial for nuScenes-lidarseg, enjoy!" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.7" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/python-sdk/tutorials/nuscenes_basics_tutorial.ipynb b/python-sdk/tutorials/nuscenes_tutorial.ipynb similarity index 94% rename from python-sdk/tutorials/nuscenes_basics_tutorial.ipynb rename to python-sdk/tutorials/nuscenes_tutorial.ipynb index 70de38d3..ca6fbd96 100644 --- a/python-sdk/tutorials/nuscenes_basics_tutorial.ipynb +++ b/python-sdk/tutorials/nuscenes_tutorial.ipynb @@ -6,9 +6,40 @@ "source": [ "# nuScenes devkit tutorial\n", "\n", - "Welcome to the nuScenes tutorial.\n", + "Welcome to the nuScenes tutorial. This demo assumes the database itself is available at `/data/sets/nuscenes`, and loads a mini version of the full dataset." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## A Gentle Introduction to nuScenes\n", "\n", - "This demo assumes the database itself is available at `/data/sets/nuscenes`, and loads a mini version of the full dataset." + "In this part of the tutorial, let us go through a top-down introduction of our database. Our dataset comprises of elemental building blocks that are the following:\n", + "\n", + "1. `log` - Log information from which the data was extracted.\n", + "2. `scene` - 20 second snippet of a car's journey.\n", + "3. `sample` - An annotated snapshot of a scene at a particular timestamp.\n", + "4. `sample_data` - Data collected from a particular sensor.\n", + "5. `ego_pose` - Ego vehicle poses at a particular timestamp.\n", + "6. `sensor` - A specific sensor type.\n", + "7. `calibrated sensor` - Definition of a particular sensor as calibrated on a particular vehicle.\n", + "8. `instance` - Enumeration of all object instance we observed.\n", + "9. `category` - Taxonomy of object categories (e.g. vehicle, human). \n", + "10. `attribute` - Property of an instance that can change while the category remains the same.\n", + "11. `visibility` - Fraction of pixels visible in all the images collected from 6 different cameras.\n", + "12. `sample_annotation` - An annotated instance of an object within our interest.\n", + "13. `map` - Map data that is stored as binary semantic masks from a top-down view.\n", + "\n", + "The database schema is visualized below. For more information see the [nuScenes schema](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/schema_nuscenes.md) page.\n", + "![](https://www.nuscenes.org/public/images/nuscenes-schema.svg)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Initialization" ] }, { @@ -27,23 +58,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## A Gentle Introduction to nuScenes\n", - "\n", - "In this part of the tutorial, let us go through a top-down introduction of our database. This section is an elaboration of `schema.md`. Our dataset comprises of elemental building blocks that are the following:\n", - "\n", - "1. `scene` - 20 second snippet of a car's journey.\n", - "2. `sample` - An annotated snapshot of a scene at a particular timestamp.\n", - "3. 
`sample_data` - Data collected from a particular sensor.\n", - "4. `sample_annotation` - An annotated instance of an object within our interest.\n", - "5. `instance` - Enumeration of all object instance we observed.\n", - "6. `category` - Taxonomy of object categories (e.g. vehicle, human). \n", - "7. `attribute` - Property of an instance that can change while the category remains the same.\n", - "8. `visibility` - Fraction of pixels visible in all the images collected from 6 different cameras.. \n", - "9. `sensor` - A specific sensor type.\n", - "10. `calibrated sensor` - Definition of a particular sensor as calibrated on a particular vehicle.\n", - "11. `ego_pose` - Ego vehicle poses at a particular timestamp.\n", - "12. `log` - Log information from which the data was extracted.\n", - "13. `map` - Map data that is stored as binary semantic masks from a top-down view." + "## A look at the dataset" ] }, { @@ -366,7 +381,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Refer to `instructions.md` for the definitions of the different categories." + "Refer to `instructions_nuscenes.md` for the definitions of the different categories." ] }, { @@ -660,7 +675,9 @@ { "cell_type": "code", "execution_count": null, - "metadata": {}, + "metadata": { + "scrolled": true + }, "outputs": [], "source": [ "nusc.map[0]" @@ -817,7 +834,9 @@ { "cell_type": "code", "execution_count": null, - "metadata": {}, + "metadata": { + "scrolled": true + }, "outputs": [], "source": [ "ann_tokens_field2token = set(ann_tokens)\n", @@ -1280,7 +1299,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.5" + "version": "3.7.7" } }, "nbformat": 4, diff --git a/python-sdk/tutorials/prediction_tutorial.ipynb b/python-sdk/tutorials/prediction_tutorial.ipynb index 678c8213..9802cf26 100644 --- a/python-sdk/tutorials/prediction_tutorial.ipynb +++ b/python-sdk/tutorials/prediction_tutorial.ipynb @@ -5,7 +5,7 @@ "metadata": {}, "source": [ "# nuScenes prediction tutorial\n", - "" + "" ] }, { @@ -44,7 +44,7 @@ "source": [ "## 1. Data Splits for the Prediction Challenge\n", "\n", - "This section assumes basic familiarity with the nuScenes [schema](https://www.nuscenes.org/data-format?externalData=all&mapData=all&modalities=Any).\n", + "This section assumes basic familiarity with the nuScenes [schema](https://www.nuscenes.org/nuscenes#data-format).\n", "\n", "The goal of the nuScenes prediction challenge is to predict the future location of agents in the nuScenes dataset. Agents are indexed by an instance token and a sample token. To get a list of agents in the train and val split of the challenge, we provide a function called `get_prediction_challenge_split`.\n", "\n", diff --git a/setup/Dockerfile b/setup/Dockerfile new file mode 100644 index 00000000..ced1d81c --- /dev/null +++ b/setup/Dockerfile @@ -0,0 +1,30 @@ +FROM continuumio/miniconda3:4.6.14 +ENV PATH /opt/conda/bin:$PATH + +RUN apt-get update && \ + apt-get install -y --no-install-recommends \ + libsm6 \ + libxext6 \ + libxrender-dev \ + libgl1-mesa-glx \ + libglib2.0-0 \ + xvfb && \ + rm -rf /var/lib/apt/lists/* + +WORKDIR /nuscenes-dev +# create conda nuscenes env +ARG PYTHON_VERSION +RUN bash -c "conda create -y -n nuscenes python=${PYTHON_VERSION} \ + && source activate nuscenes \ + && conda clean --yes --all" + +COPY setup/requirements.txt . +COPY setup/requirements/ requirements/ +# Install Python dependencies inside of the Docker image via pip & Conda. 
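+# (The find/sed step below strips the pycocotools entry from the copied requirement files before pip installs them.)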
+# pycocotools installed from conda-forge +RUN bash -c "source activate nuscenes \ + && find . -name "\\*.txt" -exec sed -i -e '/pycocotools/d' {} \; \ + && pip install --no-cache -r /nuscenes-dev/requirements.txt \ + && conda config --append channels conda-forge \ + && conda install --yes pycocotools \ + && conda clean --yes --all" \ No newline at end of file diff --git a/setup/Dockerfile_3.6 b/setup/Dockerfile_3.6 deleted file mode 100644 index 8826018a..00000000 --- a/setup/Dockerfile_3.6 +++ /dev/null @@ -1,23 +0,0 @@ -FROM continuumio/miniconda3:4.6.14 -ENV PATH /opt/conda/bin:$PATH -ENV PYTHONPATH=/nuscenes-dev/python-sdk - -RUN apt-get update && \ - apt-get install -y --no-install-recommends \ - libsm6 \ - libxext6 \ - libxrender-dev \ - libgl1-mesa-glx \ - libglib2.0-0 \ - xvfb && \ - rm -rf /var/lib/apt/lists/* - - -WORKDIR /nuscenes-dev -COPY setup/requirements.txt /nuscenes-dev -# Install Python dependencies inside of the Docker image via Conda. -RUN bash -c "conda create -y -n nuscenes python=3.6; source activate nuscenes && \ - pip install -r /nuscenes-dev/requirements.txt \ - && conda clean --yes --all" - -COPY . /nuscenes-dev diff --git a/setup/Dockerfile_3.7 b/setup/Dockerfile_3.7 deleted file mode 100644 index 0ad37f09..00000000 --- a/setup/Dockerfile_3.7 +++ /dev/null @@ -1,23 +0,0 @@ -FROM continuumio/miniconda3:4.6.14 -ENV PATH /opt/conda/bin:$PATH -ENV PYTHONPATH=/nuscenes-dev/python-sdk - -RUN apt-get update && \ - apt-get install -y --no-install-recommends \ - libsm6 \ - libxext6 \ - libxrender-dev \ - libgl1-mesa-glx \ - libglib2.0-0 \ - xvfb && \ - rm -rf /var/lib/apt/lists/* - - -WORKDIR /nuscenes-dev -COPY setup/requirements.txt /nuscenes-dev -# Install Python dependencies inside of the Docker image via Conda. -RUN bash -c "conda create -y -n nuscenes python=3.7; source activate nuscenes && \ - pip install -r /nuscenes-dev/requirements.txt \ - && conda clean --yes --all" - -COPY . /nuscenes-dev diff --git a/setup/Jenkinsfile b/setup/Jenkinsfile index 44070029..58641c75 100644 --- a/setup/Jenkinsfile +++ b/setup/Jenkinsfile @@ -1,86 +1,119 @@ +@Library('jenkins-shared-libraries') _ + +// Aborts previous builds of the same PR- +if( env.BRANCH_NAME != null && env.BRANCH_NAME != "master" ) { + def buildNumber = env.BUILD_NUMBER as int + if (buildNumber > 1) milestone(buildNumber - 1) + milestone(buildNumber) +} + +def update_deps() { + sh '''#!/usr/bin/env bash + set -e + source activate nuscenes + find . 
-name "*.txt" -exec sed -i -e '/pycocotools/d' {} \\; + pip install --no-cache -r /nuscenes-dev/requirements.txt + conda install --yes pycocotools + ''' +} + +def kubeagent(name, image) { + return jnlp.docker(name: name, + docker_image: image, + cpu: 7, maxcpu: 8, + memory: "8G", maxmemory: "30G", + cloud: "boston", + yaml: """spec: + containers: + - name: docker + volumeMounts: + - mountPath: /data/ + name: nudeep-ci + subPath: data + volumes: + - name: nudeep-ci + persistentVolumeClaim: + claimName: nudeep-ci""") +} + pipeline { agent { - kubernetes { - label 'nuscenes-builder-' + UUID.randomUUID().toString() - cloud 'boston' - yamlFile 'setup/docker.yaml' - }// kubernetes + kubernetes (jnlp.docker(name: "nuscenes-builder", + cpu: 2, maxcpu: 2, + memory: "2G", maxmemory: "4G", + cloud: "boston")) } // agent environment { - PROD_IMAGE = "nuscenes:production" - TEST_IMAGE_3_6 = "registry-local.nutonomy.team:5000/nuscenes-test:kube${UUID.nameUUIDFromBytes(new String(env.BUILD_TAG).getBytes())}" - TEST_IMAGE_3_7 = "registry-local.nutonomy.team:5000/nuscenes-test:kube${UUID.nameUUIDFromBytes(new String(env.BUILD_TAG).getBytes())}" + PROD_IMAGE = "233885420847.dkr.ecr.us-east-1.amazonaws.com/nuscenes-test:production" + TEST_IMAGE = "233885420847.dkr.ecr.us-east-1.amazonaws.com/nuscenes-test:1.0" + TEST_IMAGE_3_6 = "${env.TEST_IMAGE}-3.6" + TEST_IMAGE_3_7 = "${env.TEST_IMAGE}-3.7" NUSCENES = "/data/sets/nuscenes" + NUIMAGES = "/data/sets/nuimages" + PYTHONPATH = "${env.WORKSPACE}/python-sdk" + PYTHONUNBUFFERED = "1" + } + + parameters { + booleanParam(name: 'REBUILD_TEST_IMAGE', defaultValue: false, description: 'rebuild docker test image') } stages { - stage('Build'){ - steps { - container('docker') { - // Build the Docker image, and then run python -m unittest inside - // an activated Conda environment inside of the container. - sh """#!/bin/bash - set -eux - docker build -t $TEST_IMAGE_3_6 -f setup/Dockerfile_3.6 . - docker push $TEST_IMAGE_3_6 - - docker build -t $TEST_IMAGE_3_7 -f setup/Dockerfile_3.7 . - docker push $TEST_IMAGE_3_7 - """ - } // container - } // steps - } // stage + stage('Build test docker image') { + when { + expression { return params.REBUILD_TEST_IMAGE } + } + failFast true + parallel { + stage('Build 3.6') { + steps { + withAWS(credentials: 'ecr-233') { + container('docker') { + // Build the Docker image, and then run python -m unittest inside + // an activated Conda environment inside of the container. + sh """#!/bin/bash + set -eux + docker build --build-arg PYTHON_VERSION=3.6 -t $TEST_IMAGE_3_6 -f setup/Dockerfile . + `aws ecr get-login --no-include-email --region us-east-1` + docker push $TEST_IMAGE_3_6 + """ + } // container + } + } // steps + } // stage + stage('Build 3.7') { + steps { + withAWS(credentials: 'ecr-233') { + container('docker') { + // Build the Docker image, and then run python -m unittest inside + // an activated Conda environment inside of the container. + sh """#!/bin/bash + set -eux + docker build --build-arg PYTHON_VERSION=3.7 -t $TEST_IMAGE_3_7 -f setup/Dockerfile . 
+ `aws ecr get-login --no-include-email --region us-east-1` + docker push $TEST_IMAGE_3_7 + """ + } // container + } + } // steps + } // stage + } + } stage('Tests') { failFast true parallel { - stage('Test 3.6'){ + stage('Test 3.6') { agent { - kubernetes { - label 'nuscenes-test3.6-' + UUID.randomUUID().toString() - cloud 'boston' - yaml """ - apiVersion: v1 - kind: Pod - metadata: - labels: - app: nuscenes - spec: - containers: - - name: jnlp - image: registry.nutonomy.com:5000/nu/jnlp-slave:3.19-1-lfs - imagePullPolicy: Always - - name: docker - image: $TEST_IMAGE_3_6 - command: - - cat - tty: true - volumeMounts: - - mountPath: /var/run/docker.sock - name: docker - - mountPath: /data/ - name: nudeep-ci - subPath: data - imagePullSecrets: - - name: regcredjenkins - volumes: - - name: docker - hostPath: - path: /var/run/docker.sock - - name: nudeep-ci - persistentVolumeClaim: - claimName: nudeep-ci - env: - - name: NUSCENES - value: $NUSCENES - """ - }// kubernetes + kubernetes(kubeagent("nuscenes-test3.6", + env.TEST_IMAGE_3_6)) } // agent steps { container('docker') { + update_deps() sh """#!/bin/bash set -e source activate nuscenes && python -m unittest discover python-sdk @@ -90,51 +123,15 @@ pipeline { } // steps } // stage - stage('Test 3.7'){ + stage('Test 3.7') { agent { - kubernetes { - label 'nuscenes-test3.7-' + UUID.randomUUID().toString() - cloud 'boston' - yaml """ - apiVersion: v1 - kind: Pod - metadata: - labels: - app: nuscenes - spec: - containers: - - name: jnlp - image: registry.nutonomy.com:5000/nu/jnlp-slave:3.19-1-lfs - imagePullPolicy: Always - - name: docker - image: $TEST_IMAGE_3_7 - command: - - cat - tty: true - volumeMounts: - - mountPath: /var/run/docker.sock - name: docker - - mountPath: /data/ - name: nudeep-ci - subPath: data - imagePullSecrets: - - name: regcredjenkins - volumes: - - name: docker - hostPath: - path: /var/run/docker.sock - - name: nudeep-ci - persistentVolumeClaim: - claimName: nudeep-ci - env: - - name: NUSCENES - value: $NUSCENES - """ - }// kubernetes + kubernetes(kubeagent("nuscenes-test3.7", + env.TEST_IMAGE_3_7)) } // agent steps { container('docker') { + update_deps() sh """#!/bin/bash set -e source activate nuscenes && python -m unittest discover python-sdk diff --git a/setup/docker.yaml b/setup/docker.yaml deleted file mode 100644 index 3573a9e4..00000000 --- a/setup/docker.yaml +++ /dev/null @@ -1,24 +0,0 @@ -apiVersion: v1 -kind: Pod -metadata: - labels: - app: nuscenes-docker -spec: - containers: - - name: jnlp - image: registry.nutonomy.com:5000/nu/jnlp-slave:3.19-1-lfs - imagePullPolicy: Always - - name: docker - image: registry.nutonomy.com:5000/nu/docker-bash:latest - command: - - cat - tty: true - volumeMounts: - - mountPath: /var/run/docker.sock - name: docker - imagePullSecrets: - - name: regcredjenkins - volumes: - - name: docker - hostPath: - path: /var/run/docker.sock diff --git a/setup/requirements.txt b/setup/requirements.txt index c13946db..9e554c74 100644 --- a/setup/requirements.txt +++ b/setup/requirements.txt @@ -1,17 +1,4 @@ -cachetools -descartes -fire -jupyter -matplotlib -motmetrics<=1.1.3 -numpy -opencv-python -pandas>=0.24 -Pillow<=6.2.1 # Latest Pillow is incompatible with current torchvision, https://github.com/pytorch/vision/issues/1712 -pyquaternion>=0.9.5 -scikit-learn -scipy -Shapely -torch>=1.3.1 -torchvision>=0.4.2 -tqdm +-r requirements/requirements_base.txt +-r requirements/requirements_prediction.txt +-r requirements/requirements_tracking.txt +-r requirements/requirements_nuimages.txt diff 
--git a/setup/requirements/requirements_base.txt b/setup/requirements/requirements_base.txt new file mode 100644 index 00000000..4067e6e9 --- /dev/null +++ b/setup/requirements/requirements_base.txt @@ -0,0 +1,13 @@ +cachetools +descartes +fire +jupyter +matplotlib +numpy +opencv-python +Pillow +pyquaternion>=0.9.5 +scikit-learn +scipy +Shapely +tqdm diff --git a/setup/requirements/requirements_nuimages.txt b/setup/requirements/requirements_nuimages.txt new file mode 100644 index 00000000..60585d74 --- /dev/null +++ b/setup/requirements/requirements_nuimages.txt @@ -0,0 +1 @@ +pycocotools>=2.0.1 diff --git a/setup/requirements/requirements_prediction.txt b/setup/requirements/requirements_prediction.txt new file mode 100644 index 00000000..a6b6243d --- /dev/null +++ b/setup/requirements/requirements_prediction.txt @@ -0,0 +1,2 @@ +torch>=1.3.1 +torchvision>=0.4.2 diff --git a/setup/requirements/requirements_tracking.txt b/setup/requirements/requirements_tracking.txt new file mode 100644 index 00000000..abcc4d75 --- /dev/null +++ b/setup/requirements/requirements_tracking.txt @@ -0,0 +1,2 @@ +motmetrics<=1.1.3 +pandas>=0.24 diff --git a/setup/setup.py b/setup/setup.py index 9ba8258e..184ee095 100644 --- a/setup/setup.py +++ b/setup/setup.py @@ -5,8 +5,14 @@ with open('../README.md', 'r') as fh: long_description = fh.read() +# Since nuScenes 2.0 the requirements are stored in separate files. with open('requirements.txt') as f: - requirements = f.read().splitlines() + req_paths = f.read().splitlines() +requirements = [] +for req_path in req_paths: + req_path = req_path.replace('-r ', '') + with open(req_path) as f: + requirements += f.read().splitlines() def get_dirlist(_rootdir): @@ -31,9 +37,9 @@ def get_dirlist(_rootdir): setuptools.setup( name='nuscenes-devkit', - version='1.0.9', + version='1.1.0', author='Holger Caesar, Oscar Beijbom, Qiang Xu, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, ' - 'Sergi Widjaja, Kiwoo Shin, Caglayan Dicle et al.', + 'Sergi Widjaja, Kiwoo Shin, Caglayan Dicle, Freddy Boulton, Whye Kit Fong, Asha Asvathaman et al.', author_email='nuscenes@nutonomy.com', description='The official devkit of the nuScenes dataset (www.nuscenes.org).', long_description=long_description, diff --git a/setup/test_tutorial.sh b/setup/test_tutorial.sh index f79ad5df..da236fce 100755 --- a/setup/test_tutorial.sh +++ b/setup/test_tutorial.sh @@ -5,13 +5,15 @@ set -ex source activate nuscenes # Generate python script from Jupyter notebook and then copy into Docker image. 
-jupyter nbconvert --to python python-sdk/tutorials/nuscenes_basics_tutorial.ipynb || { echo "Failed to convert nuscenes_basics_tutorial notebook to python script"; exit 1; } +jupyter nbconvert --to python python-sdk/tutorials/nuscenes_tutorial.ipynb || { echo "Failed to convert nuscenes_tutorial notebook to python script"; exit 1; } +jupyter nbconvert --to python python-sdk/tutorials/nuimages_tutorial.ipynb || { echo "Failed to convert nuimages_tutorial notebook to python script"; exit 1; } jupyter nbconvert --to python python-sdk/tutorials/can_bus_tutorial.ipynb || { echo "Failed to convert can_bus_tutorial notebook to python script"; exit 1; } jupyter nbconvert --to python python-sdk/tutorials/map_expansion_tutorial.ipynb || { echo "Failed to convert map_expansion_tutorial notebook to python script"; exit 1; } jupyter nbconvert --to python python-sdk/tutorials/prediction_tutorial.ipynb || { echo "Failed to convert prediction notebook to python script"; exit 1; } # Remove extraneous matplot inline command and comment out any render* methods. -sed -i.bak "/get_ipython.*/d; s/\(nusc\.render.*\)/#\1/" python-sdk/tutorials/nuscenes_basics_tutorial.py || { echo "error in sed command"; exit 1; } +sed -i.bak "/get_ipython.*/d; s/\(nusc\.render.*\)/#\1/" python-sdk/tutorials/nuscenes_tutorial.py || { echo "error in sed command"; exit 1; } +sed -i.bak "/get_ipython.*/d; s/\(nusc\.render.*\)/#\1/" python-sdk/tutorials/nuimages_tutorial.py || { echo "error in sed command"; exit 1; } sed -i.bak "/get_ipython.*/d; s/\(nusc_can.plot.*\)/#\1/" python-sdk/tutorials/can_bus_tutorial.py || { echo "error in sed command"; exit 1; } sed -i.bak "/get_ipython.*/d; s/\(^plt.*\)/#\1/" python-sdk/tutorials/can_bus_tutorial.py || { echo "error in sed command"; exit 1; } sed -i.bak "/get_ipython.*/d; s/\(fig, ax.*\)/#\1/" python-sdk/tutorials/map_expansion_tutorial.py || { echo "error in sed command"; exit 1; } @@ -20,7 +22,8 @@ sed -i.bak "/get_ipython.*/d; s/\(ego_poses = .*\)/#\1/" python-sdk/tutorials/m sed -i.bak "/get_ipython.*/d; s/\(plt.imshow.*\)/#\1/" python-sdk/tutorials/prediction_tutorial.py || { echo "error in sed command"; exit 1; } # Run tutorial -xvfb-run python python-sdk/tutorials/nuscenes_basics_tutorial.py +xvfb-run python python-sdk/tutorials/nuscenes_tutorial.py +# xvfb-run python python-sdk/tutorials/nuimages_tutorial.py # skip until PR-440 merged xvfb-run python python-sdk/tutorials/can_bus_tutorial.py xvfb-run python python-sdk/tutorials/map_expansion_tutorial.py xvfb-run python python-sdk/tutorials/prediction_tutorial.py