Official code for ICCV 2023 paper: GAIT: Generating Aesthetic Indoor Tours with Deep Reinforcement Learning
Since we install `habitat-sim` by building from source, the `environment.yml` file we provide cannot be used directly, but it can serve as a reference.
- Start by creating a conda environment with Python 3.7, as required by `habitat-sim`:

  ```
  $ conda create -n habitat python=3.7 cmake=3.14.0
  ```
- Then, install PyTorch. If `habitat-sim` is installed before PyTorch, PyTorch will be installed as CPU-only.

  ```
  $ conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
  ```
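  After installing, you can verify that the build is CUDA-enabled with a minimal check (it only assumes the `habitat` env is active):

  ```python
  # Verify that the installed PyTorch build can see the GPU.
  # Prints e.g. "1.12.1 True"; "False" means a CPU-only build was installed.
  import torch

  print(torch.__version__, torch.cuda.is_available())
  ```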
- Clone the `habitat-sim` GitHub repository:

  ```
  $ git clone https://github.com/facebookresearch/habitat-sim.git
  ```
- Edit the `habitat-sim` source code. By default, `habitat-sim` has "gravity" enabled, so if we move the agent upwards, it falls back to the ground by itself. We change the source code to disable this (a sketch of the resulting edits follows these steps).

  ```
  $ cd habitat-sim/src_python/habitat_sim
  ```

  - Edit `simulator.py`. Search for `step(`; there are three overloaded methods called `step()`. For each one, add a parameter `apply_filter=True`. In the third `step()` definition, go to the line calling `agent.act()` and append `apply_filter=apply_filter` as an argument.
  - Edit `agent/agent.py`. Search for `act(` and append `apply_filter=True` as a parameter. In the `act()` method, there are two calls to `self.controls.action()`; append `apply_filter=apply_filter` to each of them.
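  For orientation, a minimal sketch of what the edited code might look like (method bodies elided; exact signatures vary across `habitat-sim` versions, so treat this as a guide, not a patch):

  ```python
  # simulator.py (sketch): each overloaded step() gains an apply_filter
  # parameter; the third overload forwards it to agent.act().
  def step(self, action, dt: float = 1.0 / 60.0, apply_filter: bool = True):
      ...
      collided = agent.act(action, apply_filter=apply_filter)
      ...

  # agent/agent.py (sketch): act() gains apply_filter and forwards it to
  # both self.controls.action(...) calls.
  def act(self, action_id, apply_filter: bool = True):
      ...
      self.controls.action(
          self.scene_node,
          action_spec.name,
          action_spec.actuation,
          apply_filter=apply_filter,
      )
      ...
  ```

  Passing `apply_filter=False` skips the move filter that would otherwise snap the agent back onto the navmesh, which is what lets upward motion persist.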
- Install `habitat-sim` from source. `habitat-sim` is available as a conda package, but we ran into some issues that were solved by installing from source. Follow the build-from-source guide in the `habitat-sim` GitHub repository.

  - If you have `nvcc`:

    ```
    $ export CUDACXX=/usr/local/cuda/bin/nvcc
    ```

    Otherwise, install `nvcc` via conda and then export the path to it:

    ```
    $ sudo apt install libxml2
    $ conda install -c conda-forge cudatoolkit-dev
    $ export CUDACXX=/home/username/anaconda3/envs/habitat/pkgs/cuda-toolkit/bin/nvcc
    $ sudo apt install build-essential
    ```
  - For `$ python setup.py install`: use the `--with-cuda` flag, and include the `--headless` flag as needed (e.g. `$ python setup.py install --with-cuda --headless` on a server without a display).
- While you are building `habitat-sim` from source, take the time to download the Replica Dataset:

  ```
  $ git clone https://github.com/facebookresearch/Replica-Dataset.git
  $ sudo apt install pigz
  ```

  - Make sure `Replica-Dataset` and `gait` are in the same folder, then:

    ```
    $ cd Replica-Dataset
    $ ./download.sh ../gait/habitat_data/Replica
    $ wget http://dl.fbaipublicfiles.com/habitat/sorted_faces.zip
    $ unzip sorted_faces.zip
    $ ./sorted_faces/copy_to_folders ~/aestheticview/habitat_data/Replica/
    ```
- Install the `conda` dependencies listed in `environment.yml`.
- Install `pip` dependencies:

  ```
  $ pip install gym tensorboard dm-env termcolor cma
  $ pip install -U ray
  $ pip install hydra-core --upgrade
  $ pip install hydra-submitit-launcher --upgrade
  ```
- Download the pretrained weights of the aesthetics model (View Evaluation Net) from https://github.com/zijunwei/ViewEvaluationNet, and put `EvaluationNet.pth.tar` in `gait/snapshots/params`.
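  To confirm the download is intact, a minimal check run from the folder containing `gait` (the path follows the step above; nothing is assumed about the checkpoint's internal layout):

  ```python
  # Verify the View Evaluation Net checkpoint deserializes cleanly.
  import torch

  ckpt = torch.load("gait/snapshots/params/EvaluationNet.pth.tar", map_location="cpu")
  print(type(ckpt))  # a dict-like checkpoint object if the file is intact
  ```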
All hyperparameters are set to their default values.

- `cfgs/config.yaml` contains the hyperparameters for the GAIT environment and for DrQ-v2. Specifically, `ray` enables or disables multi-GPU training; `diversity`, `smoothness`, `constant_noise`, and `no_aug` enable or disable the corresponding ablation settings, as their names suggest.
- `cfgs/task/single_train.yaml` specifies the algorithm (DrQ-v2 or CURL) and the scene.
- `cfgs/task/medium.yaml` specifies the data steps and the linearly decayed noise schedule for DrQ-v2.
- `cfgs/task/cma-es.yaml` specifies CMA-ES-specific hyperparameters.
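Because the configs are managed by hydra, you can inspect the fully resolved configuration without launching a run. A minimal sketch, assuming a recent `hydra-core` (the flag names follow the description above; their exact position in the config tree may differ):

```python
# Compose and print the resolved hydra config rooted at cfgs/config.yaml.
from hydra import compose, initialize
from omegaconf import OmegaConf

with initialize(config_path="cfgs", version_base=None):
    cfg = compose(config_name="config")
    print(OmegaConf.to_yaml(cfg))  # shows ray, diversity, smoothness, ...
```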
To train GAIT-DrQ-v2:

```
$ bash drqtrain.sh
```

To train GAIT-CURL:

```
$ bash curltrain.sh
```

To run CMA-ES:

```
$ bash cmatrain.sh
```
Use the corresponding watch script to watch the training output, e.g. `$ bash drqwatch.sh`.
We provide one pre-trained checkpoint, corresponding to DrQ-v2 with the default settings on `room0`, located in `logs/drqv2_habitat/66 16 multi 1m decay 3m train room_0_`. It contains only the actor network and the training configurations.
To generate the figures, sequence frames, and corresponding interpolated videos from the paper with the pre-trained weights, first edit `final_evaluation.py` to choose which figure, then:

```
$ python final_evaluation.py
```
Note that for training and evaluation, `hydra` will change the working directory to `gait/logs/${algo}/${now:%Y.%m.%d}_${now:%H%M%S}`.
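This matters when resolving relative paths: files opened with relative paths at runtime resolve under the per-run log directory, not the directory you launched from. If you add your own scripts, hydra's standard API can recover the launch directory (a general hydra feature, not repo-specific code):

```python
# Inside a @hydra.main-decorated function, recover the original launch dir.
from hydra.utils import get_original_cwd

launch_dir = get_original_cwd()  # directory the run was started from
```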
- `rlcam_drqv2_mql.py` and `curl_train.py` contain the main training code.
- `drqv2_net.py` contains the PyTorch network architecture definitions for the Actor and the Critic networks.
- `habitat_test.py` contains our wrappers based on `habitat-sim`, which handle the camera pose, compute the reward, etc. (Sections 3.1 and 3.2 of the paper).
- `drqv2/` and `curl/` contain the original code from DrQ-v2 and CURL, respectively; our network architectures defined in `drqv2_net.py` are based on `drqv2/drqv2.py`.
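The pip dependencies include `dm-env`, the environment interface the original DrQ-v2 codebase consumes. Purely as a shape illustration, and not the actual `habitat_test.py` code, a minimal `dm_env`-style environment tracking a camera pose with a stand-in reward looks like this:

```python
# Hypothetical dm_env-style wrapper skeleton (NOT the repo's habitat_test.py):
# step() moves a camera pose and returns a stand-in scalar reward.
import dm_env
import numpy as np
from dm_env import specs


class ToyCameraEnv(dm_env.Environment):
    def reset(self) -> dm_env.TimeStep:
        self._pose = np.zeros(3, dtype=np.float32)  # camera position (x, y, z)
        return dm_env.restart(self._pose.copy())

    def step(self, action) -> dm_env.TimeStep:
        self._pose += np.asarray(action, dtype=np.float32)  # move the camera
        reward = float(-np.linalg.norm(self._pose))  # placeholder, not the aesthetic score
        return dm_env.transition(reward=reward, observation=self._pose.copy())

    def observation_spec(self):
        return specs.Array(shape=(3,), dtype=np.float32, name="pose")

    def action_spec(self):
        return specs.BoundedArray(shape=(3,), dtype=np.float32,
                                  minimum=-1.0, maximum=1.0, name="move")
```

In the actual wrapper, the observation is a frame rendered by `habitat-sim` and the reward is derived from the aesthetics model, per Sections 3.1 and 3.2 of the paper.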