A3C LSTM GA for language grounding #7

Open
wants to merge 98 commits into master

Changes from 95 commits

Commits (98)
4554cd6
Added bounding_boxes_and_test_reward.py which allows experimentation …
beduffy Sep 29, 2018
afd03a3
Added microwave or mug navigation task to wrapper
beduffy Sep 30, 2018
0c84638
Fixed bounding boxes to use instance_detections2D attribute instead o…
beduffy Sep 30, 2018
12a8e9f
Fixed "class and object info not showing in metadata on reset of env"
beduffy Sep 30, 2018
8e16277
Started monitoring episode lengths (3rd figure) and time taken for mi…
beduffy Sep 30, 2018
81c7069
Added dense reward option for microwave task
beduffy Sep 30, 2018
66d5915
Added Devendra Chaplot's DeepRL grounding A3C model
beduffy Oct 6, 2018
072e754
Added natural language option to environment with 1 task
beduffy Oct 6, 2018
f84b440
Added many changes! added todos, added training of A3C_GA version
beduffy Oct 6, 2018
1d9b78c
Put all plots into utils common function and used for both models.
beduffy Oct 7, 2018
b86e5a1
Added checkpointing and resuming in experiment folder
beduffy Oct 7, 2018
a2ce6e8
Added script to read latest 3 plots saved in png
beduffy Oct 7, 2018
94d193c
Added entity features into state information. 27 features. Created ne…
beduffy Oct 7, 2018
75f64d7
Added two sentence tasks (turn left or turn right 3 times)
beduffy Oct 11, 2018
1a5bd49
Removed reward clamp Saved action probabilities plot every episode
beduffy Oct 16, 2018
90bd31b
Added go and look and microwave and cup tasks and test case to check
beduffy Oct 16, 2018
9a02df5
Fixed massive bug of not loading new sentence and much more
beduffy Oct 19, 2018
828d950
Added single word instruction sentences and fixed a bug with current …
beduffy Oct 19, 2018
fb1ddcc
Added check for empty arrays for plotting newer graphs
beduffy Oct 21, 2018
b212284
Tried LR 0.0003, Added experiments text file, saved episode number
beduffy Oct 21, 2018
a5cebf2
Added checkpoint and first experiment folder. Fixed latest plot bug.
beduffy Oct 24, 2018
9977a2a
Added multi-threading, save total length/number of episodes, counter,…
beduffy Oct 28, 2018
7b8b87f
Fixed checkpoint_counter for new experiment, some todos and comments
beduffy Nov 1, 2018
cb25429
Added .gitignore for experiments
beduffy Nov 1, 2018
a43ea64
Add Rainbow DQN code
fernandotorch Nov 4, 2018
324a0f0
Some comments, style and prognostications
beduffy Nov 6, 2018
1e64989
Merge branch 'A3C_barebones' into Rainbow-DQN
fernandotorch Nov 17, 2018
44da993
Fix channel dimension from state
fernandotorch Nov 17, 2018
b419713
Merge branch 'A3C_LSTM_GA_two_sentence' into A3C_LSTM_GA_barebones
beduffy Nov 17, 2018
3acecaf
Added A3C GA model to model.py and removed 3 files I won't need (good…
beduffy Nov 17, 2018
3e52ba7
Import A3C_LSTM, added natural language param, some copying into trai…
beduffy Jan 29, 2019
1dd0a72
Added cameraY, gridSize, incremental_rotation_mode with 10 degree rot…
beduffy Jan 29, 2019
89bc411
Added new task example for agent on the ground with cups. Added camer…
beduffy Jan 30, 2019
b751b27
Added continuous mode to initialise of ai2thor which finally allowed …
beduffy Jan 31, 2019
1bb10e9
Added argparse to continuous example for build path, default false in…
beduffy Feb 2, 2019
11fa2c4
NLP instructions working. Added natural_language if/else statements, …
beduffy Feb 2, 2019
23bff3a
Merge branch 'master' of https://github.com/TheMTank/ai2thor-experime…
beduffy Feb 3, 2019
e7e2d30
Renamed to Bowl and Mug, passed curr_object_type in event to task, ex…
beduffy Feb 3, 2019
9feb73f
Added uuid eid args and old checkpointing, todos, references to old c…
beduffy Feb 3, 2019
318e54f
save_checkpoint function with os paths (called in args num_steps), cl…
beduffy Feb 3, 2019
6889f30
Deleted 8 old files. Old wrapper finally gone (entity feats within if…
beduffy Feb 3, 2019
c133832
Added option to disable lookupdown actions, added option to do num ra…
beduffy Feb 3, 2019
14379ad
Added NaturalLanguageLookAtObjectTask, NaturalLanguageNavigateToObjec…
beduffy Feb 5, 2019
011332e
Added a new large test case for naturalLanguageLookAtTask which tests…
beduffy Feb 6, 2019
2d13a8f
Added two examples, inbuilt_interactive_mode.py with argparse for uni…
beduffy Feb 6, 2019
97fbe3a
Removed natural_language_instructions from config since it's decided …
beduffy Feb 6, 2019
6d2eb98
Moved show_bounding_boxes() to task_utils.py with extra features, rai…
beduffy Feb 6, 2019
90932b0
Fixed merge conflicts from merging cozmo_env. ai2thor_env.py
beduffy Feb 6, 2019
97743c7
Made sure grayscale on/off worked, added args.task_name, removed all …
beduffy Feb 6, 2019
655d9b1
Added writer to train and saved scalars into tensorboard, added cozmo…
beduffy Feb 8, 2019
ee3a9eb
Fixed max_episode_length bug and passed kwargs correctly all the way …
beduffy Feb 10, 2019
83ec524
Added NaturalLanguagePickUpObjectTask, Added movement_reward, increme…
beduffy Feb 10, 2019
d033b34
Added 3 extra config files (for specific task variations) and renamed…
beduffy Feb 10, 2019
2138ac9
Fixed test.py for both A3C+A3C_GA, added negative reward for picking …
beduffy Feb 10, 2019
a51106b
Fixed reset bug by setting step_num to 0, changed to build_file_name …
beduffy Feb 10, 2019
a6eeeaf
Added calculate_lstm_input_size_for_A3C_LSTM_GA with hardcoded factor…
beduffy Feb 11, 2019
5507373
Changed task random walk example to take build_file_name and auto-cal…
beduffy Feb 11, 2019
cea21a3
Fixed off by one episode_length time embedding by incrementing after …
beduffy Feb 12, 2019
2ffe6f3
Added test case for NaturalLanguagePickUpObjectTask, removed 3 ai2tho…
beduffy Feb 13, 2019
d12e6c5
Renamed incremental_rotation to continuous_movement, added a few comm…
beduffy Feb 16, 2019
0e167f2
Added todos, printed elapsed training time and time for 20 steps only…
beduffy Feb 17, 2019
eca78bf
Added num-random-actions-at-init to argparse, max_episode_length back…
beduffy Feb 18, 2019
9e4ba68
Added verbose-num-steps arg, printed tensorboard command, shutil impo…
beduffy Feb 19, 2019
132df1b
Cleaned comments, removed some todos, changed to floorplan2 in rotate…
beduffy Feb 20, 2019
de96042
Cleaned+added comments and removed some todos
beduffy Feb 21, 2019
6edc1e3
Fixed multi-word sentence bug, rotate_only will have multi-word sente…
beduffy Feb 22, 2019
97eada7
Removed todos regarding predicted value spike (didn't find reason), r…
beduffy Feb 23, 2019
64e57e1
Fix merge conflicts from merging cozmo_env into A3C_GA. continuous_mo…
beduffy Feb 23, 2019
b29c93e
Fixed test() function, renamed file, added many comments, fixed NL_pi…
beduffy Feb 23, 2019
7668186
Renamed avg_rewards to avg_episode_returns, added euclidean distance …
beduffy Feb 23, 2019
ce4dd9a
Added 4 vizdoom_data files, 1 vizdoom_maps room.wad file, 3 python vi…
beduffy Feb 24, 2019
8a8b1a3
Renamed envs to env_atari.py and took out vizdoom part into env_vizdo…
beduffy Feb 24, 2019
ea83ca5
Got VizDoom working in main.py and model.py. Added ViZDoom argparams,…
beduffy Feb 24, 2019
09e052d
Added random scenes on reset option within task, build_bowls_vs_cups_…
beduffy Feb 24, 2019
7bde258
Added a3c/main.py variants to README.md with more info on natural lan…
beduffy Feb 24, 2019
7796fd7
Got train and test working in vizdoom, added beautiful unpack_state()…
beduffy Feb 26, 2019
936027f
Got VizDoom running from any folder by converting relative paths to a…
beduffy Feb 26, 2019
1a1548a
Fixed glob for latest checkpoint, fixed ai2thor inbuilt interactive b…
beduffy Feb 26, 2019
2b83816
Added complicated logic for args.resume-latest-config (default is no …
beduffy Feb 27, 2019
cdf2df9
Removed args.task-name, removed Exception as e, tensorboard creation …
beduffy Feb 28, 2019
d1a89c0
Added NaturalLanguagePickUpMultipleObjectTask with no terminal with c…
beduffy Feb 28, 2019
5569f3f
Added num_backprops to checkpoint and reload, checked for "and not" p…
beduffy Mar 2, 2019
b8d85a6
Added many comments to A3C train code to aid understanding, reshuffle…
beduffy Mar 3, 2019
5456517
Added warning for empty acceptable_receptacles config param and fixed…
beduffy Mar 3, 2019
8e4320d
Fixed self.controller.start() indent bug, added missing cv2 import to…
beduffy Mar 4, 2019
87713bf
Merge branch 'master' into A3C_LSTM_GA_barebones with cups-rl name ch…
beduffy Mar 7, 2019
9c7b08f
A3C calculate_lstm_input_size_for_A3C() doesn't need square images b…
beduffy Mar 7, 2019
e2cadb1
Made README.md much better (split into subsections and gave contructo…
beduffy Mar 8, 2019
2001282
Added EPIC 3 figures of ASCII art for A3C_LSTM_GA architecture and ot…
beduffy Mar 9, 2019
5dc2fd5
Added ActorCritic ASCII chart, cleaned up comments in model.py and ma…
beduffy Mar 9, 2019
396ff42
Added 13 todos to main.py, fixed big bug on storing all_rewards_in_ep…
beduffy Mar 10, 2019
06f5478
Put input_size functions as static methods for A3C and A3C_LSTM_GA, A…
beduffy Mar 14, 2019
639d9c7
Added show_instance_segmentation (binary mask matplotlib image on spe…
beduffy Mar 17, 2019
dd338a1
Added _process_frame42 as staticmethod in env_atari.py. In env_vizdoo…
beduffy Mar 17, 2019
e3da7c9
Removed seed init param from ai2thor_env, added lastObjectClosed, las…
beduffy Mar 20, 2019
af95bab
Fixed if indent on open and close objects, simplified logic into bool…
beduffy Mar 23, 2019
922d691
Renamed to has_language_instructions, todos, notes
beduffy Apr 5, 2019
70aaf76
Added more todos, lots of unfinished stuff (pushing for remote work p…
beduffy Apr 6, 2019
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,4 +1,4 @@

\.idea/

*.pyc
/experiments/*
48 changes: 35 additions & 13 deletions README.md
@@ -18,24 +18,30 @@ More detailed information on ai2thor environment can be found on their

<div align="center">
<img src="docs/bowls_fp_404_compressed_gif.gif" width="294px" />
<p>A3C agent learning during training on NaturalLanguagePickUpMultipleObjectTask in one of our customized scenes and tasks with the target object being CUPS!</p>
<p>A3C agent training on NaturalLanguagePickUpMultipleObjectTask in one of our customized scenes and tasks with the target object being CUPS!</p>
</div>

## Overview
## Running algorithms on ai2thor

This project will include implementations and adaptations of the following papers as a benchmark of
the current state of the art approaches to the problem:

- [Ikostrikov's A3C](https://github.com/ikostrikov/pytorch-a3c)
- [A3C](https://arxiv.org/abs/1602.01783) [Code from Ikostrikov](https://github.com/ikostrikov/pytorch-a3c)
- [Gated-Attention Architectures for Task-Oriented Language Grounding](https://arxiv.org/abs/1706.07230)
-- *Original code available on [DeepRL-Grounding](https://github.com/devendrachaplot/DeepRL-Grounding)*
also based on Ikostrikov's A3C
-- A3C with gated attention (A3C_LSTM_GA) *Original code available on [DeepRL-Grounding](https://github.com/devendrachaplot/DeepRL-Grounding)*
which is also based on Ikostrikov's A3C.

Implementations of these can be found in the algorithms folder and a3c can be run on AI2ThorEnv with:
`python algorithms/a3c/main.py`
- `python algorithms/a3c/main.py`
- To run the A3C_LSTM_GA model with a config file set to the BowlsVsCups variant of the NaturalLanguagePickUpObjectTask in tasks.py:
`python algorithms/a3c/main.py --config-file-name NL_pickup_bowls_vs_cups_fp1_config.json --verbose-num-steps True --num-random-actions-at-init 4`
- To run [ViZDoom](https://github.com/mwydmuch/ViZDoom) (you will need to install ViZDoom) synchronously with 1 process:
`python algorithms/a3c/main.py --verbose-num-steps True --sync --vizdoom -v 1`
- To run Atari with 8 processes:
`python algorithms/a3c/main.py --atari --num-processes 8`

Check the argparse help for more details and variations of running the algorithm with different
hyperparams and on the atari environment as well.
For A3C's `-eid` param you can specify an experiment name, which creates folders for checkpointing and hyperparameters; otherwise the experiment name defaults to the current date concatenated with a random GUID. Check the argparse help for more details and variations of running the algorithm with different
hyperparams.

## Installation

@@ -74,9 +80,11 @@ for episode in range(N_EPISODES):

### Environment and Task configurations

##### JSON config files and config_dict

The environment is typically defined by a JSON configuration file located on the `gym_ai2thor/config_files`
folder. You can find an example `config_example.json` to see how to customize it. Here there is one
as well:
folder. You can find a full example at `default_config.json` to see how to customize it. Here is
another example:

```
# gym_ai2thor/config_files/myconfig.json
@@ -86,6 +94,8 @@
'acceptable_receptacles': ['CounterTop', 'TableTop', 'Sink'],
'openable_objects': ['Microwave'],
'scene_id': 'FloorPlan28',
'gridSize': 0.1,
'continuous_movement': true,
'grayscale': True,
'resolution': (300, 300),
'task': {'task_name': 'PickUp',
@@ -95,7 +105,11 @@
For experimentation it is important to be able to make slight modifications of the environment
without having to create a new config file each time. The class `AI2ThorEnv` includes the keyword
argument `config_dict`, which allows you to pass a Python dictionary **in addition to** the config file
that overrides the parameters described in the config.
that overrides the parameters described in the config. In summary, the full interface to the constructor is:

`env = AI2ThorEnv(config_file=config_file_name, config_dict=config_dict)`
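
As a rough sketch of what that looks like in practice (the import path and the gym-style `reset`/`step` loop are assumptions here; the `config_file`/`config_dict` keywords, the config file name and the parameter names come from the examples above):

```
# Hypothetical usage: keys in config_dict override values loaded from the JSON config file.
from gym_ai2thor.envs.ai2thor_env import AI2ThorEnv  # assumed module path

config_dict = {'grayscale': False, 'resolution': (128, 128)}  # override just these two keys
env = AI2ThorEnv(config_file='gym_ai2thor/config_files/NL_pickup_bowls_vs_cups_fp1_config.json',
                 config_dict=config_dict)

state = env.reset()
for _ in range(10):
    state, reward, done, _ = env.step(env.action_space.sample())
    if done:
        state = env.reset()
env.close()
```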

##### Tasks and TaskFactory

The tasks are defined in `envs/tasks.py` and allow for particular configurations regarding the
rewards given and termination conditions for an episode. You can use the tasks that we defined
@@ -128,11 +142,19 @@ class MoveAheadTask(BaseTask):

    def reset(self):
        self.step_num = 0
```
```

Some tasks allow you to return extra state by filling in the `get_extra_state()` function (e.g. for returning a natural language instruction within the state). Again, check
tasks.py for more details.
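
As a rough, hypothetical sketch of such a task (the `transition_reward(state)` hook, `self.max_episode_length` and the ai2thor-style `state.metadata` are assumptions here; check `envs/tasks.py` for the real base-class interface):

```
class NaturalLanguageLookAtMugTask(BaseTask):
    """Hypothetical example task: reward the agent once a Mug is visible."""
    def __init__(self, **kwargs):
        super(NaturalLanguageLookAtMugTask, self).__init__(**kwargs)
        self.instruction = 'Look at the mug'
        self.step_num = 0

    def transition_reward(self, state):
        # Assumed hook: ai2thor metadata lists objects with a visibility flag
        mug_visible = any(obj['objectType'] == 'Mug' and obj['visible']
                          for obj in state.metadata['objects'])
        reward = 1.0 if mug_visible else -0.01  # goal reward plus a small step penalty
        done = mug_visible or self.step_num >= self.max_episode_length
        return reward, done

    def get_extra_state(self):
        # Returned alongside the image observation, e.g. as input to A3C_LSTM_GA
        return self.instruction

    def reset(self):
        self.step_num = 0
```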

##### Examples and Task variants

We encourage you to explore the scripts in the `examples` folder to guide you through the wrapper's
functionality and to see how to create more customized versions of ai2thor environments and
tasks.

Most importantly, config files and tasks can be combined to form **Task variants**, e.g. a NaturalLanguagePickUpObjectTask that only allows
cups and bowls to be picked up: `gym_ai2thor/config_files/NL_pickup_bowls_vs_cups_fp1_config.json`

Here is the desired result of an example task in which the goal of the agent is to place a cup in the
sink.
@@ -145,7 +167,7 @@

## The Team

[The M Tank](http://www.themtank.org/) is a non-partisan organisation that works solely to recognise the multifaceted
[MTank](http://www.themtank.org/) is a non-partisan organisation that works solely to recognise the multifaceted
Member commented: Not an issue for you here but in general. We should change the message from MTank to something more related to what we do now. This seems very outdated.

Contributor Author commented: yep, i can do it here if we think of something

Member commented: opened issue in slack

nature of Artificial Intelligence research and to highlight key developments within all sectors affected by these
advancements. Through the creation of unique resources, the combination of ideas and their provision to the public,
this project hopes to encourage the dialogue which is beginning to take place globally.
40 changes: 23 additions & 17 deletions algorithms/a3c/envs.py → algorithms/a3c/env_atari.py
@@ -4,11 +4,17 @@
This contains auxiliary wrappers for the atari openAI gym environment e.g. proper resizing of the
input frame and a running average normalisation of said frame after resizing
"""

from __future__ import print_function

import cv2
import gym
import numpy as np
from gym.spaces.box import Box
from gym import spaces

# -----------------
# Atari preprocessing and wrappers below
# -----------------

# Taken from https://github.com/openai/universe-starter-agent
def create_atari_env(env_id):
@@ -18,27 +24,27 @@ def create_atari_env(env_id):
    return env


def _process_frame42(frame):
    frame = frame[34:34 + 160, :160]
    # Resize by half, then down to 42x42 (essentially mipmapping). If
    # we resize directly we lose pixels that, when mapped to 42x42,
    # aren't close enough to the pixel boundary.
    frame = cv2.resize(frame, (80, 80))
    frame = cv2.resize(frame, (42, 42))
    frame = frame.mean(2, keepdims=True)
    frame = frame.astype(np.float32)
    frame *= (1.0 / 255.0)
    frame = np.moveaxis(frame, -1, 0)
    return frame


class AtariRescale42x42(gym.ObservationWrapper):
    def __init__(self, env=None):
        super(AtariRescale42x42, self).__init__(env)
        self.observation_space = Box(0.0, 1.0, [1, 42, 42])
        self.observation_space = spaces.Box(0.0, 1.0, [1, 42, 42])

    @staticmethod
    def _process_frame42(frame):
        frame = frame[34:34 + 160, :160]
        # Resize by half, then down to 42x42 (essentially mipmapping). If
        # we resize directly we lose pixels that, when mapped to 42x42,
        # aren't close enough to the pixel boundary.
        frame = cv2.resize(frame, (80, 80))
        frame = cv2.resize(frame, (42, 42))
        frame = frame.mean(2, keepdims=True)
        frame = frame.astype(np.float32)
        frame *= (1.0 / 255.0)
        frame = np.moveaxis(frame, -1, 0)
        return frame

    def _observation(self, observation):
        return _process_frame42(observation)
        return self._process_frame42(observation)


class NormalizedEnv(gym.ObservationWrapper):
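
For reference, a quick hypothetical smoke test of the wrappers above (this assumes `gym` with an Atari ROM installed, that the module is importable from the repository root, and that `create_atari_env` chains the rescaling and normalisation wrappers around the raw environment):

```
from algorithms.a3c.env_atari import create_atari_env  # assumed import path

env = create_atari_env('PongDeterministic-v4')
obs = env.reset()
print(obs.shape)  # expected (1, 42, 42): grayscale, channel-first
print(obs.dtype)  # float32, scaled by 1/255 and then running-average normalised
env.close()
```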