Commit

amitkparekh committed Dec 4, 2023
1 parent 8026b4c commit 067c422
Showing 4 changed files with 24 additions and 193 deletions.
72 changes: 9 additions & 63 deletions README.md
@@ -40,89 +40,35 @@

The object detector and feature extractor used for the EMMA project.

A clean and scalable [PyTorch Lightning and Hydra Template](https://github.com/ashleve/lightning-hydra-template) for deep learning research, with the simplicity of [Hypermodern Python](https://github.com/cjolowicz/cookiecutter-hypermodern-python) tooling.

Assuming you have [pyenv](https://github.com/pyenv/pyenv) and [Poetry](https://python-poetry.org/), clone the repository and run:

```bash
# Use Python 3.9.9 in the project
pyenv local 3.9.9

# Tell Poetry to use pyenv
poetry env use $(pyenv which python)

# Install dependencies
poetry install

# Activate the virtual environment
poetry shell

# Install pre-commit hooks
pre-commit install
```

## Installing things

We've tried to keep the necessary setup as simple as possible. However, a few things still need to be installed.
## Writing code and running things

### Poetry
### Run the server for the [Alexa Arena](https://github.com/amazon-science/alexa-arena)

This project uses Poetry for **creating virtual environments** and **managing Python packages**. This should be installed globally and can be done by running:
Run as is, this command automatically downloads the [fine-tuned checkpoint from our HF models repo](https://huggingface.co/emma-heriot-watt/models/blob/main/vinvl_finetune_arena.ckpt) and applies the same settings we used for our experiments within the [Alexa Arena](https://github.com/amazon-science/alexa-arena).

```bash
curl -sSL https://install.python-poetry.org | python3 -
python src/emma_perception/commands/run_server.py
```

You can verify it's installed and accessible by running `poetry --version`.
### Extracting features


Once you've got Poetry installed, we think it's best to install Python dependencies into a `.venv/` folder within the cloned repo. Tell Poetry to handle this for you:
#### For the pretrained datasets

```bash
poetry config virtualenvs.in-project true
```

For more on how to manage, add, remove, and update dependencies, see the [official Poetry documentation](https://python-poetry.org/docs/basic-usage/).

### Managing Python versions...

There are two ways of managing your Python environments. We recommend [pyenv](https://github.com/pyenv/pyenv), but we have also included instructions for [Anaconda](https://anaconda.com).
#### For the Alexa Arena

#### ...with pyenv

Install pyenv following the [instructions within the official repo](https://github.com/pyenv/pyenv#installation) for your system. **Remember to do step 2**!

You can verify it's installed with `pyenv --version`.

1. Install the Python version you want with `pyenv install 3.9.9`
2. Go to the cloned repo
3. Assign the specific Python version to the project by running `pyenv local 3.9.9`

If you want a different version of Python, just change the version in the steps.

#### ...with Anaconda

Install Anaconda using the [instructions on the official website](https://anaconda.com/).

Then create an environment for your project by running:

```bash
conda create -n PROJECT_NAME python=3.9
conda activate PROJECT_NAME
```

## Writing code and running things

### Project structure

This is organised very similarly to the structure from the [Lightning-Hydra-Template](https://github.com/ashleve/lightning-hydra-template#project-structure) to facilitate reproducible research code.

- `scripts` — `sh` scripts to run experiments
- `configs` — configuration files using the [Hydra framework](https://hydra.cc/)
- `notebooks` — Jupyter notebook for analysis and exploration
- `storage` — data for training/inference _(and maybe use symlinks to point to other parts of the filesystem)_
- `src` — where the main code lives

## Developer tooling
### Developer tooling

- Dependency management with [Poetry](https://python-poetry.org/)
- Easier task running with [Poe the Poet](https://github.com/nat-n/poethepoet)
16 changes: 13 additions & 3 deletions src/emma_perception/commands/download_checkpoints.py
@@ -5,13 +5,23 @@


 HF_REPO_ID = "emma-heriot-watt/models"
-CHECKPOINT_NAME = "vinvl_finetune_arena.ckpt"
+VINVL_CHECKPOINT_NAME = "vinvl_pretrained.ckpt"
+ARENA_CHECKPOINT_NAME = "vinvl_finetune_arena.ckpt"


+def download_arena_checkpoint(
+    *, hf_repo_id: str = HF_REPO_ID, file_name: str = ARENA_CHECKPOINT_NAME
+) -> Path:
+    """Download the fine-tuned checkpoint on the Alexa Arena."""
+    file_path = download_file(repo_id=hf_repo_id, repo_type="model", filename=file_name)
+    logger.info(f"Downloaded {file_name}")
+    return file_path
+
+
 def download_vinvl_checkpoint(
-    *, hf_repo_id: str = HF_REPO_ID, file_name: str = CHECKPOINT_NAME
+    *, hf_repo_id: str = HF_REPO_ID, file_name: str = ARENA_CHECKPOINT_NAME
 ) -> Path:
-    """Download the checkpoint from VinVL and put it where we expect it."""
+    """Download the pre-trained VinVL checkpoint."""
     file_path = download_file(repo_id=hf_repo_id, repo_type="model", filename=file_name)
     logger.info(f"Downloaded {file_name}")
     return file_path
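
The two download helpers differ only in their default file name. In the diff, `download_vinvl_checkpoint` defaults to `ARENA_CHECKPOINT_NAME`, which does not match its docstring. The sketch below assumes `VINVL_CHECKPOINT_NAME` was the intended default; that is an assumption, not something the commit confirms. `fake_download_file` and the `download_file` keyword parameter are hypothetical stand-ins for the project's own `download_file` helper, so the example runs without network access.

```python
from pathlib import Path
from typing import Callable

HF_REPO_ID = "emma-heriot-watt/models"
VINVL_CHECKPOINT_NAME = "vinvl_pretrained.ckpt"
ARENA_CHECKPOINT_NAME = "vinvl_finetune_arena.ckpt"


def fake_download_file(*, repo_id: str, repo_type: str, filename: str) -> Path:
    """Hypothetical stand-in for the project's `download_file`; builds a path only."""
    return Path("cache") / repo_type / repo_id / filename


def download_arena_checkpoint(
    *,
    hf_repo_id: str = HF_REPO_ID,
    file_name: str = ARENA_CHECKPOINT_NAME,
    download_file: Callable[..., Path] = fake_download_file,
) -> Path:
    """Fetch the checkpoint fine-tuned on the Alexa Arena."""
    return download_file(repo_id=hf_repo_id, repo_type="model", filename=file_name)


def download_vinvl_checkpoint(
    *,
    hf_repo_id: str = HF_REPO_ID,
    # Assumption: the pre-trained helper should default to the pre-trained file.
    file_name: str = VINVL_CHECKPOINT_NAME,
    download_file: Callable[..., Path] = fake_download_file,
) -> Path:
    """Fetch the pre-trained VinVL checkpoint."""
    return download_file(repo_id=hf_repo_id, repo_type="model", filename=file_name)


if __name__ == "__main__":
    print(download_arena_checkpoint().name)
    print(download_vinvl_checkpoint().name)
```

With separate defaults, a caller that wants the pre-trained weights does not need to pass `file_name` explicitly.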
125 changes: 0 additions & 125 deletions src/emma_perception/commands/extract_visual_features_forced.py

This file was deleted.

4 changes: 2 additions & 2 deletions src/emma_perception/commands/run_server.py
@@ -12,7 +12,7 @@
 from scene_graph_benchmark.config import sg_cfg

 from emma_perception.api import ApiSettings, ApiStore, extract_features_for_batch, parse_api_args
-from emma_perception.commands.download_checkpoints import download_vinvl_checkpoint
+from emma_perception.commands.download_checkpoints import download_arena_checkpoint
 from emma_perception.constants import (
     SIMBOT_ENTITY_MLPCLASSIFIER_CLASSMAP_PATH,
     SIMBOT_ENTITY_MLPCLASSIFIER_PATH,
@@ -38,7 +38,7 @@ async def startup_event() -> None:

     model_path = Path(cfg.MODEL.WEIGHT)
     if not model_path.exists():
-        cfg.MODEL.WEIGHT = download_vinvl_checkpoint().as_posix()
+        cfg.MODEL.WEIGHT = download_arena_checkpoint().as_posix()

     if torch.cuda.is_available() and settings.device_id != -1:
         num_gpus = torch.cuda.device_count()
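
The startup hunk falls back to downloading a checkpoint only when the configured weights path does not exist. A minimal sketch of that pattern in isolation follows. `resolve_model_weights` and `fake_download` are hypothetical names introduced for illustration, and the real server writes the result into `cfg.MODEL.WEIGHT` rather than returning it.

```python
from pathlib import Path
from typing import Callable


def resolve_model_weights(configured_weight: str, download: Callable[[], Path]) -> str:
    """Return a usable weights path, downloading only when the configured one is missing."""
    model_path = Path(configured_weight)
    if model_path.exists():
        # The configured file is already present, so no download is needed.
        return model_path.as_posix()
    # Same fallback as in the diff: download, then keep the POSIX-style path.
    return download().as_posix()


def fake_download() -> Path:
    """Hypothetical downloader for illustration only; it performs no network access."""
    return Path("checkpoints") / "vinvl_finetune_arena.ckpt"


if __name__ == "__main__":
    print(resolve_model_weights("does/not/exist.ckpt", fake_download))
```

Passing the downloader as a callable keeps the fallback testable without contacting the Hugging Face Hub.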
