emma-heriot-watt · Dec 4, 2023 · 067c422
1 parent 8026b4c
commit 067c422
Showing 4 changed files with 24 additions and 193 deletions.
diff --git a/README.md b/README.md
@@ -40,89 +40,35 @@
 
 The object detector and feature extractor used for the EMMA project.
 
-A clean and scalable [PyTorch Lightning and Hydra Template](https://github.com/ashleve/lightning-hydra-template) for deep learning research, with the simplicity of [Hypermodern Python](https://github.com/cjolowicz/cookiecutter-hypermodern-python) tooling.
-
 Assuming you have [pyenv](https://github.com/pyenv/pyenv) and [Poetry](https://python-poetry.org/), clone the repository and run:
 
-```bash
-# Use Python 3.9.9 in the project
-pyenv local 3.9.9
-
-# Tell Poetry to use pyenv
-poetry env use $(pyenv which python)
-
-# Install dependencies
-poetry install
-
-# Activate the virtual environment
-poetry shell
-
-# Install pre-commit hooks
-pre-commit install
-```
-
-## Installing things
-
-We've tried to keep necessary things as simplistic as possible. However, we need to install some things.
+## Writing code and running things
 
-### Poetry
+### Run the server for the [Alexa Arena](https://github.com/amazon-science/alexa-arena)
 
-This project uses Poetry for **creating virtual environments** and **managing Python packages**. This should be installed globally and can be done by running:
+Running this command as is automatically downloads the [fine-tuned checkpoint from our HF models repo](https://huggingface.co/emma-heriot-watt/models/blob/main/vinvl_finetune_arena.ckpt) and applies the same settings we used for our experiments within the [Alexa Arena](https://github.com/amazon-science/alexa-arena).
 
 ```bash
-curl -sSL https://install.python-poetry.org | python3 -
+python src/emma_perception/commands/run_server.py
 ```
 
-You can verify it's installed and accessible by running `poetry --version`.
+### Extracting features
+
 
-Once you've got Poetry installed, we think it's best to install Python dependencies into a `.venv/` folder within the cloned repo. Tell Poetry to handle this for you:
+#### For the pretrained datasets
 
 ```bash
-poetry config virtualenvs.in-project true
 ```
 
-For more on how to manage, add, remove, and update dependencies, see the [official Poetry documentation](https://python-poetry.org/docs/basic-usage/).
-
-### Managing Python versions...
 
-There are two ways of managing your Python environments. We recommend [pyenv](https://github.com/pyenv/pyenv), but we have also included instructions for [Anaconda](https://anaconda.com).
+#### For the Alexa Arena
 
-#### ...with pyenv
-
-Install pyenv following the [instructions within the official repo](https://github.com/pyenv/pyenv#installation) for your system. **Remember to do step 2**!
-
-You can verify it's installed with `pyenv --version`.
-
-1. Install the Python version you want with `pyenv install 3.9.9`
-2. Go to the cloned repo
-3. Assign the specific Python version to the project by running `pyenv local 3.9.9`
-
-If you want a different version of Python, just change the version in the steps.
-
-#### ...with Anaconda
-
-Install Anaconda using the [instructions on the official website](https://anaconda.com/).
-
-Then create an environment for your project by running:
 
 ```bash
-conda create -n PROJECT_NAME python=3.9
-conda activate PROJECT_NAME
 ```
 
-## Writing code and running things
-
-### Project structure
-
-This is organised in very similarly to structure from the [Lightning-Hydra-Template](https://github.com/ashleve/lightning-hydra-template#project-structure) to facilitate reproducible research code.
-
-- `scripts` — `sh` scripts to run experiments
-- `configs` — configurations files using the [Hydra framework](https://hydra.cc/)
-- `notebooks` — Jupyter notebook for analysis and exploration
-- `storage` — data for training/inference _(and maybe use symlinks to point to other parts of the filesystem)_
-- `src` — where the main code lives
 
-## Developer tooling
+### Developer tooling
 
 - Dependency management with [Poetry](https://python-poetry.org/)
 - Easier task running with [Poe the Poet](https://github.com/nat-n/poethepoet)

diff --git a/src/emma_perception/commands/download_checkpoints.py b/src/emma_perception/commands/download_checkpoints.py
@@ -5,13 +5,23 @@
 
 
 HF_REPO_ID = "emma-heriot-watt/models"
-CHECKPOINT_NAME = "vinvl_finetune_arena.ckpt"
+VINVL_CHECKPOINT_NAME = "vinvl_pretrained.ckpt"
+ARENA_CHECKPOINT_NAME = "vinvl_finetune_arena.ckpt"
+
+
+def download_arena_checkpoint(
+    *, hf_repo_id: str = HF_REPO_ID, file_name: str = ARENA_CHECKPOINT_NAME
+) -> Path:
+    """Download the fine-tuned checkpoint on the Alexa Arena."""
+    file_path = download_file(repo_id=hf_repo_id, repo_type="model", filename=file_name)
+    logger.info(f"Downloaded {file_name}")
+    return file_path
 
 
 def download_vinvl_checkpoint(
-    *, hf_repo_id: str = HF_REPO_ID, file_name: str = CHECKPOINT_NAME
+    *, hf_repo_id: str = HF_REPO_ID, file_name: str = VINVL_CHECKPOINT_NAME
 ) -> Path:
-    """Download the checkpoint from VinVL and put it where we expect it."""
+    """Download the pre-trained VinVL checkpoint."""
     file_path = download_file(repo_id=hf_repo_id, repo_type="model", filename=file_name)
     logger.info(f"Downloaded {file_name}")
     return file_path
diff --git a/src/emma_perception/commands/extract_visual_features_forced.py b/src/emma_perception/commands/extract_visual_features_forced.py
diff --git a/src/emma_perception/commands/run_server.py b/src/emma_perception/commands/run_server.py
@@ -12,7 +12,7 @@
 from scene_graph_benchmark.config import sg_cfg
 
 from emma_perception.api import ApiSettings, ApiStore, extract_features_for_batch, parse_api_args
-from emma_perception.commands.download_checkpoints import download_vinvl_checkpoint
+from emma_perception.commands.download_checkpoints import download_arena_checkpoint
 from emma_perception.constants import (
     SIMBOT_ENTITY_MLPCLASSIFIER_CLASSMAP_PATH,
     SIMBOT_ENTITY_MLPCLASSIFIER_PATH,
@@ -38,7 +38,7 @@ async def startup_event() -> None:
 
     model_path = Path(cfg.MODEL.WEIGHT)
     if not model_path.exists():
-        cfg.MODEL.WEIGHT = download_vinvl_checkpoint().as_posix()
+        cfg.MODEL.WEIGHT = download_arena_checkpoint().as_posix()
 
     if torch.cuda.is_available() and settings.device_id != -1:
         num_gpus = torch.cuda.device_count()
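
The fallback in `startup_event` (use the configured weights if they exist on disk, otherwise download the Arena checkpoint) can be sketched in isolation. This is a minimal sketch, not the project's code: `resolve_model_weight` is a hypothetical helper, and it assumes that `download_file` in `download_checkpoints.py` behaves like `huggingface_hub.hf_hub_download`, which the diff does not show.

```python
from pathlib import Path
from typing import Callable

HF_REPO_ID = "emma-heriot-watt/models"
ARENA_CHECKPOINT_NAME = "vinvl_finetune_arena.ckpt"


def download_arena_checkpoint(
    *, hf_repo_id: str = HF_REPO_ID, file_name: str = ARENA_CHECKPOINT_NAME
) -> Path:
    """Download the fine-tuned Arena checkpoint and return its local path."""
    # Assumption: the project's `download_file` wraps hf_hub_download.
    # Imported lazily so this module loads without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=hf_repo_id, filename=file_name, repo_type="model"))


def resolve_model_weight(
    configured: str, download: Callable[[], Path] = download_arena_checkpoint
) -> str:
    """Hypothetical helper: return the configured path, or download if it is missing."""
    model_path = Path(configured)
    if model_path.exists():
        # Local weights win, so nothing is fetched from the network.
        return model_path.as_posix()
    return download().as_posix()
```

Passing the downloader as an argument keeps the fallback testable offline; in the server, the equivalent assignment would be `cfg.MODEL.WEIGHT = resolve_model_weight(cfg.MODEL.WEIGHT)`.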