g2net-gravitational-wave-detection

PyTorch Lightning Config: Hydra Template

Description

Setup

COMPETITION_NAME='XXX'

git clone https://github.com/Ynakatsuka/$COMPETITION_NAME
cd $COMPETITION_NAME

# Credentials
cp .env.template .env
vim .env  # Set your credentials

# Download data
cd data/input/ && kaggle competitions download -c $COMPETITION_NAME && unzip $COMPETITION_NAME.zip  && cd ../..

# Create Docker container and execute
./bin/docker.sh

How to run

# train
python run.py
# train with specified config
python run.py augmentation=default
# train with specified parameters
python run.py trainer.model.params.backbone.name=densenet121

# inference
python run.py run=inference

# lint
pysen run lint
# format
pysen run format
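
The dotted overrides above address nested keys in the config. As a rough illustration of that idea only, here is a minimal sketch using a plain dict instead of Hydra's own machinery. The `apply_override` helper and the sample config shape are hypothetical, and unlike Hydra this sketch keeps every value as a string:

```python
from typing import Any


def apply_override(config: dict[str, Any], override: str) -> dict[str, Any]:
    """Set a nested key from an 'a.b.c=value' string (illustration only, not Hydra's parser)."""
    dotted_key, value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        # Walk down one level per dotted segment; a missing segment raises KeyError.
        node = node[part]
    node[leaf] = value
    return config


# Hypothetical config shape, mirroring the override used in this README.
config = {"trainer": {"model": {"params": {"backbone": {"name": "resnet18"}}}}}
apply_override(config, "trainer.model.params.backbone.name=densenet121")
print(config["trainer"]["model"]["params"]["backbone"]["name"])
```

The same path-walking logic explains why a config group override such as `augmentation=default` and a deep parameter override can share one command line: both are just key/value pairs resolved against the config tree.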

Others

Create a sweep over hyperparameters
# this will run 6 experiments one after the other,
# each with different combination of batch_size and learning rate
python run.py -m datamodule.batch_size=32,64,128 model.lr=0.001,0.0005

⚠️ Currently, sweeps are not failure-resistant (if one job crashes, then the whole sweep crashes), but this is expected to be supported in a future Hydra release.
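
The count of 6 experiments comes from the Cartesian product of the swept values (3 batch sizes × 2 learning rates). A minimal sketch of that enumeration in plain Python follows; it only illustrates the arithmetic, and the job order shown here is an assumption that may differ from the order Hydra actually uses:

```python
from itertools import product

# Values taken from the sweep command above.
sweep = {
    "datamodule.batch_size": [32, 64, 128],
    "model.lr": [0.001, 0.0005],
}

# One job per element of the Cartesian product of all value lists.
jobs = [dict(zip(sweep, values)) for values in product(*sweep.values())]

for index, job in enumerate(jobs):
    print(index, job)
```

Adding a third swept parameter multiplies the job count again, so the sweep size grows quickly with each extra comma-separated list.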

Use automatic code formatting

Use pre-commit hooks to standardize the code formatting of your project and save mental energy.
Simply install the pre-commit package with:

pip install pre-commit

Next, install hooks from .pre-commit-config.yaml:

pre-commit install

After that, your code will be automatically reformatted on every new commit.
Currently, the template contains configurations for black (Python code formatting), isort (Python import sorting), flake8 (Python code analysis) and prettier (YAML formatting).

To reformat all files in the project, use the command:

pre-commit run -a

Version control your data and models with DVC

Use DVC to version control big files, such as your data or trained ML models.
To initialize the DVC repository:

dvc init

To start tracking a file or directory, use dvc add:

dvc add data/MNIST

DVC stores information about the added file (or a directory) in a special .dvc file named data/MNIST.dvc, a small text file with a human-readable format. This file can be easily versioned like source code with Git, as a placeholder for the original data:

git add data/MNIST.dvc data/.gitignore
git commit -m "Add raw data"
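
To make the placeholder idea concrete, here is a minimal Python sketch of a small, Git-friendly record that identifies a large file by a content hash. The `make_placeholder` helper and the record layout are hypothetical and are not DVC's actual file format; the sketch assumes an MD5 digest purely for illustration:

```python
import hashlib
import json
import tempfile
from pathlib import Path


def make_placeholder(data_file: Path) -> dict:
    """Build a small record describing a large file (hypothetical format, not DVC's)."""
    digest = hashlib.md5(data_file.read_bytes()).hexdigest()
    return {"path": data_file.name, "md5": digest, "size": data_file.stat().st_size}


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "train.bin"
    data_file.write_bytes(b"\x00" * 1024)  # stand-in for a large dataset
    placeholder = make_placeholder(data_file)

# The record is tiny regardless of how large the data file is.
print(json.dumps(placeholder, indent=2))
```

Because the record changes whenever the data changes, committing it to Git gives each data version a traceable history without storing the data itself in the repository.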

Support installing project as a package

This allows other people to easily use your modules in their own projects. Rename the src folder to your project name and add a setup.py file:

from setuptools import find_packages, setup

setup(
    name="src",  # you should change "src" to your project name
    version="0.0.0",
    description="Describe Your Cool Project",
    author="",
    author_email="",
    # replace with your own github project link
    url="https://github.com/ashleve/lightning-hydra-template",
    install_requires=["pytorch-lightning>=1.2.0", "hydra-core>=1.0.6"],
    packages=find_packages(),
)

Now your project can be installed from local files:

pip install -e .

Or directly from a Git repository (use HTTPS, since GitHub no longer serves the unauthenticated git:// protocol):

pip install git+https://github.com/YourGithubName/your-repo-name.git --upgrade

Any module can then be imported from any other file like so:

from project_name.models.mnist_model import MNISTLitModel
from project_name.datamodules.mnist_datamodule import MNISTDataModule

References