- Development Setup
- Managing Dependencies
- Testing
- Submitting a Pull Request
- Registering a Version of the Algorithm
- Creating a Release
To contribute to this work, you must obtain access to the NASA MAAP, where the Algorithm Development Environment (ADE) resides, and thus where algorithms can be registered and launched.
If you wish to conduct development work outside of the ADE, you'll need to have the following installed in your environment (the ADE has these pre-installed):
git
: On macOS, using Homebrew is highly recommended:brew install git
. Otherwise, see https://git-scm.com/downloads.conda
: On macOS, using Homebrew to install Miniforge (containingconda
) is highly recommended:brew install --cask miniforge
. Otherwise, see https://github.com/conda-forge/miniforge.
You must also have make
installed. On macOS, make
should already be
installed. On Linux, if not already installed, you must locate installation
instructions appropriate for your system package manager.
In the ADE, make
is not installed by default (at least not yet; see
Install make
in all ADE images),
and must be installed as follows:
apt-get update && apt-get install make -y
To prepare for contributing, do the following on your development system (either within the ADE or wherever else you're conducting development):
- Clone this GitHub repository.
- Change directory to the cloned working directory.
- Create the
gedi_subset
conda environment:make build
- Install (and run) Git pre-commit hooks:
make lint
- If desired, activate the
gedi_subset
conda environment:conda activate gedi_subset
(NOTE: This is not necessary for running any of themake
commands, which automatically use thegedi_subset
environment even when the environment is not activated.)
To minimize the time it takes to install dependencies on the Docker image
created during algorithm registration, we leverage conda-lock
to pre-resolve
dependencies and generate a lock file, to avoid dependency resolution during
registration. This means that we manage dependencies and update the lock file
as follows:
- Dependencies required by the algorithm are specified in
environment.yml
. (These are the only dependencies that will be installed in DPS.) - Development dependencies (for testing, static code analysis, notebooks, etc.)
are specified in
environment-dev.yml
. - We use
conda-lock
to generate the lock fileconda-lock.yml
from the files above, whenever we make a change to at least one of the files.
Within environment.yml
and environment-dev.yml
we intentionally avoid
specifying exact (==
) versions to avoid resolution conflicts that might
prevent successful dependency resolution by giving the resolver enough
flexibility to satisfy all requirements. When versions are specified, they are
only to ensure we use versions with necessary features or bug fixes, and
typically use the compatible release operator (~=
) to constrain only the
major version of a dependency.
IMPORTANT: Whenever you make changes to either environment.yml
or
environment-dev.yml
, you must regenerate conda-lock.yml
and install the
updated dependencies into the gedi_subset
conda environment on your
development workstation. This is done automatically by running make build
,
which will also update the gedi_subset
environment as necessary.
Successfully running linting and testing locally should ensure that the GitHub Actions workflow triggered by your PR will succeed.
We leverage the vcrpy
library to record responses to HTTP/S requests. When
running existing tests, these recordings (cassettes in vcrp
parlance) are
replayed so that repeated test executions do not make repeated requests.
Therefore, if you are not adding or modifying such tests, there is no need to
have a network connection, nor any need to run the tests within the ADE.
However, since we currently use the maap-py
library for CMR queries, adding
new tests that make CMR queries (or modifying existing ones) will not only
require a network connection in order to record live responses, but will also
require that you obtain such recordings by running the new/modified tests within
the ADE in order to have the necessary auth in play. Otherwise, the CMR queries
will either fail, or produce incorrect responses.
Linting runs a number of checks on various files, such as making sure your code adheres to coding conventions and that the conda lock file is in sync with the conda environment files, among other things. To "lint" the files in the repo, as well as run unit tests, run the following:
make lint typecheck test
If you see any errors, address them and repeat the process until there are no errors.
Optionally, you may wish to locally test that the build for your future PR will
succeed. To do so, you can use act to locally run
GitHub Actions workflows. After installing act
, run the following command
from the root of the repo:
act pull_request
NOTE: act
uses a Docker container, so this will NOT work within the ADE.
You must use act
in an environment where Docker is installed.
The command above will initially take several minutes, but subsequent runs will
execute more quickly because only the first run must pull the act
Docker
image.
To work on a feature or bug fix, you'll generally want to follow these steps:
- Checkout and pull the lastest changes from the
main
branch. - Create a new branch from the
main
branch. - Add your desired code and/or configuration changes.
- Add appropriate entries to the Changelog, according to the
Keep a Changelog convention. See existing sections in the Changelog for
guidance on structure and format. In general, you should add entries under
the
Unreleased
section at the top. A release manager will relable theUnreleased
section to the appropriate release number upon the next release. - Register a version of the algorithm (see next section).
- Test your registered version, and repeat as necessary.
- Once you're satsified with your changes, delete your registered version.
- Submit a PR to the GitHub repository, targeting the
main
branch.
To register a new version of the algorithm, you must do so within the ADE, in order to obtain automatic authorization. If you have not been using the ADE for development, but want to register the algorithm, within the ADE you must clone and/or pull the latest code from the branch from which you want to register the algorithm.
Then (again, within the ADE), simply run the following to register the algorithm
configured in algorithm_config.yaml
:
bin/algo/register
When on the main
branch (typically only after creating a release of the
algorithm, as described in the next section), and the current commit (HEAD
) is
tagged, the script will check whether or not the value of algorithm_version
in
the YAML file matches the value of the git tag. If so, the YAML file will be
registered as-is. If not, the script will report the version-tag mismatch. A
match is expected when registering from the main
branch, as that's where
tagging/releasing should take place.
However, you will likely want to register a version of the algorithm from
another branch when testing your changes on the branch, before opening a Pull
Request. In this case, when registering from another branch, the script ignores
the value of algorithm_version
in the YAML file, and the script will instead
use the name of the current branch as the algorithm version during registration
(the YAML file is not modified).
Upon successful registration, you should see output similar to the following (abridged):
{
"code": 200,
"message": {
"id": "...",
...,
"title": "Registering algorithm: gedi-subset",
"message": "Registering algorithm: gedi-subset",
...,
"status": "created",
...,
"job_web_url": "https://repo.maap-project.org/root/register-job-hysds-v4/-/jobs/*****",
"job_log_url": "https://repo.maap-project.org/root/register-job-hysds-v4/-/jobs/*****/raw"
}
}
This indicates that the registration succeeded (code 200), and that the image
for the algorithm is being built. To see the progress of the build, open a
browser to the "job_web_url"
value shown in your output. Note that although
you may see a "success" response (as shown above), that simply indicates that
registration was successfully initiated, meaning that an image build was
successfully triggered. The image build process may fail, so it is important to
make sure the build succeeds. If it does, then the new version of the algorithm
should be visible in the Algorithm list on the form shown in the ADE after
choosing Jobs > Submit Jobs menu item.
If the corresponding output shown above shows an error, or it succeeds, but the image build fails, analyze the error message from the failed registration or failed build. If the output does not provide the information you need to correct the problem, reach out to the platform team for assistance.
Once the registration build succeeds, you may submit jobs against the algorithm.
For unreleased versions, once you're satisified that your unreleased version of the algorithm works properly, you should delete it as follows:
bin/algo/delete
Then create a Pull Request against the main
branch. If you need to make
adjustments to your branch, you can rerun registration to replace your
unreleased version of the algorithm as often as necessary until you're
satisfied.
After one or more Pull Requests have landed on the main
branch to constitute
a new release:
- Checkout the latest changes to the
main
branch. - Create a new branch named
release-<VERSION>
, where<VERSION>
is an appropriate version number for the changes being made, according to Semantic Versioning. - In
algorithm_config.yaml
change the value ofalgorithm_version
to the same value as<VERSION>
from the previous step. - In the Changelog, immediately below the
Unreleased
heading add a new heading (at the same level) using the format[<VERSION>] (<YYYY-MM-DD>)
(including the square brackets and parentheses), where<VERSION>
is as above, and<YYYY-MM-DD>
is the expected release date (which might not be the actual release date, depending on the PR approval process). - Also in the Changelog, at the bottom, add a link to the new release to the existing list of release links (in order), and update the Unreleased link accordingly as well.
- Commit the changes, and open a Pull Request to
main
. - Once the PR is approved and merged, go to https://github.com/MAAP-Project/gedi-subsetter/releases/new
- Click the Choose a tag dropdown.
- In the input box that appears, enter the same value as the value of
<VERSION>
from previous steps, and click the Create a new tag label that appears immediately below the input box. - In the Release title input, also enter the same value as the value of
<VERSION>
in the previous step. - In the description text box, copy and paste the content of only the new version section you added earlier to the Changelog, excluding the new version heading (since it would be redundant with the release title).
- Click the Publish release button.
- Checkout and pull the
main
branch in order to pull down the new tag created by the release process. - Register the new release of the algorithm as described in the previous section.