Merge pull request #12 from danielfromearth/feature/rename-to-stitchee
rename from bumblebee to stitchee
danielfromearth authored Sep 13, 2023
2 parents 4e5af7c + 736cc96 commit 9c607c7
Showing 12 changed files with 161 additions and 135 deletions.
9 changes: 5 additions & 4 deletions .github/workflows/lint_and_test_and_bump.yml
@@ -77,7 +77,7 @@ jobs:
echo "software_version=$(poetry version | awk '{print $2}')" >> $GITHUB_ENV
echo "venue=ops" >> $GITHUB_ENV
- name: Install bumblebee
- name: Install stitchee
run: poetry install

- name: Lint
@@ -87,7 +87,8 @@ jobs:
- name: Test with pytest
run: |
poetry run pytest
poetry run pytest tests/test_group_handling.py
# TODO: expand tests to include full concatenation runs, i.e., don't just run test_group_handling.py

# - name: Commit Version Bump
# # If building develop, a release branch, or main then we commit the version bump back to the repo
@@ -96,8 +97,8 @@ jobs:
# github.ref == 'refs/heads/main' ||
# startsWith(github.ref, 'refs/heads/release')
# run: |
# git config --global user.name 'bumblebee bot'
# git config --global user.email 'bumblebee@noreply.github.com'
# git config --global user.name 'stitchee bot'
# git config --global user.email 'stitchee@noreply.github.com'
# git commit -am "/version ${{ env.software_version }}"
# git push
#
5 changes: 3 additions & 2 deletions .github/workflows/lint_and_test_on_pull_request.yml
@@ -25,7 +25,7 @@ jobs:
with:
poetry-version: 1.3.2

- name: Install bumblebee
- name: Install stitchee
run: poetry install

- name: Lint
@@ -35,4 +35,5 @@ jobs:
- name: Test with pytest
run: |
poetry run pytest
poetry run pytest tests/test_group_handling.py
# TODO: expand tests to include full concatenation runs, i.e., don't just run test_group_handling.py
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,16 @@
# Changelog
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added
- [PR #1](https://github.com/danielfromearth/stitchee/pull/1): An initial GitHub Actions workflow
### Changed
- [PR #12](https://github.com/danielfromearth/stitchee/pull/12): Changed name to "stitchee"
### Deprecated
### Removed
### Fixed
- [PR #4](https://github.com/danielfromearth/stitchee/pull/4): Error with TEMPO ozone profile data because of duplicated dimension names
45 changes: 29 additions & 16 deletions README.md
@@ -1,19 +1,21 @@
# bumblebee
[<img src="https://github.com/danielfromearth/stitchee/assets/114174502/58052dfa-b6e1-49e5-96e5-4cb1e8d14c32" width="250"/>](stitchee_9_hex)

Tool for concatenating netCDF data *along an existing dimension*,
which is designed as both a standalone utility and
for use as a service in [Harmony](https://harmony.earthdata.nasa.gov/).
# Overview
_____

_STITCHEE_ (STITCH by Extending a dimEnsion) is used for concatenating netCDF data *along an existing dimension*,
and it is designed as both a standalone utility and for use as a service in [Harmony](https://harmony.earthdata.nasa.gov/).

## Getting started, with poetry

1. Follow the instructions for installing `poetry` [here](https://python-poetry.org/docs/).
2. Install `bumblebee`, with its dependencies, by running the following from the repository directory:
2. Install `stitchee`, with its dependencies, by running the following from the repository directory:

```shell
poetry install
```

## How to test `bumblebee` locally
## How to test `stitchee` locally

```shell
poetry run pytest tests/
@@ -22,25 +24,36 @@ poetry run pytest tests/
## Usage (with poetry)

```shell
$ poetry run bumblebee --help
usage: bumblebee [-h] [--make_dir_copy] [-v] data_dir output_path
$ poetry run stitchee --help
usage: stitchee [-h] -o output_path [--concat_dim concat_dim] [--make_dir_copy] [--keep_tmp_files] [-O] [-v]
path/directory or path list [path/directory or path list ...]

Run the along-existing-dimension concatenator.

positional arguments:
data_dir The directory containing the files to be merged.
output_path The output filename for the merged output.

options:
-h, --help show this help message and exit
--make_dir_copy Make a duplicate of the input directory to avoid modification of input files. This is useful for testing, but uses more disk space.
-v, --verbose Enable verbose output to stdout; useful for debugging
-h, --help show this help message and exit
--concat_dim concat_dim
Dimension to concatenate along, if possible.
--make_dir_copy Make a duplicate of the input directory to avoid modification of input files. This is useful for testing, but
uses more disk space.
--keep_tmp_files Prevents removal, after successful execution, of (1) the flattened concatenated file and (2) the input
directory copy if created by '--make_dir_copy'.
-O, --overwrite Overwrite output file if it already exists.
-v, --verbose Enable verbose output to stdout; useful for debugging

Required:
path/directory or path list
Files to be concatenated, specified via a (1) single directory containing the files to be concatenated, (2)
single text file containing linebreak-separated paths of the files to be concatenated, or (3) multiple
filepaths of the files to be concatenated.
-o output_path, --output_path output_path
The output filename for the merged output.
```
For example:
```shell
poetry run bumblebee /path/to/netcdf/directory/ /path/to/output.nc
poetry run stitchee /path/to/netcdf/directory/ -o /path/to/output.nc
```
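The help text above lists three ways to specify the input files. As a rough illustration only, the sketch below shows one way such arguments could be expanded into a flat list of paths. The function name `resolve_input_paths` is hypothetical and is not taken from the `stitchee` source, and recognizing the path list by a `.txt` suffix is an assumption made for this example.

```python
from pathlib import Path


def resolve_input_paths(inputs: list[str]) -> list[str]:
    """Expand positional arguments into a list of file paths.

    Hypothetical helper for illustration; not part of the stitchee package.
    """
    if len(inputs) == 1:
        candidate = Path(inputs[0])
        if candidate.is_dir():
            # (1) A single directory: take every regular file inside it.
            return sorted(str(p) for p in candidate.iterdir() if p.is_file())
        if candidate.suffix == ".txt":
            # (2) A single text file (assumed to end in .txt):
            # one path per line, with blank lines skipped.
            lines = candidate.read_text().splitlines()
            return [line.strip() for line in lines if line.strip()]
    # (3) Otherwise, treat the arguments as the file paths themselves.
    return list(inputs)
```

With a helper like this, `resolve_input_paths(["/path/to/netcdf/directory/"])` would return the files in that directory in sorted order, while passing several paths would return them unchanged.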
## Usage (without poetry)
2 changes: 1 addition & 1 deletion concatenator/concat_with_nco.py
@@ -4,7 +4,7 @@
import netCDF4 as nc # type: ignore
from nco import Nco # type: ignore

from concatenator.bumblebee import _validate_workable_files
from concatenator.stitchee import _validate_workable_files

default_logger = getLogger(__name__)

2 changes: 1 addition & 1 deletion concatenator/concat_with_nco_cli.py
@@ -7,7 +7,7 @@
import sys

from concatenator.concat_with_nco import concat_netcdf_files
from concatenator.run_bumblebee import parse_args
from concatenator.run_stitchee import parse_args


def run_nco_concat(args: list) -> None:
20 changes: 10 additions & 10 deletions concatenator/run_bumblebee.py → concatenator/run_stitchee.py
@@ -8,8 +8,8 @@
from pathlib import Path
from typing import Tuple, Union

from concatenator.bumblebee import bumblebee
from concatenator.file_ops import add_label_to_path
from concatenator.stitchee import stitchee


def parse_args(args: list) -> Tuple[list[str], str, str, bool, Union[str, None]]:
@@ -21,7 +21,7 @@ def parse_args(args: list) -> Tuple[list[str], str, str, bool, Union[str, None]]
tuple
"""
parser = ArgumentParser(
prog='bumblebee',
prog='stitchee',
description='Run the along-existing-dimension concatenator.')

# Required arguments
@@ -132,19 +132,19 @@ def _get_list_of_filepaths_from_dir(data_dir: Path):
return input_files


def run_bumblebee(args: list) -> None:
def run_stitchee(args: list) -> None:
"""
Parse arguments and run the concatenator on the specified input files
"""
input_files, output_path, concat_dim, keep_tmp_files, temporary_dir_to_remove = parse_args(args)
num_inputs = len(input_files)

logging.info('Executing bumblebee concatenation on %d files...', num_inputs)
bumblebee(input_files, output_path,
write_tmp_flat_concatenated=keep_tmp_files,
keep_tmp_files=keep_tmp_files,
concat_dim=concat_dim)
logging.info('BUMBLEBEE complete. Result in %s', output_path)
logging.info('Executing stitchee concatenation on %d files...', num_inputs)
stitchee(input_files, output_path,
write_tmp_flat_concatenated=keep_tmp_files,
keep_tmp_files=keep_tmp_files,
concat_dim=concat_dim)
logging.info('STITCHEE complete. Result in %s', output_path)

if not keep_tmp_files and temporary_dir_to_remove:
shutil.rmtree(temporary_dir_to_remove)
@@ -157,7 +157,7 @@ def main() -> None:
format='[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s',
level=logging.DEBUG
)
run_bumblebee(sys.argv[1:])
run_stitchee(sys.argv[1:])


if __name__ == '__main__':
12 changes: 6 additions & 6 deletions concatenator/bumblebee.py → concatenator/stitchee.py
@@ -18,12 +18,12 @@
default_logger = logging.getLogger(__name__)


def bumblebee(files_to_concat: list[str],
output_file: str,
write_tmp_flat_concatenated: bool = False,
keep_tmp_files: bool = True,
concat_dim: str = "",
logger: Logger = default_logger) -> str:
def stitchee(files_to_concat: list[str],
output_file: str,
write_tmp_flat_concatenated: bool = False,
keep_tmp_files: bool = True,
concat_dim: str = "",
logger: Logger = default_logger) -> str:
"""Concatenate netCDF data files along an existing dimension.
Parameters
4 changes: 2 additions & 2 deletions entry.py
@@ -2,7 +2,7 @@
import logging
import sys

from concatenator.run_bumblebee import run_bumblebee
from concatenator.run_stitchee import run_stitchee


def main() -> None:
@@ -12,7 +12,7 @@ def main() -> None:
format='[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s',
level=logging.DEBUG
)
run_bumblebee(sys.argv[1:])
run_stitchee(sys.argv[1:])


if __name__ == '__main__':