Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Starting a repo with basic python code best practices #1

Merged
merged 11 commits into from
Jul 19, 2024
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 31 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
name: Sample CI

on:
push:
workflow_dispatch:

jobs:
main:
runs-on: ubuntu-latest

steps:
- name: Checkout
uses: actions/checkout@v4

- name: Apt dependencies
shell: bash
run: |
sudo apt -y update
sudo apt install -y python3-dev

- name: Python dependencies
shell: bash
run: |
pip3 install numpy \
pytest

- name: Test
shell: bash
run: |
pytest -v .

61 changes: 60 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,61 @@
# py-template
Template for a new python project
This repository serves as a template for new python projects and a way to express best practices

## Best Practices - Coding Practices

### User input
* Always use [`argparse`](https://docs.python.org/3/library/argparse.html) for command-line arguments
* Assume the use of [`yaml`](https://python.land/data-processing/python-yaml) for structured input files unless there are compelling reasons for something else

### User output
* Always use `logger` for status output
* Carefully choose an output format for standard formats, considering the following in order of priority:
* [CSV](https://docs.python.org/3/library/csv.html) with direct support in NumPy, Pandas, etc
* [yaml](https://python.land/data-processing/python-yaml)
* [json](https://docs.python.org/3/library/json.html)
* [HDF5](https://www.h5py.org/)
* SQL, for example [SQLite](https://docs.python.org/3/library/sqlite3.html)

### Modularity
* All runnable scripts [should include a block like](https://stackoverflow.com/questions/419163/what-does-if-name-main-do):

``` python
if __name__ == __main__():
do_main_task()
```

### Testing
* Always use [`pytest`](https://docs.pytest.org/en/8.2.x/) for testing
* Introduce a simple continuous integration (CI) action ASAP

### Collaboration
* Generate pull requests (PRs) with as little code change as possible
* Include tests in all PRs
* Do not merge your own PR; there should always be at least one review by a
non-author, and a non-author should merge

### Packaging and installation
* Introduce a `pyproject.toml` file ASAP

## Best Practices - Style

### Code formatting
* Follow PEP8 style guide, ideally with a tools like
[`black`](https://pypi.org/project/black/) to help enforce it, especially via
a plugin to your editor

### Variable naming
* Generally, choose nouns for variables and verbs for methods
* Clear variable and method names can reduce the need for comments
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is mentioning snake case appropriate?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the list could grow - see our previous discussion about where to put this. I put a few examples, but let's discuss where to keep more details. This PR doesn't have to depend on it - we can make additions/changes later.


### Comments
* Include a docstring in every method
* Rely on clear variable and method names and add comments sparingly where the
intent/approach is non-intuitive

### Modularity
* If you have cut & paste code in two different places, it probably should be a
method
* Even very short methods can be valuable if the method name makes the code more
readable
* Ideally, methods should be no longer than one screen worth of lines
111 changes: 111 additions & 0 deletions do_task.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
import argparse
import logging
import yaml

logger = logging.getLogger(__name__)
gonuke marked this conversation as resolved.
Show resolved Hide resolved


def perform_action(args, input_data):
"""Perform some action

Following the principle of separation of concerns, this method only
performs the action, but does not have any input or output.

inputs
------
args : argparse object with many possible members based on argparse configuration
input_data : dictionary of input data read from YAML file

outputs
--------
string that describes the input quantities

"""

return f"Some results based on args:\n{args}\n and input_data:\n{input_data}\n"


def report_results(args, input_data, results):
"""Report the result of the action

Following the principle of separation of concerns, this method only
performs output and does not do any actions.

inputs
------
args : argparse object with many possible members based on argparse configuration
input_data : dictionary of input data read from YAML file
results : the results of the action

outputs
--------
None

"""

logger.info(
f"Based on the values of args:\n{args}\n and input_data:\n{input_data}\n"
+ "Here are the results:\n{results}\n"
)


def task_args():
"""Setup argparse for this script

Add an argparse.ArgumentParser with standard entries of:
`filename` : required positional argument
`verbose` : optional keyword argument

inputs
-------
None

outputs
--------
argparse object with various members depending on configuration
"""

parser = argparse.ArgumentParser(
prog="do-task",
description="A program to perform a certain task.",
epilog="Some text at the bottom of the help message",
)

parser.add_argument("filename", help="A filename to read for input")
parser.add_argument(
"--verbose", "-v", type=int, default=0, help="Level of verbosity for logging"
)

return parser.parse_args()


def read_input(input_filename):
"""Read input from yaml

Read input in YAML format from a specified filename.

inputs
-------
input_filename : a string with a filename/path accessible from the current location
"""

with open(input_filename, "r") as yaml_file:
input_data = yaml.safe_load(yaml_file)

return input_data


def do_task():
"""Main function to perform the actions for this script."""
args = task_args()

logging.basicConfig(level=(logging.WARN - args.verbose))

input_data = read_input(args.filename)

results = perform_action(args, input_data)
report_results(results)


if __name__ == "__main__":
do_task()
13 changes: 13 additions & 0 deletions test_do_task.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
import do_task

def test_perform_action():
args = 'foo'
input_data = 'bar'

exp = f"Some results based on args:\n{args}\n and input_data:\n{input_data}\n"

obs = do_task.perform_action(args, input_data)

assert exp == obs