Currently, we use GitHub Actions to run our continuous integration and builds.
Basic concepts:
- A step is the smallest unit of work in Github Actions. It can be as simple as running a shell command.
- A step is versatile because it can also invoke external community-contributed functions ("actions"). Actions can be used for simple operations like checking out the latest code in the repository, or slightly more complicated tasks like running a benchmarking tool and formatting the results. Actions are usually kept small in scope. They are implemented as repositories on Github whose contents implement interfaces defined by Github Actions.
- A job runs multiple steps serially.
- A job groups together related steps that achieve a larger task (ex: run unit tests).
- Each job instance is run in a completely new VM instance ("runner").
- A workflow runs multiple jobs in parallel.
- Workflows are run by GitHub Actions any time certain trigger events occur. Each workflow configures what trigger event conditions must be met for it to be invoked (ex: only for Pull Requests; only on merges to the main branch).
- Jobs within a workflow can be made to run in a particular order by specifying job dependencies. This creates the directed acyclic graph (DAG) relationship of job execution.
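To make these concepts concrete, here is a minimal hypothetical workflow sketch; the workflow, job, and step names are illustrative assumptions, not ICU4X's actual configuration:

```yaml
name: CI
on:
  push:
    branches: [ main ]        # trigger condition for the whole workflow
jobs:
  build:
    runs-on: ubuntu-latest    # each job runs in a fresh VM runner
    steps:                    # steps within a job run serially
      - uses: actions/checkout@v2   # a step invoking an external action
      - run: cargo build            # a step running a shell command
  test:
    needs: build              # job dependency: forms the DAG of job execution
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: cargo test
```

Note that even though `test` runs after `build` because of `needs`, each job still starts on a fresh runner, so build outputs are not automatically shared between them.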
This intro page shows the following:
- A GitHub Actions workflow is written in YAML syntax, and must be placed in `.github/workflows`.
- A separate tab within the GitHub UI for a repository shows the history of all Actions runs for the repository.
- In the UI, for each workflow run instance, you can see each job of the workflow, and each step within each job. For a job, you can see the console output (stdout, stderr) for each step, with linkable line numbers and a search bar for searching the logs.
This example from an old, experimental, preliminary ICU4X workflow shows a really small task executed in a naive way:
- The `on` key config shows that the workflow is triggered only when a commit is made on `main`. Since the repository only allows commits to `main` through the Pull Request process, this workflow is unhelpful for Pull Request authors and reviewers because pull requests originating from any branch not called `main` (all of them) will not trigger the workflow.
- There is one job, which is to build and run unit tests.
- The first step of any job is usually to check out the latest code in the repository. Otherwise, the fresh VM runner has an empty filesystem.
- The first step of this job invokes `actions/checkout@v2`. These action coordinates indicate: invoke the GitHub Action kept in the repository https://www.github.com/actions/checkout, and use the state of that repository that the `v2` ref points to. Since `v2` is a git "ref", it could be a git tag, a git branch name, or a git commit hash value.
  - The convention preferred by GitHub Actions is that action repository maintainers keep `v1` and `v2` as tags pointing to the commit of the latest version within that major version (ex: `v1` points to `v1.2.3`, `v2` points to `v2.8.9.1`). However, be aware that not all actions adhere to this guideline; some only have `vX.Y.Z` tags without a `vX` tag pointing to the latest `vX.*.*`. As an example, this problem was observed and worked around in the `unicode-org/cldr` repo.
- For this simplistic example, the way to improve it so that a Pull Request runs the same checks after each new commit is pushed, in the same way that checks are run on merges to `main`, is to change `on.pull_request.branches` to match all origin branch names.
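The improvement described above might be sketched as follows; the workflow name and test command are assumptions for illustration:

```yaml
name: Build and Test
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ '**' ]   # match all branches, so every PR triggers the checks
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run unit tests
        run: cargo test --all
```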
One nice aspect of GitHub Actions' integration in GitHub is that if a workflow is triggered to run on pull requests, and a pull request includes some modification to that workflow, then the pull request will run using the pending new changes to the workflow. For example, if an existing workflow (ex: `.github/workflows/build-test.yml`) is configured to run 5 benchmarks, and a pull request is made to add a 6th benchmark to that workflow, then the pull request will run 6 benchmarks, not 5.
This means that most changes to a workflow can be tested in the PR that introduces the changes. The effects are visible to the PR author and reviewers alike.
There might be reasons why you want or need to test changes to GitHub Actions on your personal fork. Less frequently, there might be specific changes that cannot be tested via a PR because they only take effect on merges to the `main` branch. For example, API docs changes and benchmark dashboard changes should only occur on merges to `main`. In this case, you can use your personal fork of the upstream repo as a testing ground. The naive approach -- directly modifying your fork's `main` branch -- is not recommended because it requires awkwardly changing your git repo during and after testing in ways that are often confusing and opposite to "git-flow" habits.
So here are the steps to test a GitHub Actions change on your personal fork, with links to an example:
- Create a new testing-only branch that tacks on an extra commit to make testing-appropriate changes.
- Push the testing branch to your personal fork of `icu4x` and create a testing PR targeting the personal fork's `main`.
- GitHub Actions will run a new job because the testing PR satisfies the execution trigger conditions.
- Optional extra step: merge the testing PR to have GitHub Actions execute the job that copies GH pages over to the fork of the docs repo.
Also, part of your testing-appropriate changes can be as primitive-yet-sufficient as adding lots of stdout debugging that gets captured in the logs (example).
- A "job matrix" allows you to run a collection of parameterized jobs, where you indicate which parts of the job configuration are the parameters (variables) and what values they are allowed to take. If the parameter is the OS of the VM, then you can run the same job on Linux, macOS, and Windows with little extra work
- The parameters are defined as fields under the
strategy.matrix.<param>
key within the job, and the range of allowed values are stored as arrays. The parameters are used (string interpolated) with${{ matrix.<param> }}
syntax. Example:
Here,jobs: test: strategy: fail-fast: false matrix: os: [ ubuntu-latest, macos-latest, windows-latest ] runs-on: ${{ matrix.os }} steps: - uses: actions/checkout@v2 - ...
os
is a parameter defined understrategy.matrix.os
for thetest
job's job matrix.os
takes on all values in the range defined by the array[ ubuntu-latest, macos-latest, windows-latest ]
. Every time thetest
job is run, it is run 3 times, once per possible value.- A job matrix can help decrease wall clock time for multiple independent long-running steps, like benchmarks. Example:
Here,jobs: benchmark: strategy: fail-fast: false matrix: component: - components/locid - components/uniset - components/plurals - components/datetime - utils/fixed_decimal runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - name: Run benchmark run: | pushd $PWD && cd ${{ matrix.component }}; cargo bench -- --output-format bencher | tee $OUTPUT_PATH/output.txt; popd - ...
component
is a parameter defined understrategy.matrix.component
for thebenchmark
job's job matrix.component
takes on the values defined in the YAML array[ components/locid, components/uniset, components/plurals, components/datetime, utils/fixed_decimal]
- The parameters are defined as fields under the
- Conditional execution of steps and jobs: you can use the `if` key to control more granularly whether a step or job runs.
  - In this example, we want the workflow to trigger on all Pull Requests and successful merges to `main`. However, when we look more granularly at the jobs within the workflow, some jobs, like regenerating API docs or benchmark dashboards, make no sense on in-flight PRs and therefore should only execute when the changes are fully finished, reviewed, and merged to `main`. We add the `if` key on those jobs to control their conditional execution in isolated instances, at a finer granularity than the workflow-level triggers defined in the `on` key.
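As a sketch, a job gated to run only on merges to `main` might look like the following; the job name and steps are assumptions for illustration:

```yaml
jobs:
  docs:
    # Run only on pushes (merges) to main, never on in-flight PRs
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Build API docs
        run: cargo doc --no-deps
```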
- "Uploading / downloading artifacts" is a mechanism that Github Actions provides to allow a persistence of files from one job to another within a single workflow. This can be useful since each job VM runner is created fresh, and inherits no previous state.
```yaml
- name: Upload updated benchmark data (merge to main only)
  if: github.event_name == 'push' && github.ref == 'refs/heads/main'
  uses: actions/upload-artifact@v2
  with:
    path: ./benchmarks/perf/**  # use wildcard pattern to preserve dir structure of uploaded files
    name: benchmark-perf
```

```yaml
- name: Download previous content destined for GH pages
  uses: actions/download-artifact@v2
  with:
    path: ./copy-to-ext-repo/benchmarks/perf
    name: benchmark-perf
```
- There is no mechanism to persist data storage across workflow instantiations. The only way to persist / store data across workflow instances is through making commits on a branch within the git repo itself. A benchmark dashboard requires the accumulation of data points computed from each invocation. Therefore, the benchmark action that we use relies on creating git commits on a branch to store this historical information over time.
Note: the `*.yml` workflow files should provide detailed documentation for the individual steps. These are only broad examples of what GitHub Actions can do.
API docs, the WASM demo, and benchmarking graphs are uploaded to a Google Cloud Storage bucket (accessible by Git ref), and published to this repository's GitHub Pages after each commit to `main`.
We use the `rhysd/github-action-benchmark` action. It does the following:
- For a handful of programming languages (including Rust), it runs the most common benchmark tool
- It converts the benchmark tool's output into a JSON file
- We upload the JSON file and the generated HTML dashboard to our GCS bucket
- The graphs then get published on GitHub Pages
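A sketch of how these steps might be wired together; the action's `tool` and `output-file-path` inputs are real inputs of `rhysd/github-action-benchmark`, but the file path and benchmark command shown are assumptions:

```yaml
- name: Run benchmark
  run: cargo bench -- --output-format bencher | tee output.txt
- name: Store benchmark result
  uses: rhysd/github-action-benchmark@v1
  with:
    tool: 'cargo'                 # parser for Rust's cargo bench output
    output-file-path: output.txt  # converted into JSON data for the dashboard
```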
Developers need an easy way to run the same commands and options locally as they are run in CI.
Currently, there is no official way to invoke a Github Actions workflow locally.
Since we run tests on all major OS types, we need a cross-platform solution to running commands.
Rust's `cargo-make` is a plugin for the `cargo` build tool that allows creating a sequence of commands and options under an alias (a "task").
Using `cargo-make`, the CI commands were rewritten to use the new aliases.
`cargo-make` has built-in support for the inline inclusion of scripts written in Duckscript.
Here is an example of an inline script in a `cargo-make` task that ensures that all new source files have a specific license header.
Code coverage is configured in a separate workflow since it is completely independent of the operations for build, test, format/style-checking, linting/error-prone-checking, API docs, benchmarking. Two code coverage dashboard services exist:
- Codecov
- Coveralls
- config for Coveralls

Both code coverage dashboard services automatically support most output file formats for most programming languages' code coverage tools. The GitHub Actions for both code coverage services are configured to automatically add a PR comment with the code coverage report for the PR's changes. The actions are smart enough to invalidate/close old comments and leave only the latest open comment for the latest state of code in the PR.
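For example, uploading coverage output might be sketched as follows; `codecov/codecov-action` and its `file` input are real, but the report path is an assumption:

```yaml
- name: Upload coverage report to Codecov
  uses: codecov/codecov-action@v1
  with:
    file: lcov.info   # coverage report produced by an earlier step
```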
Each job within a workflow is run in a new, clean runner VM instance. However, within a single workflow, if multiple jobs depend on the state produced by a single long-running compilation, then each job will redo that work, adding redundant effort to the total time. In cases where the compilation generates libraries / packages / artifacts that can be re-used in subsequent steps, these output artifacts can be cached. The cache persists beyond a single run, meaning that the artifacts are reused from job to job and from workflow instance to workflow instance.
GitHub Actions caching works after specifying the basics: 1) what is to be cached (the path of files in the filesystem), and 2) the specific key to register in the cache for those files. Specifying what is to be cached is build-tool dependent, ex: `.m2/**` for Java Maven builds, or `target` for Rust Cargo builds. How you specify the key determines the uniqueness of the build and its dependencies.
Previously in ICU4X, caching was used for the Rust build to cover cargo's cache of repositories, its index of known crates, and the compiled crate binaries. Differences in the build tool's main config file (`Cargo.toml`) indicate a potential change in dependencies, so the file digest hash of that file is used in constructing the cache key associated with the dependency artifacts' location. The version of Rust matters for compatibility between some dependencies, so it probably needs to be included in the cache key string. (By contrast, for Java, including the language version number in the cache key might be less important since Java byte code is generally forwards compatible with future versions of the language.)
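A cache step following this reasoning might be sketched as below; the paths and key format are assumptions (`hashFiles` is a built-in expression function, and `RUST_VERSION` is assumed to be set in an earlier step):

```yaml
- uses: actions/cache@v2
  with:
    path: |
      ~/.cargo/registry
      ~/.cargo/git
      target
    # OS + Rust version + Cargo.toml digest together determine cache validity
    key: ${{ runner.os }}-rust-${{ env.RUST_VERSION }}-cargo-${{ hashFiles('**/Cargo.toml') }}
```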
Within a job, the cache step (the job step using the cache action `actions/cache`) should be specified early. At the point in the job where it is declared, it will restore any files to the specified path if the specified key exists. Use of the cache action also creates an extra step at the end of the job, if the rest of the job completes successfully. This extra last step constructs the cache key string and, only if the key is not present, stores the files in the cache. Otherwise, the last step is a no-op.
Troubleshooting cache misconfiguration can be tricky, but the most likely error is that the cache key is not specific enough. When that happens, you might expect a new, different set of files to replace the old set of files under that key. However, unlike a map, the cache has "write once" behavior, not "overwrite" behavior, so the save step is a no-op even when the path contains a different set of files than before. On the opposite end, if the key is too specific, then you risk creating duplicate copies of the same files, which defeats the re-use that caching is meant to provide, and may also exceed the cache storage capacity. If you need to change the semantics of a cache key, you can either: 1) append a "-v2" to the end to create a new, unused, unique string, or 2) stop using the key for 7 days, after which the key is evicted from the cache.