Configuration Design Guide

Overview

The entire configuration of a Helix project is contained in a single helix-config.yaml file. The YAML format is moderately more readable than JSON and almost as widely accepted. YAML also offers features such as comments and references, which are missing in JSON.

To make the configuration more robust, its canonical structure is validated with JSON Schema, so errors are detected easily. The schemas also provide a formal way of describing and documenting the configuration format.

TODO:

  • tooling creates canonical effective-helix-config.yaml which contains comments for the source of all values
  • the effective file will/must not be checked into git

Project Layout

  • templates and pre functions are located in ./src
  • client-side JavaScript must not go into ./src. (TODO: we will provide a React example, and put the sources in ./react)
  • all static files go to htdocs
  • all additional cgi-like OpenWhisk actions will go to htdocs/cgi-bin

Strains are the only top-level concept

This means that there are no concepts that are orthogonal to strains, but that every other configuration object can be assigned to one or more strains, and that every request can be assigned to one specific strain.

There will be no reconciliation of strains and some other concept at runtime. Every .hlx/strains.json will contain a fully deterministic, fully resolved and denormalized representation of all strains.

Complex configuration objects can be defined in a definitions container and re-used

In order to make writing configurations easier and to enable re-use of configuration settings, a definitions container at the top level of the helix-config.yaml is allowed that can hold re-usable definitions like this:

definitions:
  strains:
    base: &basestrain
      code: https://github.com/adobe/project-helix.io.git

  origins:
    publish: &publish
      address: 192.168.0.1
      name: publish
      use_ssl: false

Those definitions can then be reused using YAML references (*publish) or YAML extensions (<<: *basestrain). This makes it possible to rely on the behavior of the YAML parser instead of home-grown inheritance and referencing logic.

A strain re-using the *basestrain above would look like this:

  - name: client
    <<: *basestrain
    condition: req.http.host == "client.project-helix.io"
    content:
      repo: helix-cli
      ref: master
      owner: adobe
    directoryIndex: readme.html

A strain referencing the *publish origin would look like this:

  - name: proxy-detailed
    sticky: true
    origin: *publish
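The effect of the two reference styles can be sketched in plain Python, without a YAML parser. This is only an illustration of the semantics, assuming the usual YAML merge-key behavior in which keys written in the strain itself take precedence over merged ones. The helper name `merge_strain` is hypothetical and not part of any Helix tooling.

```python
# Illustration only: mimics what a YAML parser produces for the anchors
# (&basestrain, &publish) and their uses (<<: *basestrain, *publish).

# definitions.strains.base (&basestrain)
basestrain = {"code": "https://github.com/adobe/project-helix.io.git"}

# definitions.origins.publish (&publish)
publish = {"address": "192.168.0.1", "name": "publish", "use_ssl": False}


def merge_strain(base, own):
    """Emulate `<<: *base`: copy the base keys, then let the strain's own keys win."""
    resolved = dict(base)
    resolved.update(own)
    return resolved


# The "client" strain after the merge key has been resolved.
client = merge_strain(basestrain, {
    "name": "client",
    "condition": 'req.http.host == "client.project-helix.io"',
    "content": {"repo": "helix-cli", "ref": "master", "owner": "adobe"},
    "directoryIndex": "readme.html",
})

# A plain alias (*publish) stands for the anchored mapping as a whole.
proxy_detailed = {"name": "proxy-detailed", "sticky": True, "origin": publish}
```

After resolution, `client` carries the `code` value from the base strain next to its own keys, and `proxy_detailed["origin"]` holds the complete publish origin.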

Mandatory default strain.

The default strain serves all traffic in production when no other strain is selected; a default strain is therefore mandatory. This is enforced by the JSON schema. Since all inheritance becomes explicit through the use of YAML references, the default strain is special only with regard to serving traffic and does not alter the behavior of any other strain.
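The rule itself is enforced by the JSON schema; the following Python check is only a minimal sketch of the same constraint. It assumes that the resolved strains are a list of mappings with a `name` key, as in the examples above, and that the default strain is the one named `default`. The function name is hypothetical.

```python
def has_default_strain(strains):
    """Return True if at least one strain is named "default"."""
    return any(strain.get("name") == "default" for strain in strains)


def check_strains(strains):
    """Raise ValueError when the mandatory default strain is missing."""
    if not has_default_strain(strains):
        raise ValueError("configuration must define a strain named 'default'")
    return strains
```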

The code and package properties.

The code property of a strain defines which code repository the strain belongs to. This is useful for allocating strains to different environments, e.g. testing, staging, or production.

The package property records which runtime action package is to be used to handle the requests of this strain. It is usually only modified by the tooling and rarely needs to be altered manually.

The url property

The url property of a strain is currently used as a shorthand for a strain matching condition. It also sets the base URL which is used for path rewriting.

TODO: the url property might be replaced by the condition object

Tooling Notes

hlx deploy

  • In order to increase the visibility of changes happening during deployment, hlx deploy lists all strain names that will be affected by the deployment.

  • hlx deploy gets the current Git remote (git remote get-url origin) as $CURRENT_CODE_REPO and checks all strains for a code property that matches $CURRENT_CODE_REPO.

  • If no strains are affected, hlx deploy will print a new strain config to stdout that points to the new code location ($CURRENT_CODE_REPO) and copies all other values from the default strain except for url or condition. The new strain will have an auto-generated, hard-to-guess name, so that it cannot be accessed unwittingly.

  • When running hlx deploy --add=foo the new strain will be added to the configuration file automatically and hlx deploy will instead show instructions on accessing the strain.

  • When running hlx deploy --add=default the default strain will be created or updated.

  • hlx deploy updates the package property of all affected strains with the SHA of the current branch. It appends -dirty if the current checkout is not clean.
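The matching and update steps above can be sketched in Python as follows. This is not the hlx implementation. It assumes that strains are plain mappings, that the code property is compared to the remote URL by exact string equality, and that the package property holds just the SHA string. All function names are hypothetical.

```python
import subprocess


def current_code_repo():
    """Return the URL of the origin remote (git remote get-url origin)."""
    result = subprocess.run(
        ["git", "remote", "get-url", "origin"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def affected_strains(strains, code_repo):
    """Strains whose code property matches the current code repository."""
    return [s for s in strains if s.get("code") == code_repo]


def package_value(sha, clean):
    """SHA of the current branch, with -dirty appended for an unclean checkout."""
    return sha if clean else sha + "-dirty"


def update_packages(strains, code_repo, sha, clean):
    """Set package on every affected strain and return the affected names."""
    affected = affected_strains(strains, code_repo)
    for strain in affected:
        strain["package"] = package_value(sha, clean)
    return [s["name"] for s in affected]
```

The list returned by `update_packages` corresponds to the strain names that the first bullet says are shown during deployment.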

Notes:

  • Q: Why is it important to have a random name? Why not use the branch name?
  • A: Two reasons:
    • to avoid conflicts
    • to prevent people from forging the X-Strain cookie and getting access to development- or staging-only strains.
  • A deployment that does not affect any strains will have a non-zero exit code, so that it can fail in CI.
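A minimal Python sketch of the proposed strain, its random name, and the exit code might look like this. The name format (16 hexadecimal characters from the standard `secrets` module) and all function names are assumptions made for illustration; the text only requires the name to be auto-generated and hard to guess.

```python
import secrets


def random_strain_name(nbytes=8):
    """Hard-to-guess strain name (hypothetical format: 2 * nbytes hex characters)."""
    return secrets.token_hex(nbytes)


def propose_strain(default_strain, code_repo):
    """Copy the default strain without url/condition and point it at the new code."""
    proposal = {
        key: value
        for key, value in default_strain.items()
        if key not in ("url", "condition")
    }
    proposal["name"] = random_strain_name()
    proposal["code"] = code_repo
    return proposal


def deploy_exit_code(affected_names):
    """Non-zero when the deployment affected no strains, so that CI can fail."""
    return 0 if affected_names else 1
```

Using a cryptographically strong generator rather than the branch name addresses both reasons given above: names do not collide in practice, and they cannot be guessed in order to forge the X-Strain cookie.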

hlx publish

  • hlx publish will only update the strains that contain a package property.

hlx up

During local development, the simulator behaves similarly to the edge and selects the strain based on the url property (later on the condition). In order to simulate a specific domain for testing, hlx up accepts a --host argument, which overrides the request.host header in the simulator.
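The selection with a host override can be sketched in Python as below. This is a simplified illustration, not the simulator's code: it assumes that only the host name of the url property is compared (path and port are ignored) and that the strain named `default` is the fallback. The function and parameter names are hypothetical.

```python
from urllib.parse import urlsplit


def select_strain(strains, request_host, host_override=None):
    """Pick the strain whose url host matches; fall back to the default strain."""
    # A --host argument, if given, replaces the host of the incoming request.
    host = host_override or request_host
    for strain in strains:
        url = strain.get("url")
        if url and urlsplit(url).hostname == host:
            return strain
    # No match: the mandatory default strain serves the request.
    return next(s for s in strains if s["name"] == "default")
```

For example, a request arriving on `localhost` is served by the default strain, while passing `client.project-helix.io` as the override selects a strain whose url points to that host.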

hlx up also provides a usable default configuration, in case a helix-config.yaml is missing. This config can be persisted using the --save-config argument.

Open Discussions

Parallel Deployments in a CI environment

Concurrent deployments from a CI environment pose a hard problem at the moment:

  1. if the strain configuration cannot be modified by the deployment action, a new strain must be introduced for every new branch, prior to the creation of the branch, so that the branch can be deployed, published and tested
  2. if the strain configuration can be modified by the deployment action, a new strain can be introduced, but as the helix-config.yaml is the single point of synchronization, publishing in one branch will automatically deactivate all strains that refer to concurrent deployments made in different branches. In active development, this will lead to race conditions and intermittent test failures because the strain you just published got "un-published" by a build triggered in another branch.
  3. even within one branch, making two commits in quick succession will lead to the second commit tainting the deployment of the first build, with potentially misleading results. For instance, a bug fix might get attributed to the first commit because this is where testing first succeeds, although the fix was deployed in the second commit.

Proposal: Temporary Strains

For testing a deployment in a Continuous Integration environment, it can be useful to have strains that are not persisted in the helix-config.yaml, but can still be activated for testing.

To enable temporary strains, the strain resolution logic in VCL will be modified so that when an X-Strain cookie or header is present and the value of the header contains a /, both the X-Strain (name) and X-Tag (tag) will be parsed. For the most part, the default logic of the current X-Strain will be applied, so that the strain's directory index, static repo, etc. will be used. The only exception is the resolution of the OpenWhisk action to execute. Here, the -git--([\w]+)-- pattern will be replaced with the X-Tag value, effectively pinning the used action to the tagged deployment.
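The actual change would be made in VCL; the Python sketch below only illustrates the parsing and substitution logic. Two points are assumptions: the action name used in the usage note is hypothetical, and the sketch replaces only the captured part of the pattern, keeping the surrounding `-git--` and `--` delimiters, which the proposal does not spell out.

```python
import re

# Pattern quoted in the proposal; the capture group is the deployed ref.
GIT_REF = re.compile(r"-git--([\w]+)--")


def parse_x_strain(value):
    """Split an X-Strain value into (name, tag); tag is None without a "/"."""
    if "/" not in value:
        return value, None
    name, tag = value.split("/", 1)
    return name, tag


def pin_action(action, tag):
    """Pin the action name to a tagged deployment by swapping in the tag."""
    if tag is None:
        return action
    # A function is used as replacement so the tag is inserted literally.
    return GIT_REF.sub(lambda _match: f"-git--{tag}--", action)
```

With these definitions, `parse_x_strain("client/abc123")` yields the strain name `client` and the tag `abc123`, and `pin_action("project-git--master--html", "abc123")` rewrites the hypothetical action name to `project-git--abc123--html`. A value without a `/` leaves the action untouched, which matches the existing X-Strain behavior.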

As this will lead to a large number of temporary actions in OpenWhisk, we also introduce a hlx undeploy command that clears a temporary deployment made earlier. This can be run at the end of every CI job.

hlx test and hlx perf should use the new temporary strains when running in CI.