From 59f0c680e14ce900bbb8944602fdaf19280ed1bc Mon Sep 17 00:00:00 2001
From: Dan Delany
Date: Mon, 16 Dec 2024 14:19:52 -0800
Subject: [PATCH] Add details to Deployment page + new Production Deployment page with guide (#203)

* rewrite Deployment page to add detail, add new Production Deployment page with guide

* fix links, small improvements
---
 docs/deployment/introduction.md          | 53 +++++++++++----
 docs/deployment/production-deployment.md | 82 ++++++++++++++++++++++++
 sidebars.js                              |  1 +
 3 files changed, 123 insertions(+), 13 deletions(-)
 create mode 100644 docs/deployment/production-deployment.md

diff --git a/docs/deployment/introduction.md b/docs/deployment/introduction.md
index f2fc285..bdf744e 100644
--- a/docs/deployment/introduction.md
+++ b/docs/deployment/introduction.md
@@ -1,13 +1,49 @@
 # Deployment
 
-If you are in a hurry and want to get Aerie running locally quickly, please see the [fast track](/introduction/#fast-track) deployment instructions. This document goes into more depth about the Aerie system and how it should be deployed.
+There are a few different ways to deploy Aerie:
+
+- To get Aerie running **quickly** on your computer, see the [fast track instructions](/introduction/#fast-track) for minimal setup.
+- If you plan to deploy Aerie in a shared **production environment**, read this entire page and then see the [production deployment guide](/deployment/production-deployment).
+- If you are a **developer** and you want to run Aerie locally & make changes to Aerie core code, read this page and then head to the [developer guide](https://github.com/NASA-AMMOS/aerie/blob/develop/docs/DEVELOPER.md) in the repository for local setup instructions.
+
+The rest of this document goes into more depth about the Aerie system and how it should be deployed, regardless of environment.
+
+## Aerie Releases
+
+Aerie releases are published on the [Github Releases page](https://github.com/NASA-AMMOS/aerie/releases), and each release has a `Deployment.zip` artifact attached. This folder contains everything necessary to deploy a version of Aerie - namely the **`docker-compose.yml`** and **`.env`** files, detailed below. These files are provided *as a starting point* and should be modified to suit your needs.
+
+## Environment Variables
+
+Each Aerie service is configured with environment variables, some of which are **required** to run. They are expected to be set in a `.env` file in the folder you're running Aerie from. The version of this file provided in `Deployment.zip` is an empty template that must be filled in with service usernames and passwords of your choosing. See [this .env.template file](https://github.com/NASA-AMMOS/aerie-mission-model-template/blob/main/.env.template) for a completed example.
+
+A description of allowed variables is found in the [Environment Variable Documentation](https://github.com/NASA-AMMOS/aerie/blob/develop/deployment/Environment.md) - it's recommended to read through these & determine which are relevant to your situation.
+
+Of note, the `aerie-merlin`, `aerie_merlin_worker`, `aerie-scheduler`, and `aerie-scheduler-worker` containers can be provided additional JVM arguments - for example, allocated heap size - as environment variables. Desired JVM flags should be added to the `JAVA_OPTS` environment variable for the container being configured.
 
 ## Docker
 
-Aerie uses [Docker](https://www.docker.com/) as it's main deployment infrastructure. The artifacts used to deploy Aerie are a collection of [OCI](https://opencontainers.org/) images stored on [GitHub Packages](https://github.com/orgs/NASA-AMMOS/packages?ecosystem=container&q=aerie). Here is the list of required images, their description, default port, and if they should be public (exposed to the open network):
+Aerie consists of multiple **services**, and uses [Docker](https://www.docker.com/) and [Docker Compose](https://docs.docker.com/compose/) to manage and run them. The artifacts used to deploy Aerie are a collection of Docker **images**, one per service, which we publish to the public [GitHub Packages](https://github.com/orgs/NASA-AMMOS/packages?ecosystem=container&q=aerie) repository. Aerie images conform to the [OCI](https://opencontainers.org/) [Image Format](https://github.com/opencontainers/image-spec/blob/main/spec.md) and may be compatible with Docker alternatives, but only Docker is officially supported.
+
+[Docker Compose](https://docs.docker.com/compose/) commands are used to build and run the Aerie services **all together**, so in general you should only need to run `docker compose up` & `docker compose down` (along with some various [command flags](https://docs.docker.com/reference/cli/docker/compose/)) to start and stop Aerie.
+
+### `docker-compose.yml`
+
+Docker Compose uses a configuration file called **`docker-compose.yml`** to control all sorts of options for the Aerie services. The compose file provided in `Deployment.zip` should work as-is, but modifying this file is one of your most useful tools for controlling deployment-specific Aerie configuration options. Options in this file control:
+
+- The source & version (tag) of the image used for each service (in the `image` field)
+- The network ports used by each service (in `ports`)
+- The directories used as mounted file volumes (in `volumes`)
+- and other various environment variables passed to each service (in `environment`)
+
+A full list of possible options can be found in the [Docker compose file reference](https://docs.docker.com/reference/compose-file/).
+
+### Aerie services & images
+
+The following is a list of all of the required Aerie services, their associated Docker images (to be run by Compose), and their default network ports. The `ui`, `gateway` and `hasura` services are all "public-facing", which means their ports must be exposed to the network when running in a shared/production environment.
 
 | Image                                      | Description                                                     | Port  | Public |
 | ------------------------------------------ | --------------------------------------------------------------- | ----- | ------ |
+| [aerie-ui][ui]                             | The web-based client application for Aerie.                     | 80    | ✅     |
 | [aerie-gateway][gateway]                   | Gateway server used for file-upload and authentication.         | 9000  | ✅     |
 | [aerie-hasura][hasura]                     | Hasura Docker image with bundled Aerie-specific Hasura metadata | 8080  | ✅     |
 | [aerie-merlin][merlin]                     | Service for planning and simulation                             | 27183 | ❌     |
@@ -16,17 +52,7 @@ Aerie uses [Docker](https://www.docker.com/) as it's main deployment infrastruct
 | [aerie-scheduler][scheduler]               | Service for scheduling                                          | 27185 | ❌     |
 | [aerie-scheduler-worker][scheduler-worker] | Worker for executing scheduling goals                           | 27189 | ❌     |
 | [aerie-sequencing][sequencing]             | Service for sequence generation and management                  | 27184 | ❌     |
-| [aerie-ui][ui]                             | The web-based client application for Aerie.                     | 80    | ✅     |
 
-You can launch Aerie via [Docker Compose](https://docs.docker.com/compose/) using our template [docker-compose.yml](https://github.com/NASA-AMMOS/aerie-mission-model-template/blob/main/docker-compose.yml) file.
-
-If you need a more custom deployment you can use the Aerie [deployment directory](https://github.com/NASA-AMMOS/aerie/blob/develop/deployment), which we include with each [release](https://github.com/NASA-AMMOS/aerie/releases). For example if you want to run Hasura and Postgres outside of a Docker container (recommended for larger deployments), the deployment `.zip` file included in the release contains all the Hasura metadata and `.sql` files needed to spin up those services on their own.
-
-## Environment Variables
-
-Each Aerie service is configured with environment variables. A description of those variables is found in the [Environment Variable Documentation](https://github.com/NASA-AMMOS/aerie/blob/develop/deployment/Environment.md).
-
-Of note, the `aerie-merlin`, `aerie_merlin_worker`, `aerie-scheduler`, and `aerie-scheduler-worker` containers can be provided additional JVM arguments as environment variables. For example one may choose to configure the JVM allocated heap size. On must provide any desired JVM flags to the `JAVA_OPTS` environment variable for the container being configured.
 
 ## System Requirements
 
@@ -57,7 +83,8 @@ Note these numbers are lower bounds. You will need to scale Aerie based on your
 
 ## Defect Reporting Procedure
 
-All defect reports should go to `aerie_support@jpl.nasa.gov`.
+Defect reports should be sent to: `aerie-support@googlegroups.com`. For chat-based support, please join us on the [NASA-AMMOS Slack](https://join.slack.com/t/nasa-ammos/shared_invite/zt-1mlgmk5c2-MgqVSyKzVRUWrXy87FNqPw), in the `#aerie-users` channel.
+
 
 [gateway]: https://github.com/orgs/NASA-AMMOS/packages/container/package/aerie-gateway
 [hasura]: https://github.com/orgs/NASA-AMMOS/packages/container/package/aerie-hasura
diff --git a/docs/deployment/production-deployment.md b/docs/deployment/production-deployment.md
new file mode 100644
index 0000000..03b92af
--- /dev/null
+++ b/docs/deployment/production-deployment.md
@@ -0,0 +1,82 @@
+# Production Deployment
+
+This document describes some of the things you'll need to consider when deploying Aerie to a shared production server, and provides a guide for configuring and running Aerie in your environment.
+
+:::danger
+
+Aerie allows execution of user-provided code in the simulation and scheduling environments, so it is important to protect your environment from being accessed by anonymous Internet users. The safest way to deploy Aerie is with *network-level access controls*, limiting access to eg. only IP addresses on your local network or VPN. You may want to also use an [authentication adapter](/deployment/advanced-sso/) to implement user-level access controls. Do not run Aerie on the public internet without one or both of these controls in place!
+
+:::
+
+## Infrastructure details
+
+Before deployment, consider how you plan to handle these details about your infrastructure:
+
+* **Domain name** - if you are using a domain name for your Aerie services, make sure you configure the DNS to point to your server's IP address, whether internal or public. This name will also be used in your docker-compose configuration - see below.
+* **HTTPS** - we recommend that Aerie deployments, **especially** those on public networks, use HTTPS with a valid TLS certificate. You can use a TLS certificate from an authority like [Let's Encrypt](https://letsencrypt.org/), or a self-signed certificate from your organization for deployments on a private network.
+* **Reverse proxy/load balancer** - _if_ you are using HTTPS, you will need to configure an additional service to be the single ingress point for all (HTTPS) network traffic, strip TLS and forward HTTP traffic to the Aerie services. One way to accomplish this is with the **[reverse proxy pattern](/deployment/advanced-reverse-proxy/)**, for which an example is provided. If you are using AWS infrastructure, you may find it easier to manage your certificate with [Amazon Certificate Manager](https://docs.aws.amazon.com/acm/latest/userguide/acm-overview.html) and route all traffic through an [Application Load Balancer](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html) instead of using a reverse proxy.
+* **Authentication/SSO** - Aerie **does not include** its own fully-featured implementation of user authentication, and should either be used with an **"authentication adapter,"** or in a private network with limited access. We provide an adapter for use with JPL CAM, a JPL-internal SSO solution - [instructions here](/deployment/advanced-sso/). Users needing to integrate with other SSO/auth systems may need to write their own auth adapters following the same pattern.
+
+
+## Production Deployment Guide
+
+This checklist outlines a set of steps for running a minimal production Aerie deployment. Note that these will vary somewhat from one environment to another, so feel free to adapt them as needed.
+
+1. Ensure you have **Docker Engine and Docker Compose installed** on your server. We recommend following the official [Engine install guide](https://docs.docker.com/engine/install/) and [Compose install guide](https://docs.docker.com/compose/install/#scenario-two-install-the-docker-compose-plugin) for your platform, as simply running eg. `yum install docker` may install Podman instead of Docker on some platforms.
+2. Ensure your server has the necessary **network ports exposed** for Aerie services - namely, ports **80, 8080, and 9000**, unless you plan to modify these default ports in the docker-compose file. Port rules are usually configured via your server's firewall settings. See [Aerie services & images](/deployment/introduction/#aerie-services--images) for details on services and their port assignments. If you are running on an AWS EC2 instance, you may need to set rules for the instance's *security group* to allow these ports to send & receive TCP traffic.
+3. Copy the `Deployment.zip` file from an [Aerie release](https://github.com/NASA-AMMOS/aerie/releases) to your server and extract it, for example:
+    ```
+    curl -sLO https://github.com/NASA-AMMOS/aerie/releases/download/v3.1.1/Deployment.zip
+    unzip Deployment.zip
+    tar -xf deployment.tar
+    ```
+4. Modify the `.env` file to fill in the required variables - see [Environment Variables](/deployment/introduction/#environment-variables). Importantly, all services need usernames and passwords set, and Hasura needs a secret key - see eg. [this completed example](https://github.com/NASA-AMMOS/aerie-mission-model-template/blob/main/.env.template).
+5. Modify the [`docker-compose.yml` file](/deployment/introduction/#docker-composeyml) for your specific environment.
+    - A useful pattern is to **[merge Compose files](https://docs.docker.com/compose/how-tos/multiple-compose-files/merge/)** when running Aerie, to keep your custom compose file changes separate from the original file provided by the deployment, rather than modifying the original. You can create a file called eg. `docker-compose.prod.yml` which contains *only* the overriding changes you want to make to the original file. Then, when running your services, you can pass them both to Compose [with the `-f` flag](https://docs.docker.com/compose/how-tos/multiple-compose-files/merge/).
+    - If you are using [CAM/SSO authentication adapters](/deployment/advanced-authentication/) and/or [a reverse proxy](/deployment/advanced-reverse-proxy/), review their docs to determine the Compose file modifications they require.
+    - Regardless of your other settings, your compose file needs to provide the `aerie-ui` service with the **fully-qualified domain names** (FQDNs) it will use to make requests to the other services. This is generally done by adding an additional variable to the `.env` file with your base domain, eg.:
+      ```
+      AERIE_HOST="myaerie.myorg.com"
+      ```
+      and then adding the following lines to your `docker-compose.prod.yml` file:
+      ```
+      aerie_ui:
+        environment:
+          ORIGIN: https://${AERIE_HOST}
+          PUBLIC_GATEWAY_CLIENT_URL: https://${AERIE_HOST}:9000
+          PUBLIC_HASURA_CLIENT_URL: https://${AERIE_HOST}:8080/v1/graphql
+          PUBLIC_HASURA_WEB_SOCKET_URL: wss://${AERIE_HOST}:8080/v1/graphql
+      ```
+      If you do not have a proper domain name set up yet, you can use your server's IP address or any other FQDN you have available.
+6. Finally, use the `up` command to run all of the Aerie services in their docker containers, eg.:
+    ```
+    docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
+    ```
+    After a few seconds, you can check on the status of the services by running `docker ps`. It may take a minute or two to fully initialize the system, but eventually all services should show an "Up" status. If not, check your Docker logs for errors from the services (see [Logging](#logging) below).
+
+## Other Considerations
+
+### Data persistence and backups
+It's a good idea to have a strategy for backing up and restoring your Aerie data in case something goes wrong with your server. Aerie mainly persists data in two ways:
+* The Postgres database, managed by the `aerie-postgres` container, stores the majority of user-created data such as plans and simulation runs.
+* Some data is also stored in the filesystem of the Aerie container, such as uploaded mission model JAR files.
+
+Both types of data are persisted using [Docker Volumes](https://docs.docker.com/engine/storage/volumes/). One way to handle backups is simply to copy all data out of the Aerie volumes (to another location, off of your main instance) on a eg. nightly basis.
+
+It may be useful sometimes to just backup the state of your Postgres database using a utility like `pg_dump` eg. before performing complex database operations - just remember this does not backup the files in the filesystem and therefore is only useful for recovering from database issues.
+
+### Upgrading your Aerie Environment
+
+Aerie releases new versions roughly every two weeks, and eventually you may want to upgrade your environment to a new version. If you want to preserve your environment's data from the previous version, you should take care to upgrade and migrate your data forward in a safe way:
+* Carefully read the [changelogs on the Releases page](https://github.com/NASA-AMMOS/aerie/releases) and the [upgrade guides](https://nasa-ammos.github.io/aerie-docs/upgrade-guides/3-1-1-to-3-2-0/) for all versions between your old version and the one you're upgrading to, and keep note of any breaking changes.
+* Perform a backup of your database and/or Docker volumes before upgrading.
+* To perform the upgrade:
+    - Stop your docker containers with `docker compose down`
+    - Update the `DOCKER_TAG` environment variable to the new desired Aerie version
+    - If necessary, modify your docker-compose or any other deployment options to deal with breaking changes
+    - Bring your docker containers back up with `docker compose up`
+    - Follow the instructions on the [Database Migrations page](/deployment/advanced-database-migrations/) to run the migration script, which will automatically migrate your data to be compatible with the new version.
+
+### Logging
+
+By default, Docker saves logs on the local filesystem for all of your Aerie services, and [displays them with the command `docker logs `](https://docs.docker.com/reference/cli/docker/container/logs/). However, these logs are somewhat ephemeral and may be overwritten in time. If you care about retroactively investigating and debugging issues encountered by your users, it's a good idea to have a log rotation strategy and to save your logs in a more permanent archive, outside of your environment.
diff --git a/sidebars.js b/sidebars.js
index df28d5b..d094dd2 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -99,6 +99,7 @@ const sidebars = {
           type: 'link',
           href: '/introduction/#fast-track',
         },
+        'deployment/production-deployment',
         'deployment/advanced-ui-custom-base-path',
         'deployment/advanced-kubernetes',
         'deployment/advanced-database-migrations',
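
Note on steps 5 and 6 of the new Production Deployment Guide: the override-file pattern they describe might look roughly like the shell sketch below. It is an illustration under stated assumptions, not part of the patch. The `write_prod_override` helper is hypothetical, and nesting `aerie_ui` under a top-level `services:` key is an assumption based on the Compose file format (the patch shows only the `aerie_ui:` fragment). The script writes the override file and prints the Compose command instead of running it, so it has no effect on a running deployment.

```shell
#!/bin/sh
# Hypothetical helper: write a Compose override file for the aerie_ui service.
# Assumption: the base docker-compose.yml declares aerie_ui under `services:`.
write_prod_override() {
  out_file="$1"
  # The quoted heredoc delimiter keeps ${AERIE_HOST} literal, so Compose
  # (not this shell) substitutes it from the .env file when it runs.
  cat > "$out_file" <<'EOF'
services:
  aerie_ui:
    environment:
      ORIGIN: https://${AERIE_HOST}
      PUBLIC_GATEWAY_CLIENT_URL: https://${AERIE_HOST}:9000
      PUBLIC_HASURA_CLIENT_URL: https://${AERIE_HOST}:8080/v1/graphql
      PUBLIC_HASURA_WEB_SOCKET_URL: wss://${AERIE_HOST}:8080/v1/graphql
EOF
}

write_prod_override docker-compose.prod.yml

# Print, rather than run, the merge command from step 6 of the guide.
echo "docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d"
```

Keeping the overrides in a separate file means a later `Deployment.zip` can replace `docker-compose.yml` without losing site-specific settings, which is the motivation the guide gives for merging Compose files.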