This repo contains terraform configurations to deploy the data.gov sandbox environment.
Note: production and staging environments are hosted in BSP and are not provisioned with terraform.
This environment attempts to keep parity with the BSP environments. The purpose is to act as a continuous integration environment to test Ansible playbooks live in a multi-host environment.
- Terraform 1.0.6+
Create the s3 bucket (datagov-terraform-state) to hold the terraform state defined in main.tf.
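As a rough sketch, the bucket could be created with awscli along these lines. The function name `create_state_bucket`, the default region, and the decision to enable versioning are assumptions for illustration; check main.tf for the region the backend actually expects.

```shell
# Hypothetical helper: create the Terraform state bucket with awscli.
# The region default (us-east-1) is an assumption, not taken from main.tf.
create_state_bucket() (
  bucket=$1
  region=${2:-us-east-1}
  if [ "$region" = "us-east-1" ]; then
    # us-east-1 does not accept a LocationConstraint.
    aws s3api create-bucket --bucket "$bucket" --region "$region" || exit 1
  else
    aws s3api create-bucket --bucket "$bucket" --region "$region" \
      --create-bucket-configuration "LocationConstraint=$region" || exit 1
  fi
  # Optional: keep old versions of the state file so mistakes can be rolled back.
  aws s3api put-bucket-versioning --bucket "$bucket" \
    --versioning-configuration Status=Enabled
)
```

It would be called as `create_state_bucket datagov-terraform-state` once valid AWS credentials are loaded.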
- Configure AWS Access Key
- jq
- awscli
- terraform v1.0.6+
These tools are available through your package manager, or through pip.
All developers are in the developers IAM group which enforces access through multi-factor authentication (MFA). You must first get temporary credentials to use with Terraform.
First, copy env.sample to .env and customize it with your AWS access key. Set AWS_MFA_DEVICE_ARN to your MFA device ARN, which can be found on the "My Security Credentials" page in the AWS console. Then source these environment variables.
$ source .env
You'll be prompted for your MFA code. Enter it without any spaces when prompted.
These credentials are good for 12 hours.
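The contents of env.sample are not shown here, so the following is only a sketch of how temporary MFA credentials can be requested with awscli. The function names `creds_to_exports` and `mfa_login` are hypothetical; the 43200-second duration matches the 12 hours mentioned above.

```shell
# Hypothetical helper: read one line of three tab-separated fields
# (access key id, secret key, session token) and print export statements.
creds_to_exports() (
  IFS="$(printf '\t')" read -r key secret token
  printf "export AWS_ACCESS_KEY_ID='%s'\n" "$key"
  printf "export AWS_SECRET_ACCESS_KEY='%s'\n" "$secret"
  printf "export AWS_SESSION_TOKEN='%s'\n" "$token"
)

# Hypothetical wrapper: prompt for the MFA code and request credentials
# valid for 12 hours. Assumes AWS_MFA_DEVICE_ARN is already set.
mfa_login() (
  printf 'MFA code: ' >&2
  read -r code
  aws sts get-session-token \
    --serial-number "$AWS_MFA_DEVICE_ARN" \
    --token-code "$code" \
    --duration-seconds 43200 \
    --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
    --output text | creds_to_exports
)
```

Running `eval "$(mfa_login)"` would then load the temporary credentials into the current shell.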
When making changes to the datagov-infrastructure-modules, you can either point the module source to a branch or use a local path. For example:
module "solr" {
  source = "github.com/gsa/datagov-infrastructure-modules.git//modules/solr?ref=feature-terraform-12"
  # ...
}
Becomes:
module "solr" {
  source = "../../datagov-infrastructure-modules//modules/solr"
  # ...
}
The initial provisioning requires SSH access to the jumpbox. Since the jumpbox playbooks have not been run, you must use the environment's root SSH key. Please see a team member for access.
Add the SSH key to your SSH agent.
ssh-add ~/.ssh/datagov-sandbox
You must set up ssh-agent forwarding. Add this snippet to your SSH config.
# ~/.ssh/config
Host *.datagov.us
  ForwardAgent yes
Copy env.sample to .env. Then you can populate these secrets from the Terraform state file. These secrets will also exist in the Ansible vault.
$ terraform refresh
$ terraform output
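To avoid copying values by hand, a single output can be turned into a line for .env. This is a sketch only: `tf_var_line` is a hypothetical helper, and it assumes the Terraform output name matches the variable name expected in .env, which may not hold for every secret.

```shell
# Hypothetical helper: print an export line for one Terraform output.
# Assumes the output name equals the TF_VAR_* variable name.
tf_var_line() (
  name=$1
  # -raw prints the bare value of a string output, without quotes.
  value=$(terraform output -raw "$name") || exit 1
  printf "export TF_VAR_%s='%s'\n" "$name" "$value"
)
```

For an output named `db_password` (an example name, not one confirmed by this repo), `tf_var_line db_password >> .env` would append the corresponding line.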
Once the environment is provisioned with Terraform, you can connect to the jumpbox to apply Ansible playbooks just as we do in the BSP environments.
Forward your ssh agent so that you have access to the SSH key to connect to other instances. Consider adding this to your ~/.ssh/config.
Host jump.sandbox.datagov.us
  User <yourusername>
  ForwardAgent yes
  IdentityFile ~/.ssh/<your-datagov-deploy-key>
Connect to the jumpbox.
$ ssh $jumpbox_dns
The jumpbox DNS name is an output variable in the jumpbox module.
$ terraform output
When the jumpbox is first created, you'll need to bootstrap it to run ansible. You can copy/paste these scripts into your terminal. All commands should be run from the jumpbox.
First, install pyenv.
sudo apt-get update; sudo apt-get install --no-install-recommends make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev python3-pip
git clone https://github.com/pyenv/pyenv.git ~/.pyenv
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n eval "$(pyenv init -)"\nfi' >> ~/.bashrc
source ~/.bashrc
Set up SSH.
cat <<EOF > ~/.ssh/config
StrictHostKeyChecking=no
Host *.datagov.us
  User ubuntu
  IdentityFile ~/.ssh/authorized_keys
EOF
Then set up datagov-deploy.
git clone https://github.com/GSA/datagov-deploy.git
cd datagov-deploy
pip3 install --user pipenv
pyenv install
pipenv sync
pipenv run make vendor
Symlink the inventory to avoid having to specify it with ansible.
sudo mkdir /etc/ansible
sudo ln -s /home/ubuntu/datagov-deploy/ansible/inventories/sandbox /etc/ansible/hosts
We use GitHub Actions for continuous integration and delivery. As with all of our code repositories, changes to the main branch are automatically deployed.
As part of CI, the terraform plan will be posted to the PR as a comment. The plan represents the actions terraform will take once approved. Both the author and the reviewer are responsible for reviewing the plan in addition to the code changes.
You must configure GH with secrets in order to apply the terraform files.
- AWS IAM credentials of the deploy user (see GSA/datagov-iam)
- Application secrets to set (e.g. database passwords)
- Root ssh keys in order to provision through the jumpbox
First, set these environment variables in GH using the credentials from the deploy user (see GSA/datagov-iam):
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
Next, set any TF_VAR_* environment variables in GH from your .env. Reach out to a team member if you are missing any, or pull them from the terraform state (terraform output).
Finally, add the root ssh key (datagov-sandbox) as the SSH_DATAGOV_SANDBOX GH secret.
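The TF_VAR_* step above can be scripted if you prefer not to enter each secret through the web UI. This sketch assumes the GitHub CLI (gh) is installed and authenticated for this repository, which the setup above does not require; both function names are hypothetical.

```shell
# List the names of all TF_VAR_* variables exported in the current shell.
tf_var_names() {
  env | sed -n 's/^\(TF_VAR_[A-Za-z0-9_]*\)=.*/\1/p'
}

# Hypothetical: store each TF_VAR_* variable as a GitHub Actions secret.
# Assumes .env has been sourced and gh is logged in.
push_tf_var_secrets() (
  for name in $(tf_var_names); do
    # Names are restricted to [A-Za-z0-9_] above, so this eval is safe.
    eval "value=\$$name"
    gh secret set "$name" --body "$value"
  done
)
```

Run `push_tf_var_secrets` from the repository checkout after sourcing .env.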
Modules in modules/ are true Terraform modules and encapsulate configuration for a Data.gov component.
Instances must be manually added to the static sandbox hosts file. This gives us full control to assign hosts to Ansible groups within GSA/datagov-deploy without having to make changes within datagov-infrastructure-live. For example:
[solr]
datagov-solr1tf.internal.sandbox.datagov.us
[harvester]
catalog-harvester1tf.internal.sandbox.datagov.us
Tests include light terraform syntax validation. Don't forget to run the tests.
$ make test
You might also want to standardize the syntax in your files.
$ terraform fmt
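If you want a quick local check before pushing, something like the following could work. It is a sketch and not necessarily what `make test` runs; `check_terraform` is a hypothetical name, and `terraform validate` expects the working directory to have been initialized with `terraform init`.

```shell
# Hypothetical pre-push check: fail if any file needs reformatting,
# then run Terraform's built-in configuration validation.
check_terraform() (
  # -check exits non-zero instead of rewriting files; -recursive also
  # covers subdirectories such as modules/.
  terraform fmt -check -recursive || exit 1
  terraform validate
)
```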