# The DataHQ Kubernetes Environment

This repo manages datahub.io infrastructure as code.

DataHub.io uses a microservices architecture. We use Docker for containerisation and Kubernetes for orchestration.

This repo focuses on the orchestration with Kubernetes. It assumes each service is responsible for its own dockerisation and for publishing its images to a container registry (e.g. Docker Hub).
If the cluster does not yet exist, follow the docs to boot it up.
If you want to boot a DataHub instance manually or just play around ...

- [Prerequisites] Install docker 😄
- Run a local docker environment with all the tools installed:

  ```
  docker run -it --entrypoint bash -v `pwd`:/ops orihoch/sk8s-ops
  ```
- Authenticate with Google Cloud Platform:

  ```
  gcloud auth login
  ```
Note: most of the time this will all run automatically. Here's how ...

- Every time a service builds, its Travis build updates this repo.
- The Travis script in this repo then gets triggered and automatically updates the relevant k8s clusters ... see `.travis.yml` for details.
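The per-service trigger can be sketched as a small helper that records the freshly built image tag in this repo before committing and pushing. This is an illustrative sketch only: the function name and file format are assumptions, not this repo's actual implementation.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: record a freshly built image tag so the deploy
# repo's Travis run can upgrade the cluster. Function name and file
# format are assumptions.
record_image_tag() {
  local service="$1" tag="$2" file="${3:-values.auto-updated.yaml}"
  # drop any previous entry for this service, then append the new tag
  if [ -f "$file" ]; then
    grep -v "^${service}-image:" "$file" > "$file.tmp" || true
    mv "$file.tmp" "$file"
  fi
  echo "${service}-image: ${tag}" >> "$file"
  # in the real flow this would be followed by: git commit && git push
}
```

Keeping one line per service makes the update idempotent: re-running a build simply replaces the previous tag.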
DevOps team: adding a new service, or updating configuration ...

- Using Linux / OSX? Use Docker
- Using Windows? Use Google Cloud Shell
- Start a bash shell with all required dependencies and the deploy code:

  ```
  docker run -it --entrypoint bash -e OPS_REPO_SLUG=datahq/deploy orihoch/sk8s-ops
  ```

  If you want to install the dependencies locally instead, see these Dockerfiles: sk8s-ops, cloud-sdk-docker
- Authenticate with Google Cloud Platform:

  ```
  gcloud auth login
  ```
- Clone the deploy repo and change into its directory:

  ```
  git clone https://github.com/datahq/deploy.git
  cd deploy
  ```
- Start a bash shell with all required dependencies and the host's `deploy` directory mounted as a volume:

  ```
  docker run -it --entrypoint bash -v `pwd`:/ops orihoch/sk8s-ops
  ```
- Authenticate with Google Cloud Platform:

  ```
  gcloud auth login
  ```
- Install Minikube according to the instructions in the latest release notes
- Create the local minikube cluster:

  ```
  minikube start
  ```
- Verify you are connected to the cluster:

  ```
  kubectl get nodes
  ```
- Install the helm client
- Initialize helm:

  ```
  helm init --history-max 2 --upgrade --wait
  ```
- Verify the helm version on both client and server (should be v2.8.2 or later):

  ```
  helm version
  ```
- Clone the deploy repo and change into its directory:

  ```
  git clone https://github.com/datahq/deploy.git
  cd deploy
  ```
- Switch to the minikube environment:

  ```
  source switch_environment.sh minikube
  ```
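Under the hood, switching environments mostly means pointing kubectl at the right context and recording the environment name. A minimal sketch follows, assuming each environment name maps directly to a kubectl context of the same name; the variable name is an assumption and the real `switch_environment.sh` may do more.

```shell
# Minimal sketch of an environment switch (assumed variable name and
# context mapping; not the actual switch_environment.sh).
switch_environment() {
  local env="$1"
  export K8S_ENVIRONMENT_NAME="$env"
  # only call kubectl when it is installed, and tolerate a missing context
  if command -v kubectl >/dev/null 2>&1; then
    kubectl config use-context "$env" || true
  fi
}
```

Because the script is `source`d rather than executed, the exported variable stays in your shell for the scripts that run afterwards.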
All code assumes you are inside a bash shell with the required dependencies, connected to the relevant environment.

Deployments are managed using Helm.

Initialize the Helm server-side component (Tiller):

```
kubectl create -f rbac-config.yaml
helm init --service-account tiller --upgrade --force-upgrade --history-max 2
```

Deploy all charts (if the dry run succeeds):

```
./helm_upgrade_all.sh --install --debug --dry-run && ./helm_upgrade_all.sh --install
```

You can also upgrade a single chart:

```
./helm_upgrade_external_chart.sh socialmap
```

The helm_upgrade scripts forward all arguments to the underlying `helm upgrade` command. Some useful arguments:
- For the initial installation you should add `--install`
- Depending on the changes you might need to add `--recreate-pods` or `--force`
- For debugging you can also use `--debug` and `--dry-run`
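The forwarding pattern itself can be sketched as a loop over the external charts that passes `"$@"` straight through to each per-chart upgrade. This is a sketch, not the actual script; the release-name-equals-directory-name convention is an assumption, and `HELM` is made overridable here only so the loop can be exercised without a live cluster.

```shell
# Sketch of argument forwarding over all external charts.
HELM="${HELM:-helm}"
upgrade_all() {
  local chart release
  for chart in charts-external/*/; do
    release="$(basename "$chart")"
    # "$@" forwards e.g. --install, --recreate-pods, --dry-run
    "$HELM" upgrade "$release" "$chart" "$@"
  done
}
```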
- Duplicate and modify an existing chart under the `charts-external` directory
- Setup the external app's continuous deployment:
  - Copy the relevant steps from an existing app's `.travis.yml`
  - It is also suggested to keep deployment notes in the app's README.md
- Follow the app's README to setup Docker and GitHub credentials on Travis
You can create a new environment by copying an existing environment directory and modifying the values.
See the sk8s environments documentation for more details about environments, namespaces and clusters.
The default values are in `values.yaml` - these are used in the chart template files (under the `templates`, `charts` and `charts-external` directories).

Each environment can override these values using `environments/ENVIRONMENT_NAME/values.yaml`.

Finally, automation scripts write values to `values.auto-updated.yaml`.
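In Helm terms this layering corresponds to passing the files as successive `-f` flags, where later files override earlier ones. A sketch of the idea (the helper, release and chart names are placeholders, not this repo's scripts):

```shell
# Build the -f arguments in override order: defaults first, then the
# environment's overrides, then automation-written values (highest priority).
values_args() {
  local env="$1"
  echo "-f values.yaml -f environments/${env}/values.yaml -f values.auto-updated.yaml"
}
# usage (placeholder names): helm upgrade my-release ./my-chart $(values_args minikube)
```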
Secrets are stored and managed directly in kubernetes and are not managed via Helm.
You can modify them manually or automatically:

- You will need a `.env` file in the root directory (see `.env.template`).
- Install dotenv:

  ```
  pip install python-dotenv
  ```

```
# Update secrets for all services
python update_secrets.py update

# Or update secrets for a specific service
python update_secrets.py update auth
```
Note: you may need to switch environment before updating: `switch_environment.sh`

To update an existing secret, delete it first: `kubectl delete secret SECRET_NAME`

After updating a secret you should update the affected deployments; you can use `./force_update.sh` to do that.

All secrets should be optional, so you can run the environment without any secrets; default values similar to dev environments will be used.
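A common way for a script like `force_update.sh` to trigger a rolling restart is to patch a timestamp annotation into the deployment's pod template, which makes Kubernetes recreate the pods. The sketch below shows that pattern; it is an assumption about the approach, not necessarily what the actual script does.

```shell
# Build a JSON patch that bumps a pod-template annotation; applying it
# with `kubectl patch` forces the deployment to roll its pods.
force_update_patch() {
  printf '{"spec":{"template":{"metadata":{"annotations":{"force-update":"%s"}}}}}' \
    "$(date +%s)"
}
# usage: kubectl patch deployment <deployment name> -p "$(force_update_patch)"
```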
Each environment may include a script to create the environment secrets under `environments/ENVIRONMENT_NAME/secrets.sh` - this file is not committed to Git.
You can use the following snippet in the secrets.sh script to check if a secret exists before creating it:

```
! kubectl describe secret <SECRET_NAME> &&\
  kubectl create secret generic <SECRET_NAME> <CREATE_SECRET_PARAMS>
```
```
kubectl get pods
kubectl delete pod <pod name>
```

Note: if a pod is stuck in deletion (keeps getting recreated by itself), try deleting the helm release:

```
helm list
helm delete --purge <release name>
```
- Enable Travis for the repo (run `travis enable` from the repo directory)
- Create a `.travis.yml` file based on an existing file and modify it according to your requirements
Depending on what you intend to do in your continuous deployment script you may need some of the following:
To connect and run commands on a Google Kubernetes Engine environment:
- Create a Google Compute Cloud service account and download the service account json file
- Set the service account json on the app's Travis:

  ```
  travis encrypt-file environments/datahub-testing/secret-k8s-ops.json environments/datahub-testing/deploy-ops-secret.json.enc --org
  ```
- Copy the `openssl` command output by the above command into the `.travis.yml` - the `-out` param should be `-out k8s-ops-secret.json`
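In the app's `.travis.yml` this typically ends up as a step like the following sketch; the `$encrypted_*` variable names are generated by `travis encrypt-file` and will differ per repo, so copy the real ones from the command's output rather than these hypothetical placeholders.

```yaml
before_script:
  # decrypt the service account file committed by `travis encrypt-file`
  # (hypothetical variable names -- use the ones printed by the command)
  - openssl aes-256-cbc -K $encrypted_0a1b2c3d_key -iv $encrypted_0a1b2c3d_iv
    -in environments/datahub-testing/deploy-ops-secret.json.enc -out k8s-ops-secret.json -d
```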
To push changes to GitHub:

- Create a GitHub machine user according to these instructions.
- Give this user write permissions to the k8s repo.
- Add the GitHub machine user secret key to Travis on the app's repo:

  ```
  travis env set --private K8S_OPS_GITHUB_REPO_TOKEN "*****"
  ```

To build and push docker images:

```
travis env set --private DOCKER_USERNAME "***"
travis env set --private DOCKER_PASSWORD "***"
```
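With those variables set, the build-and-push step itself typically looks like the sketch below; the image name is a placeholder and the tagging helper is an assumption, not this repo's convention.

```shell
# Derive an image tag from the triggering commit (placeholder image name;
# first 7 characters of the commit hash, as is common practice).
image_tag() {
  printf 'datahq/myservice:%.7s' "${TRAVIS_COMMIT:-latest}"
}
# echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin
# docker build -t "$(image_tag)" . && docker push "$(image_tag)"
```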
## Credits

This approach to infrastructure deployment on Kubernetes clusters (a.k.a. "The DataHQ Kubernetes Environment") is heavily based on the https://github.com/OpenBudget/budgetkey-k8s example. Many thanks to OriHoch for making our lives easier.