This is a project demonstrating deployment skills and strategies taught in the Udacity course "Site Reliability Engineer". The original repo is nd087-c3-deployment-roulette.

Step-by-step explanation of how to get a dev environment running:
The AWS environment will be built in the `us-east-2` region of AWS.

- Set up your AWS credentials from the Udacity AWS Gateway locally: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html
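As a quick sanity check (not part of the original steps), you can confirm the credentials and region are picked up:

```bash
# Confirm the CLI can authenticate with the configured credentials
aws sts get-caller-identity
# Confirm the default region is set as expected
aws configure get region
```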
- From the AWS console, manually create an S3 bucket in `us-east-2` called `udacity-tf-<your_name>`, e.g. `udacity-tf-treboder`, then click *Create bucket*
- Update `_config.tf` with your S3 bucket name
- Deploy the Terraform infrastructure:
  ```bash
  cd starter/infra
  terraform init
  terraform apply
  ```
- Set up your Kubernetes config so you can reach the EKS cluster: `aws eks --region us-east-2 update-kubeconfig --name udacity-cluster`
- Change the Kubernetes context to the new AWS cluster: `kubectl config use-context <cluster_name>`, e.g. `arn:aws:eks:us-east-2:139802095464:cluster/udacity-cluster`
- Confirm with `kubectl get pods --all-namespaces`
- Change to the `udacity` namespace: `kubectl config set-context --current --namespace=udacity`
- Follow the exercise instructions below
- Clean up the environment with the `nuke_everything.sh` script, or run the steps individually:
  ```bash
  cd starter/infra
  terraform state rm kubernetes_namespace.udacity && terraform state rm kubernetes_service.blue
  eksctl delete iamserviceaccount --name cluster-autoscaler --namespace kube-system --cluster udacity-cluster --region us-east-2
  kubectl delete all --all -n udacity
  terraform destroy
  ```
You might want to use `metrics-server` to see some basic statistics.

- Install with `kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml`
- Verify with `kubectl get deployment metrics-server -n kube-system`
- Show pods and their basic stats with `kubectl top pods --sort-by=memory`
You may visualize your AWS EKS cluster in exercise 3 using the helm chart kube-ops-view.

- Install helm
- Add the stable repo: `helm repo add stable https://charts.helm.sh/stable`
- Install the helm chart `kube-ops-view`:
  `helm install kube-ops-view stable/kube-ops-view --set service.type=LoadBalancer --set rbac.create=True`
  (a plain `helm install stable/kube-ops-view --set service.type=LoadBalancer` results in failure: "ensure CRDs installed")
- Confirm the helm chart is installed successfully with `helm list`
- Get the service URL to view the cluster dashboard:
  `kubectl get svc kube-ops-view | tail -n 1 | awk '{ print "Kube-ops-view URL = http://"$4 }'`
- To remove this deployment, use `helm uninstall kube-ops-view`
- Deploy the app by running `kubectl apply -f .\hello.yml`
- Check all running pods with `kubectl get pods` and grab the `<hello-world-pod-name>`
- Get the details for the pod with `kubectl describe pod <hello-world-pod-name>` and check the status
- Go to AWS and get the load balancer DNS name, which points to the two EC2 instances serving the hello-world app
- Check that the hello-world application is logging `healthy!` by running `kubectl logs <pod_name>`
- Check the hello-world app with your browser
- Clean up with `kubectl delete -f .\hello.yml`
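For orientation, here is a rough imperative equivalent of what `hello.yml` provisions; this is a sketch under assumptions (the image name and replica count are placeholders, not the repo's actual manifest):

```bash
# Hypothetical stand-in for hello.yml: a 2-replica deployment behind a load balancer
kubectl create deployment hello-world --image=nginxdemos/hello --replicas=2
# Expose it externally on port 80 via a cloud load balancer
kubectl expose deployment hello-world --type=LoadBalancer --port=80
```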
- Ensure you have connectivity to your local Kubernetes cluster: `kubectl config use-context docker-desktop`
- Optional: permanently switch the namespace with `kubectl config set-context --current --namespace=udacity` so you do not need `-n udacity` at the end of every command
- Apply the `index_v1_html.yml` & `index_v2_html.yml` configmaps to deploy the service HTML templates:
  - Run `kubectl apply -f index_v1_html.yml`
  - Run `kubectl apply -f index_v2_html.yml`
  - Check with `kubectl get configmap -n udacity`
- Deploy the service to the cluster (`canary-svc.yml`):
  - Run `kubectl apply -f .\canary-svc.yml`
  - Check with `kubectl get service -n udacity`
- Deploy the v1 & v2 starter templates to the cluster (`canary-v1.yml`, `canary-v2.yml`):
  - Run `kubectl apply -f .\canary-v1.yml` (the v1 container starts right away)
  - Run `kubectl apply -f .\canary-v2.yml` (you'll notice v2 has `0` replicas)
- Get the service cluster IP address and curl it 5 times to confirm only v1 of the application is reachable:
  - Get the IP with `kubectl get service canary-svc -n udacity`
  - Use an ephemeral container to access the Kubernetes internal network: `kubectl run debug --rm -i --tty --image nicolaka/netshoot -- /bin/bash`
  - Run `curl <service_ip>` and see `<html><h1>This is version 1</h1></html>`
- Now initiate a canary deployment for `canary-v2` via a bash script (a sketch follows this exercise):
  - Run the bash script `./canary.sh`
  - Check that the procedure replaced all the containers with the new nginx version
  - During the first manual verification step, ensure you can curl the service and get a response from both versions of the application
  - Then continue until all replicas of v2 are deployed
- Tear down the environment with `kubectl delete all --all -n udacity`
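A minimal sketch of what a `canary.sh` along these lines could look like. The deployment names match the files above, but the total replica count, step size, and pause-for-verification flow are assumptions; the actual script in the repo may differ:

```bash
#!/bin/bash
# Stepwise canary rollout sketch: shift traffic from canary-v1 to canary-v2
# by moving replicas over one at a time, pausing for manual verification.
# ASSUMPTION: 4 total replicas shared by the two deployments.
set -euo pipefail

TOTAL=4
for v2 in $(seq 1 "$TOTAL"); do
  v1=$((TOTAL - v2))
  kubectl scale deployment canary-v2 --replicas="$v2" -n udacity
  kubectl scale deployment canary-v1 --replicas="$v1" -n udacity
  kubectl rollout status deployment canary-v2 -n udacity
  echo "canary-v2=$v2 canary-v1=$v1 -- curl the service to verify, then press Enter"
  read -r
done
```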
- Log into your student AWS account and switch to region `us-east-2`
- Set up your local AWS credentials
- Launch the Kubernetes cluster with the starter Terraform code provided:
  ```bash
  terraform init
  terraform plan
  terraform apply
  ```
  optionally with the parameter `--auto-approve`
- Ensure you have connectivity to your AWS Kubernetes cluster:
  1. Run `aws eks --region us-east-2 update-kubeconfig --name udacity-cluster`
  2. Change the Kubernetes context to the new AWS cluster: `kubectl config use-context arn:aws:eks:us-east-2:225791329475:cluster/udacity-cluster`
- Confirm with `kubectl get pods --all-namespaces`
- Change to the `udacity` namespace: `kubectl config set-context --current --namespace=udacity`
- Apply the `index_blue_html.yml` & `index_green_html.yml` configmaps to deploy the service HTML templates:
  - Run `kubectl apply -f .\index_blue_html.yml`
  - Run `kubectl apply -f .\index_green_html.yml`
  - Check with `kubectl get configmap`
- Deploy the blue application to the cluster (`blue.yml`):
  - Run `kubectl apply -f .\blue.yml` to spawn the pods with app "blue"
  - Check with `kubectl get pods`, which shows the running pods with app "blue"
- Check the "blue" deployment via curl
- Get the external ip from the "blue" load balancer with
kubectl get svc
- Run
curl <external_ip_of_loadbalancer>
to see<html><h1>This is version BLUE</h1></html>
- Or simply start your browser of choice:
<external_ip_of_loadbalancer>
- Get the external ip from the "blue" load balancer with
- You'll notice there is a load balancer service created for you in `kubernetes_resources.tf`
  - There is also an associated DNS zone `udacityexercise` in `dns.tf` that allows you to curl the hostname `blue-green.udacityexercise` from an EC2 instance
- Confirm you can curl this hostname from the `curl-instance` EC2 instance (also created via Terraform earlier):
  - Connect to the EC2 instance via EC2 Instance Connect
  - Then run `curl blue-green.udacityproject`
- Deploy the "green" app by executing the shell script
blue-green.sh
, which effectively:
kubectl apply -f ./index_green_html.yml
kubectl apply -f ./green.yml
- Confirm that the "green" app is deployed
- Run
kubectl get pods
to see the green pods running - Get the external ip from the "green" load balancer with
kubectl get svc
- Run
curl <external_ip_of_loadbalancer>
to see<html><h1>This is version GREEN</h1></html>
- Or simply start your browser of choice:
<external_ip_of_loadbalancer>
- Run
- Check that both services are running using the `curl-instance`:
  - Connect to the EC2 instance via EC2 Instance Connect
  - Then run `curl blue-green.udacityproject`
- Simulate a failover event to the `green` environment by destroying the blue environment: `kubectl delete -f .\blue.yml` does the job
- Ensure the `blue-green.udacityproject` record now only returns the green environment:
  - curl `blue-green.udacityproject` via the `curl-instance`
- Clean up with `kubectl delete all --all -n udacity`
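A minimal sketch of what `blue-green.sh` could look like, including the rollout wait the project instructions below ask for. The file names match the steps above; the deployment and service names (`green`) are assumptions:

```bash
#!/bin/bash
# Blue-green rollout sketch: deploy the green config map and app,
# then block until the green deployment and service are ready.
set -euo pipefail

kubectl apply -f ./index_green_html.yml
kubectl apply -f ./green.yml

# Wait for all green replicas to become available (assumed deployment name: green)
kubectl rollout status deployment green -n udacity --timeout=300s

# Wait for the green load balancer to get an external hostname (assumed service name: green)
until kubectl get svc green -n udacity \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' | grep -q .; do
  echo "waiting for green load balancer..."
  sleep 10
done
echo "green environment deployed and reachable"
```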
- Log into your student AWS account and switch to region `us-east-2`
- Set up your local AWS credentials
- Launch the Kubernetes cluster with the starter Terraform code provided:
  ```bash
  terraform init
  terraform plan
  terraform apply
  ```
- Ensure you have connectivity to your AWS Kubernetes cluster:
  1. Run `aws eks --region us-east-2 update-kubeconfig --name udacity-cluster`
  2. Change the Kubernetes context to the new AWS cluster: `kubectl config use-context <cluster_name>` (e.g. `arn:aws:eks:us-east-2:225791329475:cluster/udacity-cluster`)
- Confirm with `kubectl get pods --all-namespaces`
- Change to the `udacity` namespace: `kubectl config set-context --current --namespace=udacity`
- Launch the `bloatware.yml` application on the cluster: `kubectl apply -f bloatware.yml`
- Take a screenshot of the running pods: `kubectl get pods -n udacity`
  - You'll notice NOT all the pods are in a running state (the AWS cluster can't support all of them with the initial single node)
  - Identify the problem using the `kubectl describe` command, e.g. `kubectl describe pod <name_of_pod>`; you'll notice at the bottom, in the events, `0/2 nodes are available ...`
- Clean up with `kubectl delete all --all -n udacity`
Manual scaling as a workaround in case creation of the service user fails (see also the `eksctl` alternative sketched below):

- To resolve this problem, manually increase the cluster node size in the Terraform variables and apply:
  ```
  nodes_desired_size = 4
  nodes_max_size     = 10
  nodes_min_size     = 1
  ```
- Wait 5 minutes, then take a screenshot of the running pods: `kubectl get pods -n udacity`. You'll notice the pods that were pending are now able to be deployed successfully with the increased resources available to the cluster.
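If changing the Terraform variables is not an option, the managed node group can also be scaled directly with `eksctl`; this is an assumption-laden alternative (the node group name is a placeholder you must look up first), and it drifts the live state away from Terraform:

```bash
# Look up the node group name first
eksctl get nodegroup --cluster udacity-cluster --region us-east-2
# Scale it up (replace <nodegroup_name> with the real name from above)
eksctl scale nodegroup --cluster udacity-cluster --region us-east-2 \
  --name <nodegroup_name> --nodes 4 --nodes-min 1 --nodes-max 10
```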
NOTE: All AWS infrastructure changes outside of the EKS cluster can be made in the project Terraform code.
- [Deployment Troubleshooting]
  A previously deployed microservice `hello-world` doesn't seem to be reachable at its public endpoint. The product teams need you to fix this asap!
  - The `apps/hello-world` deployment is facing deployment issues
    - Assess, identify and resolve the problem with the deployment
    - Document your findings via screenshots or text files
- [Canary deployments]
  - Create a shell script `canary.sh` that will be executed by GitHub Actions
  - Canary deploy `/apps/canary-v2` so it takes up 50% of the client requests
  - Curl the service 10 times and save the results to `canary.txt` (a sketch of this step follows this section)
  - Ensure that it is able to return results for both services
  - Provide the output of `kubectl get pods --all-namespaces` to show the deployed services
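A sketch of the curl-and-record step, assuming it is run from somewhere that can reach the service (e.g. the `nicolaka/netshoot` debug pod from the exercise above); `<service_ip>` is a placeholder:

```bash
# Curl the canary service 10 times and save the responses to canary.txt;
# with a 50/50 split, roughly half should come from each version.
for i in $(seq 1 10); do
  curl -s <service_ip>
  echo
done | tee canary.txt
```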
- [Blue-green deployments]
  The product teams want a blue-green deployment for the `green` version of the `/apps/blue-green` microservice because they heard it's even safer than canary deployments.
  - Create a shell script `blue-green.sh` that executes a `green` deployment for the service `apps/blue-green`
    - Mimic the blue deployment configuration and replace the `index.html` with the values in the `green-config` config-map
    - The bash script will wait for the new deployment to successfully roll out and the service to be reachable
  - Create a new weighted CNAME record `blue-green.udacityproject` in Route53 for the green environment (a sketch follows this section)
  - Use the `curl` EC2 instance to curl the `blue-green.udacityproject` URL and take a screenshot to document that the green & blue services are reachable
    - The screenshot should be named `green-blue.png`
  - Simulate a failover event to the `green` environment by destroying the blue environment
  - Ensure the `blue-green.udacityproject` record now only returns the green environment
    - curl `blue-green.udacityproject` from the `curl` EC2 instance and take a screenshot of the results named `green-only.png`
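A sketch of creating the weighted record with the AWS CLI; `<zone_id>`, `<green_lb_dns_name>`, and the weight value are placeholders, and since the existing record lives in `dns.tf`, doing this in the project Terraform code is an equally valid route:

```bash
# Weighted CNAME for the green load balancer (values are placeholders)
cat > green-record.json <<'EOF'
{
  "Changes": [{
    "Action": "CREATE",
    "ResourceRecordSet": {
      "Name": "blue-green.udacityproject",
      "Type": "CNAME",
      "SetIdentifier": "green",
      "Weight": 50,
      "TTL": 60,
      "ResourceRecords": [{ "Value": "<green_lb_dns_name>" }]
    }
  }]
}
EOF
aws route53 change-resource-record-sets --hosted-zone-id <zone_id> --change-batch file://green-record.json
```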
- [Node elasticity]
  A microservice `bloaty-mcface` must be deployed for compliance reasons before the company can continue business. Ensure it is deployed successfully.
  - Deploy the `apps/bloatware` microservice
  - Identify if the application deployment was successful and, if not, resolve any issues found
    - Take a screenshot of the reason why the deployment was not successful
    - Provide code or take a screenshot of the resolution step
  - Provide the output of `kubectl get pods --all-namespaces` to show the deployed services
- [Observability with metrics]
  You have realized there is no observability in the Kubernetes environment. You suspect there is a service unnecessarily consuming too much memory that needs to be removed.
  - Install a metrics server on the Kubernetes cluster and identify the service using the most memory (a sketch of the command follows this section)
    - Take a screenshot of the output of the metrics command used and save it to a file called `before.png`
    - Document the name of the application using the most memory in a text file called `high_memory.txt`
  - Delete the service with the highest memory usage from the cluster
    - Take a screenshot of the output of the same metrics command and save it to a file called `after.png`
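The metrics command behind the before/after screenshots could be as simple as the following, assuming `metrics-server` is installed as described in the dev-environment notes above:

```bash
# Show the heaviest memory consumers across all namespaces
kubectl top pods --all-namespaces --sort-by=memory | head -n 10
```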
- [Diagramming the cloud landscape with Bob Ross]
  In order to improve the onboarding of future developers, you decide to create an architecture diagram so that they don't have to learn the lessons you have learnt the hard way.
  - Create an architectural diagram that accurately describes the current status of your AWS environment
    - Make sure to include your AWS resources, like the EKS cluster and load balancers
    - Visualize one or two deployments and their microservices