Project with configuration files required to deploy an Amazon Elastic Kubernetes Service cluster with Apache Solr and Apache Zookeeper.

hmuthusamy/amazon-eks-arch-apache-solr

 
 


Apache Solr on Amazon Elastic Kubernetes Service

Latest Version License: MIT

This repository contains sample configuration files for installing Apache Solr on Amazon Elastic Kubernetes Service (EKS), along with the files required to run the demo. It walks through the installation and configuration of the following components:

Apache Solr on Amazon EKS

Getting Started

Pre-requisites

Use the following steps to create the Solr environment.

  1. From a terminal in your Cloud9 workspace, clone this git repository and set the directory:
git clone <repo_url> apache-solr-k8s-main
cd apache-solr-k8s-main/config
  1. Create an Amazon EKS cluster using eksctl. Note: replace <region of choice> with the AWS region in which you wish to deploy your EKS cluster, for example --region=us-west-2.
eksctl create cluster --version=1.21 \
--name=solr8demo \
--region=<region of choice> \
--node-private-networking \
--alb-ingress-access \
--asg-access \
--without-nodegroup
  1. Create the Managed Node Groups in private subnets within the cluster using:

⚠️ The managed node groups config file uses the EC2 instance type m5.xlarge, which is not free tier eligible, so your AWS account may incur EC2 charges. For Amazon Elastic Kubernetes Service pricing details, refer to the Amazon EKS pricing page.

eksctl create nodegroup -f managedNodegroups.yml
  1. Set up Helm and add the chart repositories needed to install Prometheus:
curl -sSL https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
helm repo add stable https://charts.helm.sh/stable/
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
  1. Install the Kubernetes Metrics Server:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Verify that the metrics-server deployment is running the desired number of pods with the following command.

kubectl get deployment metrics-server -n kube-system
  1. Install the ZooKeeper ensemble for SolrCloud:
kubectl create configmap zookeeper-ensemble-config --from-env-file=zk-config.properties
kubectl apply -f zookeeper.yml

Check the status of the pods in the ZooKeeper StatefulSet by running the following command:

kubectl get pods -l app=zk

The expected output should look like this:

NAME   READY   STATUS    RESTARTS   AGE
zk-0   1/1     Running   0          4h4m
zk-1   1/1     Running   0          4h3m
zk-2   1/1     Running   0          4h3m
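
The readiness check above can also be scripted. The following is a minimal sketch, not part of this repository, that parses the tabular output of kubectl get pods and reports whether every listed pod is fully ready and running. The function name all_pods_ready and the sample text are illustrative assumptions.

```python
def all_pods_ready(kubectl_output: str, expected_count: int) -> bool:
    """Return True if the table lists expected_count pods, all Running and fully ready."""
    lines = [line for line in kubectl_output.strip().splitlines() if line.strip()]
    rows = [line.split() for line in lines[1:]]  # skip the header row
    if len(rows) != expected_count:
        return False
    for row in rows:
        if len(row) < 3:
            return False
        # READY column looks like "1/1": ready containers / total containers
        ready, _, total = row[1].partition("/")
        status = row[2]
        if ready != total or status != "Running":
            return False
    return True


# Sample text copied from the expected output shown above.
SAMPLE = """\
NAME   READY   STATUS    RESTARTS   AGE
zk-0   1/1     Running   0          4h4m
zk-1   1/1     Running   0          4h3m
zk-2   1/1     Running   0          4h3m
"""

if __name__ == "__main__":
    print(all_pods_ready(SAMPLE, expected_count=3))
```

In practice the text would come from running kubectl get pods -l app=zk, for example through subprocess.run with capture_output=True; the same function should work for the Solr pods if the label selector is changed.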
  1. Install Solr and the Solr metrics exporter:
kubectl create configmap solr-cluster-config --from-env-file=solr-config.properties
kubectl apply -f solr-cluster.yml
kubectl apply -f solr-exporter.yml

Check the status of the Solr pods:

kubectl get pods -l app=solr-app

Expected output:

NAME     READY   STATUS    RESTARTS   AGE
solr-0   1/1     Running   0          3h59m
solr-1   1/1     Running   0          3h59m
solr-2   1/1     Running   0          3h58m

Verify that the Solr exporter service is running on port 9983. This matters because the HPA depends on Solr metrics being exported to the Kubernetes metrics server through Prometheus.

kubectl get service/solr-exporter-service

Expected output (note: the CLUSTER-IP will likely be different):

NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
solr-exporter-service   ClusterIP   10.100.205.122   <none>        9983/TCP   4h1m
  1. Update the prom.yml Prometheus configuration file with the solr-exporter-service cluster IP and port.

Find the solr-exporter-service cluster IP address using the command below:

kubectl get service/solr-exporter-service -o jsonpath='{.spec.clusterIP}'

Update the prometheus.yml property in the prom.yml file as shown below, replacing <solr-exporter-service-IP> with the cluster IP returned by the command above. Save the file.

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets:
        - localhost:9090
  - job_name: solr
    scheme: http
    static_configs:
      - targets: ['<solr-exporter-service-IP>:9983']
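
The substitution described in this step can be automated. The sketch below is an illustration under assumptions, not part of this repository: it reads the cluster IP from the JSON form of the service (as printed by kubectl get service/solr-exporter-service -o json) and renders the scrape_configs fragment with that address filled in. The function names and the sample JSON are hypothetical.

```python
import json

EXPORTER_PORT = 9983  # port of solr-exporter-service, as shown in this step


def cluster_ip_from_service_json(service_json: str) -> str:
    """Extract .spec.clusterIP from `kubectl get service ... -o json` output."""
    return json.loads(service_json)["spec"]["clusterIP"]


def render_scrape_configs(exporter_ip: str, port: int = EXPORTER_PORT) -> str:
    """Render the scrape_configs fragment with the exporter address filled in."""
    return "\n".join([
        "scrape_configs:",
        "  - job_name: prometheus",
        "    static_configs:",
        "      - targets:",
        "        - localhost:9090",
        "  - job_name: solr",
        "    scheme: http",
        "    static_configs:",
        f"      - targets: ['{exporter_ip}:{port}']",
    ])


# Hypothetical, trimmed-down service object used only for illustration.
SAMPLE_SERVICE = '{"spec": {"clusterIP": "10.100.205.122", "ports": [{"port": 9983}]}}'

if __name__ == "__main__":
    print(render_scrape_configs(cluster_ip_from_service_json(SAMPLE_SERVICE)))
```

The rendered fragment still has to be placed under the prometheus.yml property of prom.yml, with indentation matching that file.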
  1. Install the Prometheus adapter and Prometheus:
helm install prometheus-adapter prometheus-community/prometheus-adapter \
--set prometheus.url=http://prometheus-server.default.svc.cluster.local \
--set prometheus.port=80 \
--values=adapterConfig.yml

helm install prometheus prometheus-community/prometheus \
--values prom.yml
  1. Configure Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler (CA) using kubectl:
kubectl apply -f hpa.yml
kubectl apply -f cluster-autoscaler-autodiscover.yaml

Verify that the HPA has been set up correctly:

kubectl describe hpa

Expected output

Name:                             solr-hpa
Namespace:                        default
Labels:                           <none>
Annotations:                      <none>
CreationTimestamp:                Wed, 22 Dec 2021 19:25:18 +0000
Reference:                        StatefulSet/solr
Metrics:                          ( current / target )
  "solr_metrics" (target value):  4021 / 50k
Min replicas:                     3
Max replicas:                     20
StatefulSet pods:                 20 current / 20 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from external metric solr_metrics(nil)
  ScalingLimited  True    TooManyReplicas   the desired replica count is more than the maximum replica count
Events:           <none>

⚠️ The "solr_metrics" value may be 0 or very low right after the Solr deployment is set up; it is expected to rise once Solr receives client requests. Also note that maxReplicas in the hpa.yml config file is set to 10. maxReplicas defines the maximum number of pods the HPA can scale up to, so consider changing it to meet the needs of your Solr deployment.
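
To see how maxReplicas bounds scaling, the sketch below applies the replica formula from the Kubernetes HPA documentation, desired = ceil(current replicas × current metric / target metric), and then clamps the result to the configured range. It is a simplified illustration only: it ignores the controller's tolerance band and stabilization behavior, and the function name and sample numbers are assumptions rather than values taken from hpa.yml.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Simplified HPA calculation: scale by the metric ratio, then clamp."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # The HPA never goes below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Low traffic: the raw result is below the minimum, so minReplicas wins.
    print(desired_replicas(3, 4021, 50_000, min_replicas=3, max_replicas=10))
    # Heavy traffic: the raw result exceeds the maximum, so maxReplicas wins.
    print(desired_replicas(8, 200_000, 50_000, min_replicas=3, max_replicas=10))
```

With these sample inputs, the second call is capped by max_replicas, which is the situation the ScalingLimited condition in the output above describes.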

  1. Obtain the SolrCloud Administration UI URL by running kubectl get services solr-service from a terminal in your Cloud9 workspace. The URL will be of the form http://<xxxxx>.<region>.elb.amazonaws.com:8983.

K8s services

  1. Create a Solr Collection named Books using the Solr Administration UI and upload the sample data file data/books.json.

Apache Solr collection

  1. Configure the SolrCloud autoscaler by setting a Search Rate Trigger. The autoscaler config can be set using the endpoint http://<xxxxx>.<region>.elb.amazonaws.com:8983/api/cluster/autoscaling/:
curl -X POST -H 'Content-type:application/json' -d '{
            "set-trigger": {
                  "name" : "search_rate_trigger",
                  "event" : "searchRate",
                  "collections" : "Books",
                  "metric" : "QUERY./select.requestTimes:1minRate",
                  "aboveRate" : 10.0,
                  "belowRate" : 0.01,
                  "waitFor" : "30s",
                  "enabled" : true,
                  "actions" : [
                        {
                        "name" : "compute_plan",
                        "class": "solr.ComputePlanAction"
                        },
                        {
                        "name" : "execute_plan",
                        "class": "solr.ExecutePlanAction"
                        }
                  ]
            }
}' http://<xxxxx>.<region>.elb.amazonaws.com:8983/api/cluster/autoscaling/
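
The same request can be assembled programmatically, which makes the rate thresholds easier to adjust. This is a minimal sketch using only the Python standard library; the helper names are hypothetical, the base URL is a placeholder for the load balancer address obtained earlier, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_search_rate_trigger(collection: str, above_rate: float,
                              below_rate: float, wait_for: str = "30s") -> dict:
    """Build the set-trigger payload shown in the curl command above."""
    return {
        "set-trigger": {
            "name": "search_rate_trigger",
            "event": "searchRate",
            "collections": collection,
            "metric": "QUERY./select.requestTimes:1minRate",
            "aboveRate": above_rate,
            "belowRate": below_rate,
            "waitFor": wait_for,
            "enabled": True,
            "actions": [
                {"name": "compute_plan", "class": "solr.ComputePlanAction"},
                {"name": "execute_plan", "class": "solr.ExecutePlanAction"},
            ],
        }
    }


def build_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Wrap the payload in a POST request to the autoscaling endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/cluster/autoscaling/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host: substitute the real load balancer URL before sending.
    request = build_request("http://example.invalid:8983",
                            build_search_rate_trigger("Books", 10.0, 0.01))
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Once the placeholder host is replaced, passing the request to urllib.request.urlopen would submit it in the same way as the curl command.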

Testing the deployment

A Python script in the scripts directory can be used to test the deployment.

  1. Change to the scripts directory and make the script executable:
cd scripts
chmod 744 ./submit_mc_pi_k8s_requests_books.py
  1. Install the required dependencies
sudo python3 -m pip install -r ./requirements.txt
  1. Run the script
python ./submit_mc_pi_k8s_requests_books.py -p 1 -r 1 -i 1

To run a short load test, increase the values of the -p, -r, and -i flags:

python ./submit_mc_pi_k8s_requests_books.py -p 100 -r 30 -i 30000000 > result.txt

Review the result.txt file to ensure you are getting search query responses from Solr.


Cleaning up

Use the following steps to clean up the Solr environment.

  1. Uninstall Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler (CA):
kubectl delete -f hpa.yml
kubectl delete -f cluster-autoscaler-autodiscover.yaml
  1. Uninstall Solr:
kubectl delete -f solr-cluster.yml
kubectl delete configmap solr-cluster-config
kubectl delete -f solr-exporter.yml
  1. Uninstall Zookeeper:
kubectl delete -f zookeeper.yml 
kubectl delete configmap zookeeper-ensemble-config
  1. Delete the Managed Node Groups:
eksctl delete nodegroup -f managedNodegroups.yml
  1. Delete the Amazon EKS cluster:
eksctl delete cluster --name=solr8demo

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
