Steps for Running LDA Demo

Install minikube and Helm using Homebrew (command: brew install)

Create a new minikube cluster (by default it starts with 3 CPUs and 4096 MB of memory)

$ minikube start 
$ minikube dashboard
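
If the local minikube defaults differ, the resources can be requested explicitly. A minimal sketch, assuming the 3 CPU / 4096 MB figures above are the desired values; the function name `start_demo_cluster` is illustrative and not part of the demo files:

```shell
# Start minikube with explicit resources instead of relying on defaults.
# First argument: number of CPUs, second argument: memory in MB.
start_demo_cluster() {
  minikube start --cpus "${1:-3}" --memory "${2:-4096}"
}
```

Calling `start_demo_cluster` uses the values from this guide; `start_demo_cluster 4 8192` asks for a larger machine.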

Initialize Helm. This deploys the Tiller pod in the cluster

$ helm init  
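
Tiller can take a few seconds to become ready, and a chart install issued before that is likely to fail. A minimal sketch of waiting for it, assuming the Helm 2 defaults of a Deployment named `tiller-deploy` in the `kube-system` namespace; the function name `wait_for_tiller` and the 120s default are illustrative:

```shell
# Block until the Tiller Deployment created by `helm init` has rolled out.
# Assumes Helm 2 defaults: Deployment "tiller-deploy" in "kube-system".
wait_for_tiller() {
  tiller_timeout="${1:-120s}"   # illustrative default, adjust as needed
  kubectl --namespace kube-system rollout status deployment/tiller-deploy \
    --timeout="$tiller_timeout"
}
```

Run `wait_for_tiller` (or `wait_for_tiller 300s` on a slow machine) before continuing with the next step.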

Add the repository that contains the spark-operator Helm chart to the local Helm repository list

$ helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator

Deploy the Spark Operator into the cluster. The --set parameters are needed because the default operator version in the Helm chart is not the latest, and because the webhook must be enabled for mounting volumes and applying Spark config values

$ helm install incubator/sparkoperator --namespace spark-operator \
--set operatorVersion=v2.4.0-v1beta1-latest \
--set enableWebhook=true

Create a ServiceAccount for the Spark applications and grant it a ClusterRole with the permissions that the Spark driver pod needs to control executor pods. (The manifest is located in spark-on-k8s-operator-master/manifest)

$ kubectl apply -f spark-rbac.yaml

Create a directory on minikube for saving Spark events for the History Server

$ minikube ssh
$ sudo mkdir /mnt/data
$ exit

Create a PersistentVolume and a PersistentVolumeClaim for the Spark events

$ kubectl apply -f pv-volume.yaml
$ kubectl apply -f pvc-claim.yaml 
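
The demo and the History Server both depend on the claim being bound to the volume, so it is worth checking before moving on. A minimal sketch, assuming pvc-claim.yaml creates the claim `task-pv-claim` that the History Server step refers to; the function names are illustrative:

```shell
# Print the phase of a PersistentVolumeClaim (for example "Pending" or "Bound").
# Defaults to the claim name used in the History Server step of this guide.
pvc_phase() {
  kubectl get pvc "${1:-task-pv-claim}" -o jsonpath='{.status.phase}'
}

# Succeed only when the claim is bound to a volume.
pvc_is_bound() {
  [ "$(pvc_phase "$1")" = "Bound" ]
}
```

For example, `pvc_is_bound || echo "claim is not bound yet"` reports a claim that is still pending.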

Run the LDA Demo that reads and writes to the persistent volume on minikube.

$ kubectl apply -f lda-demo-local.yaml

Use kubectl port-forward to access the Spark UI

$ kubectl port-forward lda-demo-local-driver 4040:4040
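
The port-forward only works once the driver pod is running, and the Spark UI on port 4040 is only served while the application is still executing. A minimal sketch of polling for that state first; the function name `wait_for_driver` and its default retry count and delay are illustrative assumptions:

```shell
# Poll until the driver pod reports phase "Running".
# Returns 0 once it is running, 1 if it never gets there within max_tries polls.
wait_for_driver() {
  pod="${1:-lda-demo-local-driver}"
  max_tries="${2:-60}"   # illustrative default
  delay="${3:-2}"        # seconds between polls, illustrative default
  i=0
  while [ "$i" -lt "$max_tries" ]; do
    phase="$(kubectl get pod "$pod" -o jsonpath='{.status.phase}' 2>/dev/null)"
    if [ "$phase" = "Running" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

It can be chained with the command above: `wait_for_driver && kubectl port-forward lda-demo-local-driver 4040:4040`.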

Install History Server, which will read events from /mnt/data

$ helm install stable/spark-history-server --namespace default \
--set pvc.existingClaimName=task-pv-claim \
--set pvc.eventsDir=

Open a browser to view the Spark History Server. Replace <animal name> with the release-name prefix that Helm generated for the install

$ minikube service <animal name>-spark-history-server
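
The generated release name can be looked up instead of copied by hand. A minimal sketch, assuming the Service name ends in `spark-history-server` as in the command above and lives in the `default` namespace; the function name `history_server_service` is illustrative:

```shell
# Print the name of the History Server Service in the given namespace.
# Matches any Service whose name ends in "spark-history-server" and strips
# the "<resource>/" prefix that `kubectl get -o name` adds.
history_server_service() {
  kubectl get services --namespace "${1:-default}" -o name \
    | sed -n 's|^[^/]*/\(.*spark-history-server\)$|\1|p' \
    | head -n 1
}
```

The result can be passed straight to minikube: `minikube service "$(history_server_service)"`.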

To list SparkApplications, run:

$ kubectl get sparkapplications 

To check events for the SparkApplication object, run:

$ kubectl describe sparkapplication lda-demo-local
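
When only the current state is of interest, the full describe output can be narrowed down. A minimal sketch, assuming the operator records the state under `.status.applicationState.state` of the SparkApplication object (verify against the describe output for the installed operator version); the function name `spark_app_state` is illustrative:

```shell
# Print the operator-reported state of a SparkApplication.
# Assumes the state is stored at .status.applicationState.state.
spark_app_state() {
  kubectl get sparkapplication "${1:-lda-demo-local}" \
    -o jsonpath='{.status.applicationState.state}'
}
```

Running `spark_app_state` queries the demo application; another application name can be passed as the first argument.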

To view the results of the LDA demo, open the output file on minikube:

$ minikube ssh
$ cat /mnt/data/LDAResults.txt

Before rerunning the application, delete it from the cluster:

$ kubectl delete sparkapplication lda-demo-local
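
Deleting and resubmitting can be combined into one helper. A minimal sketch using the application name and manifest from this guide; the function name `rerun_lda_demo` is illustrative:

```shell
# Delete any previous run, then submit the application again.
# --ignore-not-found keeps the delete from failing when nothing is deployed yet.
rerun_lda_demo() {
  kubectl delete sparkapplication lda-demo-local --ignore-not-found &&
    kubectl apply -f lda-demo-local.yaml
}
```

Run `rerun_lda_demo` from the directory that contains lda-demo-local.yaml.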