Starting with version 0.2.0, Pipeline supports managed Kubernetes clusters on Azure AKS as well.
For simplicity, the steps are presented through an example: hooking a Spark application into a CI/CD workflow that runs it on managed Kubernetes (AKS/Azure). It's assumed that the source of the Spark application is stored in GitHub.
The Pipeline Control Plane, which takes care of creating a Kubernetes cluster on the desired cloud provider and executing the steps of the CI/CD flow, can be hosted on both AWS and Azure. See below for how to launch the Pipeline Control Plane on each provider.
Hosting the Pipeline Control Plane and creating Kubernetes clusters on AWS requires:
- AWS account
- AWS EC2 key pair
Hosting the Pipeline Control Plane and creating Kubernetes clusters on Azure requires:
- Azure subscription with the AKS service enabled.
- A Client Id, Client Secret and Tenant Id for a Microsoft Azure Active Directory service principal. This information can be retrieved from the portal, but the easiest and fastest way is to use the Azure CLI tool.
```sh
$ curl -L https://aka.ms/InstallAzureCli | bash
$ exec -l $SHELL
$ az login
$ az ad sp create-for-rbac   # creates the service principal whose credentials are printed below
```
You should get something like:
```json
{
  "appId": "1234567-1234-1234-1234-1234567890ab",
  "displayName": "azure-cli-2017-08-18-19-25-59",
  "name": "http://azure-cli-2017-08-18-19-25-59",
  "password": "1234567-1234-1234-be18-1234567890ab",
  "tenant": "7654321-1234-1234-ee18-9876543210ab"
}
```
- `appId` is the Azure Client Id
- `password` is the Azure Client Secret
- `tenant` is the Azure Tenant Id
In order to get the Azure Subscription Id, run:

```sh
$ az account show --query id
```
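To keep the four values handy for the Control Plane launch forms below, you can export them as shell variables. This is just a convenience sketch; the variable names are illustrative (Pipeline does not require them) and the sample values come from the example output above:

```sh
# Illustrative only - substitute the values from your own service principal.
export AZURE_CLIENT_ID="1234567-1234-1234-1234-1234567890ab"        # appId
export AZURE_CLIENT_SECRET="1234567-1234-1234-be18-1234567890ab"    # password
export AZURE_TENANT_ID="7654321-1234-1234-ee18-9876543210ab"        # tenant
export AZURE_SUBSCRIPTION_ID="$(az account show --query id -o tsv)" # subscription
```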
Register an OAuth application on GitHub for the Pipeline CI/CD workflow. Fill in the `Authorization callback URL` field with a dummy value at this stage; it will be updated once the Control Plane is up and running, using its IP address or DNS name. Take note of the `Client ID` and `Client Secret`, as these will be required for launching the Pipeline Control Plane.
On AWS, the easiest way to run a Pipeline Control Plane is to use a CloudFormation template.
- Navigate to: https://eu-west-1.console.aws.amazon.com/cloudformation/home?region=eu-west-1#/stacks/new
- Select `Specify an Amazon S3 template URL` and add the URL to our template: https://s3-eu-west-1.amazonaws.com/cf-templates-grr4ysncvcdl-eu-west-1/2018026em9-new.templatee93ate9mob7
- Fill in the following fields on the form:
  - Stack name
  - AWS Credentials
    - Amazon access key id - specify your access key id
    - Amazon secret access key - specify your secret access key
  - Azure Credentials and Information - needed only for creating Kubernetes clusters on Azure
    - AzureClientId - see how to get the Azure Client Id above
    - AzureClientSecret - see how to get the Azure Client Secret above
    - AzureSubscriptionId - your Azure Subscription Id
    - AzureTenantId - see how to get the Azure Tenant Id above
  - Control Plane Instance Config
    - InstanceName - name of the EC2 instance that will host the Control Plane
    - ImageId - pick the image id from the README
    - KeyName - specify your AWS EC2 key pair
  - Banzai Pipeline Credentials - specify the user name and password for accessing the Pipeline API; these are needed again later when setting the repository secrets
  - Banzai-Ci Credentials
    - Orgs - comma-separated list of GitHub organizations whose members are granted access to Banzai Cloud Pipeline's CI/CD workflow
    - Github Client - GitHub OAuth `Client Id`
    - Github Secret - GitHub OAuth `Client Secret`
  - Grafana Dashboard
    - Grafana Dashboard Password - specify the password for accessing the Grafana dashboard, which comes with application-specific defaults
  - Prometheus Dashboard
    - Prometheus Password - specify the password for accessing Prometheus, which collects cluster metrics
  - Advanced Pipeline Options
    - PipelineImageTag - specify `0.2.0` to use the current stable Pipeline release
  - Slack Credentials - this section is optional; complete it to receive cluster-related alerts through a Slack push notification channel.
  - Alert SMTP Credentials - this section is optional; fill it in to receive cluster-related alerts through email.
- Finish the wizard to create a `Control Plane` instance.
- Take note of the PublicIP of the created stack. We refer to this as the PublicIP of the `Control Plane`.
- Go back to the GitHub OAuth application created earlier and modify it: set the `Authorization callback URL` field to `http://{control_plane_public_ip}/authorize`.
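If you prefer the command line to the console wizard, roughly the same launch can be scripted with the AWS CLI. This is a sketch only: the stack name is arbitrary, and the parameter keys depend on the template (only `KeyName` is shown here, as an assumption):

```sh
# Sketch: launch the Control Plane stack from the CLI (parameter keys are assumptions).
aws cloudformation create-stack \
  --stack-name pipeline-control-plane \
  --region eu-west-1 \
  --template-url "https://s3-eu-west-1.amazonaws.com/cf-templates-grr4ysncvcdl-eu-west-1/2018026em9-new.templatee93ate9mob7" \
  --parameters ParameterKey=KeyName,ParameterValue=my-ec2-key-pair

# Once the stack is complete, read its outputs to find the PublicIP.
aws cloudformation describe-stacks \
  --stack-name pipeline-control-plane \
  --query "Stacks[0].Outputs"
```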
On Azure, the easiest way to run a Pipeline Control Plane is to deploy it using an ARM template.
- Navigate to: https://portal.azure.com/#create/Microsoft.Template
- Click `Build your own template in editor`, copy-paste the content of the ARM template into the editor, then click `Save`.
- Resource group - we recommend creating a new `Resource Group` for the deployment, as this makes it easier to clean up all the resources created by the deployment later
- Specify SSH Public Key
- SMTP Server Address/User/Password/From - these are optional; fill this section in to receive cluster-related alerts through email.
- Slack Webhook Url/Channel - this section is optional; complete it to receive cluster-related alerts through a Slack push notification channel.
- Banzai Pipeline Credentials - specify the user name and password for accessing the Pipeline API; these are needed again later when setting the repository secrets
- Prometheus Dashboard
  - Prometheus Password - specify the password for accessing Prometheus, which collects cluster metrics
- Grafana Dashboard
  - Grafana Dashboard Password - specify the password for accessing the Grafana dashboard, which comes with application-specific defaults
- Banzai-Ci Credentials
  - Orgs - comma-separated list of GitHub organizations whose members are granted access to Banzai Cloud Pipeline's CI/CD workflow
  - Github Client - GitHub OAuth `Client Id`
  - Github Secret - GitHub OAuth `Client Secret`
- Azure Credentials and Information
  - Azure Client Id - see how to get the Azure Client Id above
  - Azure Client Secret - see how to get the Azure Client Secret above
  - Azure Subscription Id - your Azure Subscription Id
  - Azure Tenant Id - see how to get the Azure Tenant Id above
- Finish the wizard to create a `Control Plane` instance.
- Open the `Resource Group` that was specified for the deployment.
- Take note of the PublicIP of the deployed `Control Plane`.
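The same deployment can also be scripted with the Azure CLI instead of the portal wizard. A minimal sketch, assuming you saved the ARM template locally as `control-plane-arm.json` (the file name, resource-group name, and location are illustrative):

```sh
# Sketch: deploy the ARM template from the CLI (names are illustrative).
az group create --name pipeline-control-plane --location westeurope
az group deployment create \
  --resource-group pipeline-control-plane \
  --template-file control-plane-arm.json
```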
The steps of the workflow executed by the CI/CD flow are described in the `.pipeline.yml` file, which must be placed in the root directory of the source code of the Spark application. The file has to be pushed into the GitHub repo along with the source files of the application.
There is an example Spark application, spark-pi-example, that can be used for trying out the CI/CD pipeline. (Note: fork this repository into your own account for this purpose!)
To set up your own Spark application for the workflow, start from the `.pipeline.yml` configuration file of spark-pi-example and customize it. The following sections need to be modified:
- the command for building your application

```yaml
remote_build:
  ...
  original_commands:
    - mvn clean package -s settings.xml
```

- the Main class of your application

```yaml
run:
  ...
  spark_class: banzaicloud.SparkPi
```

- the name of your application

```yaml
run:
  ...
  spark_app_name: sparkpi
```

- the application artifact - the relative path to the `jar` of your Spark application, i.e. the `jar` generated by the build command

```yaml
run:
  ...
  spark_app_source: target/spark-pi-1.0-SNAPSHOT.jar
```

- the application arguments

```yaml
run:
  ...
  spark_app_args: 1000
```
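Putting the pieces together, a customized file might look like the sketch below. This is only an assumption about the overall shape: the real `.pipeline.yml` in spark-pi-example contains additional steps and keys (elided with `...` above), so treat it as orientation rather than a complete file.

```yaml
# Sketch only - the actual file in spark-pi-example has more steps.
remote_build:
  original_commands:
    - mvn clean package -s settings.xml

run:
  spark_class: banzaicloud.SparkPi                      # Main class
  spark_app_name: sparkpi                               # application name
  spark_app_source: target/spark-pi-1.0-SNAPSHOT.jar    # built artifact
  spark_app_args: 1000                                  # application arguments
```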
Navigate to `http://{control_plane_public_ip}` in your web browser and grant access to the organizations that contain the GitHub repositories you want to hook into the CI/CD workflow, then click authorize access.
All the services of the Pipeline may take some time to fully initialize, so the page may not load at first. Please give it some time and retry.
Navigate to `http://{control_plane_public_ip}` - this brings you to the CI/CD user interface. Select `Repositories` from the top left menu; this lists all the repositories that the Pipeline has access to. Select the repositories you want to hook into the CI/CD flow.
For the hooked repositories, set the following secrets:
- `plugin_endpoint` - specify `http://{control_plane_public_ip}/pipeline/api/v1`
- `plugin_username` - specify the same user name as for the Banzai Pipeline Credentials
- `plugin_password` - specify the same password as for the Banzai Pipeline Credentials
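Before wiring the secrets in, it can be worth checking that the endpoint is reachable from your machine. A quick, hypothetical sanity check - basic auth and the exact response are assumptions that depend on the Pipeline version:

```sh
# Hypothetical check - substitute your Control Plane IP and the
# user name/password chosen for the Banzai Pipeline Credentials.
curl -i -u "<plugin_username>:<plugin_password>" \
  "http://{control_plane_public_ip}/pipeline/api/v1"
```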
Modify the source code of your Spark application, commit the changes, and push them to the repository on GitHub. The Pipeline is notified of the commits through GitHub webhooks and triggers the flow described in the `.pipeline.yml` file of the watched repositories.
The running CI/CD jobs can be monitored and managed at `http://{control_plane_public_ip}/account/repos`. To check the logs of the CI/CD workflow steps, click on the desired commit message in the UI.
Once configured, the Spark application will be built, deployed, and executed for every commit pushed to the project's repository. The progress of the workflow can be followed by clicking the small orange dot beside the commit on the GitHub UI.
Our Git repositories with example projects that contain Pipeline workflow configurations: