docker-spark

Docker Image for Apache Spark

GitHub version License: MIT

Overview

The image is built on top of OpenJDK (8-jdk) and installs Apache Spark 2.4.4, the latest supported version. SSH is also installed and configured for password-less login. Password-less SSH is required when deploying Apache Spark in cluster mode with the standalone launch scripts, which use it to start workers from the master.

Running in Local mode:

Apache Spark can be executed in local mode without any additional setup.

Start a container from the image with docker run, passing /bin/bash as the command to open an interactive shell session:

docker run -it docker-spark /bin/bash

Once the interactive shell session is running inside the container, start the Spark shell to execute Scala code against Spark in local mode:

$SPARK_HOME/bin/spark-shell --master local[2]
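The value passed to --master controls how many worker threads local mode uses: local[2] runs Spark with two threads, and local[*] uses one thread per logical core. As a minimal sketch, the helper below builds that master URL from a thread count. The function name local_master is hypothetical and is not shipped with the image; the sketch also assumes SPARK_HOME is set inside the container, as the command above implies.

```shell
#!/bin/sh
# Hypothetical helper (not part of the docker-spark image):
# print a local-mode master URL for the given number of worker threads.
# With no argument it falls back to local[*], one thread per logical core.
local_master() {
  threads="${1:-*}"
  case "$threads" in
    '*') ;;  # literal asterisk: let Spark pick the thread count
    ''|0|*[!0-9]*)
      echo "thread count must be a positive integer or *" >&2
      return 1
      ;;
  esac
  printf 'local[%s]\n' "$threads"
}

# Example use inside the container (assumes SPARK_HOME is set):
#   "$SPARK_HOME/bin/spark-shell" --master "$(local_master 4)"
```

Quoting the master URL, as in the usage comment, keeps the shell from treating the square brackets as a glob pattern if a matching file name happens to exist in the working directory.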

Supported Apache Spark Versions:

Apache Spark latest [v2.4.4]

Dockerfile for Apache Spark v2.4.4
Dockerfile for Apache Spark v2.3.4