saniyatech/docker-spark

docker-spark

Docker Image for Apache Spark

License: MIT

Overview

The image is built on top of OpenJDK (8-jdk) and installs Apache Spark 2.4.4, the latest version this repository supports. SSH is also installed and configured for password-less login, which is required to deploy Apache Spark in cluster mode.
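For readers unfamiliar with password-less SSH, the following is a minimal sketch of the usual key-based approach: generate a key pair with an empty passphrase and authorize its public key. It is not taken from this repository's Dockerfile, which may do this differently. The SSH_DIR scratch directory is a hypothetical stand-in for ~/.ssh so that the sketch does not touch any real keys.

```shell
# Scratch directory standing in for ~/.ssh (hypothetical; avoids touching real keys).
SSH_DIR="$(mktemp -d)"

# Skip key generation if OpenSSH client tools are not installed on this machine.
if command -v ssh-keygen >/dev/null 2>&1; then
  # Generate an RSA key pair with an empty passphrase (-N "") and no prompts (-q).
  ssh-keygen -q -t rsa -N "" -f "$SSH_DIR/id_rsa"

  # Authorize the new public key for logins to this account.
  cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"

  # sshd typically rejects keys when these permissions are too open.
  chmod 700 "$SSH_DIR"
  chmod 600 "$SSH_DIR/authorized_keys"
fi
```

On a real host the same files would live in ~/.ssh, and the public key would be appended to authorized_keys on every machine that must accept the login.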

Running in local mode:

Apache Spark can be executed in local mode without any additional setup.

Start a container from the image with docker run, passing /bin/bash as the command to open an interactive shell session.

docker run -it docker-spark /bin/bash

Once the interactive shell session is open inside the container, start the Spark shell to run Scala code against Spark in local mode. The local[2] master URL runs Spark locally with two worker threads:

$SPARK_HOME/bin/spark-shell --master local[2]
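As a quick non-interactive check of the local-mode setup, the sketch below builds the master URL for a chosen thread count and pipes a one-line Scala job into the Spark shell. The local_master helper and the MASTER_URL variable are hypothetical names introduced for illustration. The sketch assumes SPARK_HOME is set as it is inside the container, and it only prints a message when Spark is not found. Quoting the master URL keeps shells that treat [2] as a glob pattern from rewriting it.

```shell
# Hypothetical helper: build a local-mode master URL for N worker threads.
local_master() {
  printf 'local[%s]' "$1"
}

MASTER_URL="$(local_master 2)"

# Launch Spark only when SPARK_HOME points at an installation,
# as it is assumed to inside the container.
if [ -x "${SPARK_HOME:-}/bin/spark-shell" ]; then
  # Feed a one-line Scala job to the shell on stdin: sum the integers 1..100.
  echo 'println(sc.parallelize(1 to 100).sum())' \
    | "$SPARK_HOME/bin/spark-shell" --master "$MASTER_URL"
else
  echo "spark-shell not found; would run with --master $MASTER_URL"
fi
```

Changing the argument to local_master adjusts the number of worker threads, for example local_master 4 for four threads.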

Supported Apache Spark Versions:

Apache Spark latest [v2.4.4]

Dockerfile for Apache Spark v2.4.4
Dockerfile for Apache Spark v2.3.4