Using a pre-built Docker image

To use a pre-built Docker image from DockerHub, the user can use an MR3 release, which contains all the executable scripts from the GitHub repository.

We recommend the quick start guide On Kubernetes, which demonstrates how to use a pre-built Docker image.

Building a new Docker image

Alternatively, the user can build a new Docker image for running Spark on MR3 on Kubernetes. We assume that the user can execute the docker command on the machine where the image is built.
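
As a quick sanity check of the assumption above, the following sketch reports whether the docker command is found on PATH. The variable name DOCKER_AVAILABLE is introduced here for illustration only and is not part of MR3.

```shell
# Sketch: check that the docker command can be found before building an image.
# DOCKER_AVAILABLE is a hypothetical helper variable, not an MR3 setting.
if command -v docker >/dev/null 2>&1; then
  DOCKER_AVAILABLE=yes
else
  DOCKER_AVAILABLE=no
fi
echo "docker available: $DOCKER_AVAILABLE"
```

Finding the command does not guarantee that the current user may talk to the Docker daemon, so a failed build can still be a permission problem.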

The first step is to set up Spark on MR3.

The next step is to collect all necessary files in the directory kubernetes/spark by executing which copies the script and jar files from the Spark installation (specified by SPARK_HOME in

$ ls kubernetes/spark/mr3/mr3lib/       # MR3 jar file
$ ls kubernetes/spark/spark/sparkmr3/   # Spark-MR3 jar file
$ ls kubernetes/spark/spark/bin/        # Spark scripts
$ ls kubernetes/spark/spark/jars/       # Spark jar files
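
The four listings above can also be checked in one pass. The sketch below defines a helper, check_collected (a hypothetical name, not a script shipped with MR3), that reports any of the four subdirectories that is missing or empty. It assumes the directory layout shown in the listings.

```shell
# Sketch: verify that the four directories listed above exist and are non-empty.
# check_collected is a hypothetical helper, not part of the MR3 release.
check_collected() {
  base="${1:-kubernetes/spark}"   # directory that holds the collected files
  missing=0
  for d in mr3/mr3lib spark/sparkmr3 spark/bin spark/jars; do
    # ls -A prints nothing for an empty directory and fails for a missing one
    if [ -z "$(ls -A "$base/$d" 2>/dev/null)" ]; then
      echo "missing or empty: $base/$d"
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_collected kubernetes/spark from the root of the MR3 release returns 0 only when every directory contains at least one file.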

Next, the user should set two environment variables in kubernetes/spark/ (not in the installation directory):

$ vi kubernetes/spark/

  • DOCKER_SPARK_IMG is the full name, including a tag, of the Docker image for running Spark on MR3. The name may include the address of a running Docker server.
  • DOCKER_USER should match the user specified in kubernetes/spark/Dockerfile (which is root by default).
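
As an illustration, the two variables might be set as follows. The image name and server address are made-up placeholders, and HAS_TAG is a hypothetical helper variable used only to check that the name carries a tag.

```shell
# Sketch with placeholder values; replace them with your own.
# registry.example.com:5000 is a made-up address, not a real Docker server.
DOCKER_SPARK_IMG=registry.example.com:5000/spark-mr3:latest
DOCKER_USER=root   # should match the user in kubernetes/spark/Dockerfile

# The last path component of the image name should end in ":<tag>".
case "${DOCKER_SPARK_IMG##*/}" in
  *:?*) HAS_TAG=yes ;;
  *)    HAS_TAG=no ;;
esac
echo "image=$DOCKER_SPARK_IMG user=$DOCKER_USER has_tag=$HAS_TAG"
```

The check strips everything up to the last slash first, so a port number in the server address is not mistaken for a tag.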

The last step is to build a Docker image from Dockerfile in the directory kubernetes/spark/ by executing kubernetes/

$ kubernetes/