Using a pre-built Docker image
To use a pre-built Docker image from Docker Hub, the user only needs the MR3 release, which contains all the executable scripts and is available from the GitHub repository (https://github.com/mr3project/mr3-run-k8s). Clone the repository:
$ git clone https://github.com/mr3project/mr3-run-k8s.git
That’s all you need!
We recommend the quick start guide On Kubernetes, which demonstrates how to use a pre-built Docker image.
Building a new Docker image (Optional)
The user can build a Docker image for running Spark on MR3 on Kubernetes.
We assume that the user can execute the command docker so as to build a Docker image.
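Before building, it can help to confirm that docker is actually available on the PATH. The following is a minimal sketch; the helper name have_command is hypothetical and not part of MR3.

```shell
# have_command NAME: succeeds if NAME resolves to a command on PATH.
# (Hypothetical helper for illustration; not part of the MR3 scripts.)
have_command() {
  command -v "$1" >/dev/null 2>&1
}

if have_command docker; then
  echo "docker found: $(command -v docker)"
else
  echo "docker not found on PATH" >&2
fi
```

Finding docker on the PATH does not guarantee that the current user may talk to the Docker daemon. Running docker info is one way to check that as well.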
The first step is to set up Spark on MR3.
The next step is to collect all necessary files in the directory kubernetes/spark by executing
which copies the script and jar files from the Spark installation (specified by
$ ls kubernetes/spark/mr3/mr3lib/ # MR3 jar file
$ ls kubernetes/spark/spark/sparkmr3/ # Spark-MR3 jar file
$ ls kubernetes/spark/spark/bin/ # Spark scripts
$ ls kubernetes/spark/spark/jars/ # Spark jar files
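Instead of inspecting the four directories one by one, the same check can be scripted. The sketch below assumes it is run from the root of the cloned repository. The function names check_nonempty and report_dirs are hypothetical.

```shell
# check_nonempty DIR: succeeds only if DIR exists and has at least one entry.
check_nonempty() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# report_dirs BASE: prints one status line per expected directory under BASE
# and returns non-zero if any of them is missing or empty.
report_dirs() {
  base=$1
  status=0
  for d in mr3/mr3lib spark/sparkmr3 spark/bin spark/jars; do
    if check_nonempty "$base/$d"; then
      echo "ok      $base/$d"
    else
      echo "missing $base/$d"
      status=1
    fi
  done
  return $status
}

report_dirs kubernetes/spark || echo "some directories are missing or empty" >&2
```

A "missing" line suggests that the copy step did not complete, or that the Spark installation it reads from is not where the script expects it.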
Next the user should set two environment variables in kubernetes/spark/env.sh (not env.sh in the installation directory):
$ vi kubernetes/spark/env.sh
DOCKER_SPARK_IMG is the full name, including a tag, of the Docker image for running Spark on MR3. The name may include the address of a running Docker server.
DOCKER_USER should match the user specified in
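As an illustration, the two settings might look like the following. This is a sketch that assumes env.sh uses plain shell variable assignments. The server address, image name, tag, and user name are all hypothetical placeholders, not values shipped with MR3.

```shell
# Hypothetical values for illustration only; replace them with your own.
# "registry.example.com:5000" stands in for the address of a Docker server.
DOCKER_SPARK_IMG=registry.example.com:5000/spark-mr3:latest
DOCKER_USER=spark

# Split the full image name into repository and tag at the last colon.
image_repo=${DOCKER_SPARK_IMG%:*}
image_tag=${DOCKER_SPARK_IMG##*:}
echo "repository: $image_repo"
echo "tag:        $image_tag"
```

Splitting at the last colon matters here because the server address may itself contain a colon before the port number. The split only works when the name includes a tag.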
The last step is to build a Docker image from Dockerfile in the directory kubernetes/spark/ by executing