This page shows how to build an all-in-one Docker image that bundles Hive 3 on MR3, Apache Ranger, and Timeline Server.

Install Hive on MR3

Download a Hadoop binary distribution and uncompress it. In this example, we use an MR3 release based on Hadoop 2.7, so Hadoop 2.7.7 is a suitable choice.

$ wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.7/hadoop-2.7.7.tar.gz
$ gunzip -c hadoop-2.7.7.tar.gz | tar xvf -

Download a pre-built MR3 release and uncompress it. Below we choose a pre-built MR3 release based on Hive 3.1.2.

$ wget https://github.com/mr3project/mr3-release/releases/download/v1.1/hivemr3-1.1-hive3.1.2.tar.gz
$ gunzip -c hivemr3-1.1-hive3.1.2.tar.gz | tar xvf -
$ mv hivemr3-1.1-hive3.1.2 mr3-run
$ cd mr3-run

Set the environment variable HADOOP_HOME in env.sh so that it points to the installation directory of the Hadoop binary distribution.

$ vi env.sh

export HADOOP_HOME=/home/gla/hadoop-2.7.7
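Before running the MR3 scripts, it can help to confirm that the path set in env.sh really is an unpacked Hadoop distribution. The following is a minimal sketch and not part of the MR3 release: check_hadoop_home is a hypothetical helper, and it assumes only that the distribution contains an executable bin/hadoop.

```shell
#!/bin/sh
# Hypothetical helper (not part of MR3): succeeds if the given directory
# looks like an unpacked Hadoop binary distribution.
check_hadoop_home() {
  dir="$1"
  if [ -z "$dir" ]; then
    echo "Hadoop directory is not set" >&2
    return 1
  fi
  # A Hadoop binary distribution ships the launcher script bin/hadoop.
  if [ ! -x "$dir/bin/hadoop" ]; then
    echo "no executable bin/hadoop under $dir" >&2
    return 1
  fi
  echo "Hadoop found in $dir"
}
```

For the example above, the helper would be called as check_hadoop_home /home/gla/hadoop-2.7.7 after editing env.sh.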

Build Timeline Server

Download the source code of Timeline Server from the timeline-mr3 repository (Timeline Server for MR3 on Kubernetes), using the master-mr3 branch. It is a variant of the Timeline Server included in the Hadoop 2.7.7 distribution.

$ git clone https://github.com/mr3project/timeline-mr3.git

Then set the following environment variable in env.sh to specify the directory of the source code:

# Timeline-MR3 source directory
ATS_MR3_SRC=~/timeline-mr3

To compile Timeline Server, install the Protobuf 2.5 compiler (protoc) and execute ats/compile-ats.sh.

$ ats/compile-ats.sh
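Because the build expects Protobuf 2.5, a quick version check before compiling can save a failed run. The sketch below is an illustration only: is_protobuf_25 is a hypothetical helper, and it assumes that protoc --version prints a line of the form "libprotoc 2.5.0".

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a `protoc --version` line reports a 2.5.x release.
is_protobuf_25() {
  case "$1" in
    "libprotoc 2.5."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Query the local protoc, if any; an empty string fails the check.
version_line=$(protoc --version 2>/dev/null || true)
if is_protobuf_25 "$version_line"; then
  echo "protoc OK: $version_line"
else
  echo "protoc 2.5.x not found (got: '${version_line:-none}')"
fi
```

If the check reports a different version, install Protobuf 2.5 and make sure its protoc comes first in PATH before running ats/compile-ats.sh.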

Build an all-in-one Docker image

To build an all-in-one Docker image, execute build-docker.sh.

$ ./build-docker.sh

The script downloads the source code of Ranger and builds it, which may take up to a few hours. When the script finishes, a new Docker image named hivemr3-all is available.

$ docker images
REPOSITORY                   TAG                 IMAGE ID            CREATED             SIZE
hivemr3-all                  latest              266818673b04        28 seconds ago      1.32GB
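The same check can be scripted, for example at the end of an automated build. This is a minimal sketch under stated assumptions: has_image is a hypothetical helper, and the tag latest is taken from the listing above.

```shell
#!/bin/sh
# Hypothetical helper: reads "repository:tag" lines on stdin and succeeds
# if one of them matches the given image reference exactly.
has_image() {
  grep -qxF "$1"
}

# With Docker available, list local images in repository:tag form and check.
if command -v docker >/dev/null 2>&1; then
  if docker images --format '{{.Repository}}:{{.Tag}}' | has_image "hivemr3-all:latest"; then
    echo "hivemr3-all:latest is present"
  else
    echo "hivemr3-all:latest is missing"
  fi
fi
```

Matching the whole line (grep -x) with a fixed string (grep -F) avoids false positives from images whose names merely contain hivemr3-all.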