Building Hive on MR3

Downloading an MR3 release

Download an MR3 release and uncompress it. We rename the new directory to mr3-run.

Hive 3 on MR3 with Java 17

$ wget https://github.com/mr3project/mr3-release/releases/download/v1.11/hivemr3-1.11-java17-hive3.1.3-k8s.tar.gz
$ gunzip -c hivemr3-1.11-java17-hive3.1.3-k8s.tar.gz | tar xvf -
$ mv hivemr3-1.11-java17-hive3.1.3-k8s mr3-run
$ cd mr3-run
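The file name in the commands above is made up of the MR3 version, the Java version, and the Hive version. As a minimal sketch, assuming that other releases follow the same naming pattern as v1.11 (which this page does not confirm), a small helper can assemble the directory name and the download URL so that the wget, gunzip, and mv commands stay consistent. The function names mr3_release_name and mr3_release_url and their parameters are hypothetical and not part of the MR3 release.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the MR3 release.
# Assumption: other releases follow the same naming pattern as v1.11.

# Print the name of the uncompressed directory.
# $1 = MR3 version, $2 = Java version, $3 = Hive version
mr3_release_name() {
  printf 'hivemr3-%s-java%s-hive%s-k8s\n' "$1" "$2" "$3"
}

# Print the download URL of the corresponding tarball.
mr3_release_url() {
  name=$(mr3_release_name "$1" "$2" "$3")
  printf 'https://github.com/mr3project/mr3-release/releases/download/v%s/%s.tar.gz\n' "$1" "$name"
}

# Reproduces the URL used in the wget command above.
mr3_release_url 1.11 17 3.1.3
```

The output of mr3_release_url can be passed to wget, and the output of mr3_release_name gives the directory to rename to mr3-run.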

Then the user can rebuild Hive on MR3 from the source code of two additional components: Tez for MR3 and Hive for MR3.

Cloning GitHub repositories

For Tez for MR3, clone the GitHub repository (https://github.com/mr3project/tez-mr3.git) and check out the branch corresponding to the MR3 release.

Tez for MR3

$ git clone https://github.com/mr3project/tez-mr3.git -b master-java17 --single-branch tez-mr3

For Hive for MR3, clone the GitHub repository (https://github.com/mr3project/hive-mr3.git) and check out the branch corresponding to the Hive version.

Hive 3 for MR3

$ git clone https://github.com/mr3project/hive-mr3.git -b master3 --single-branch hive-mr3

Setting environment variables

Set the following environment variables in env.sh in the MR3 release to specify the directories of the source code. The values below assume that both repositories were cloned into the home directory.

$ vi mr3-run/env.sh

TEZ_SRC=~/tez-mr3

HIVE3_SRC=~/hive-mr3
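Before compiling, it can help to confirm that the two variables point to real source trees. The following is a minimal sketch of such a check, not something shipped with the MR3 release: the function name check_src_dir is hypothetical, and the test for a top-level pom.xml is an assumption based on the fact that both projects are built with Maven.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of the MR3 release.
# Assumption: each source tree has a pom.xml in its top-level directory.

# $1 = variable name (used only in messages), $2 = directory path
check_src_dir() {
  if [ ! -d "$2" ]; then
    echo "$1: directory not found: $2" >&2
    return 1
  fi
  if [ ! -f "$2/pom.xml" ]; then
    echo "$1: no pom.xml in $2" >&2
    return 1
  fi
  echo "$1: ok ($2)"
}

# Example usage with the values set in env.sh:
#   check_src_dir TEZ_SRC "$HOME/tez-mr3"
#   check_src_dir HIVE3_SRC "$HOME/hive-mr3"
```

The function returns a non-zero status on failure, so it can be combined with && in front of the compile commands.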

Compiling

Because of the compilation dependency between Hive and Tez, the user should first rebuild Tez for MR3 and then Hive for MR3. To compile Tez for MR3, execute tez/compile-tez.sh in the MR3 release. To access Amazon S3 (on Amazon EMR or EKS), add the option -P aws.

Tez for MR3

$ mr3-run/tez/compile-tez.sh

Tez for MR3 with access to Amazon S3

$ mr3-run/tez/compile-tez.sh -P aws

To compile Hive for MR3, execute hive/compile-hive.sh in the MR3 release with the following option:

Hive 3 for MR3

$ mr3-run/hive/compile-hive.sh --hivesrc3

The user can append as many Maven options as necessary to the command. These scripts invoke Maven to compile the source code, and automatically update the local Maven repository as well as the hive/hivejar and tez/tezjar directories in the MR3 release. On Hadoop, they also upload the new jar files to HDFS, so the user does not need to execute mr3/upload-hdfslib-mr3.sh and tez/upload-hdfslib-tez.sh in the MR3 release later.
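The build order described above (Tez for MR3 first, then Hive for MR3) can be captured in a small wrapper that stops at the first failure. This is only a sketch under stated assumptions: the names build_all, run_step, MR3_RUN, and DRY_RUN are invented for illustration, and the wrapper forwards its arguments to compile-tez.sh only, as in the -P aws example.

```shell
#!/bin/sh
# Hypothetical wrapper; not part of the MR3 release.
# MR3_RUN is the path to the MR3 release directory (defaults to mr3-run).
MR3_RUN=${MR3_RUN:-mr3-run}

# Run a command, or only print it when DRY_RUN=1.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Compile Tez for MR3 first, then Hive 3 for MR3.
# Arguments (e.g. -P aws) are forwarded to compile-tez.sh only.
build_all() {
  run_step "$MR3_RUN/tez/compile-tez.sh" "$@" || return 1
  run_step "$MR3_RUN/hive/compile-hive.sh" --hivesrc3 || return 1
}

# Example usage:
#   DRY_RUN=1 build_all -P aws   # print the two commands without running them
#   build_all                    # run both builds in order
```

Using DRY_RUN=1 first is a cheap way to check the paths before starting a long Maven build.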