This page shows how to operate Hive on MR3 on Kubernetes with multiple nodes. By following these instructions, the user will learn:

  1. how to start Metastore
  2. how to run Hive on MR3

This scenario has the following prerequisites:

  1. A running Kubernetes cluster is available.
  2. A database server for Metastore is running.
  3. Either HDFS or S3 (or S3-compatible storage) is available for storing the warehouse. For using S3, access credentials are required.
  4. The user can either create a PersistentVolume or store transient data on HDFS or S3. The PersistentVolume should be writable by 1) the user with UID 1000, and 2) the user nobody (corresponding to the root user) if Ranger is to be used for authorization.
  5. Every worker node has an identical set of local directories for storing intermediate data (to be mapped to hostPath volumes). These directories should be writable by the user with UID 1000 because all containers run as a non-root user with UID 1000.
  6. The user can run Beeline to connect to HiveServer2 running at a given address.

We use the pre-built Docker image mr3project/hive3:1.10 available on DockerHub, so the user does not have to build a new Docker image. In our example, we use a MySQL server for Metastore, but Postgres and MS SQL also work.

This scenario should take less than 30 minutes to complete, not including the time for downloading a pre-built Docker image.

If you have any questions, please visit the MR3 Google Group or join MR3 Slack.

Overview

Hive on MR3 uses the following components: Metastore, HiveServer2, MR3 DAGAppMaster, and MR3 ContainerWorkers.

run-k8s-overview.png

  • The user can connect to a public HiveServer2 (via JDBC/ODBC) which is exposed to the outside of the Kubernetes cluster.
  • The user can connect directly to MR3-UI, Grafana, and Ranger.
  • Internally we run a Timeline Server to collect history logs from MR3 DAGAppMaster, and a Prometheus server to collect metrics from MR3 DAGAppMaster.

Hive on MR3 requires three types of storage:

  • Data source such as HDFS or S3
  • PersistentVolume (or HDFS/S3) for storing transient data
  • hostPath volumes for storing intermediate data in ContainerWorker Pods

hive.k8s.volume.setup.png

The hostPath volumes are created and mounted by MR3 at runtime for each ContainerWorker Pod. These hostPath volumes hold intermediate data to be shuffled (by shuffle handlers) between ContainerWorker Pods. In order to create the same set of hostPath volumes for every ContainerWorker Pod, an identical set of local directories should be ready on all worker nodes (where ContainerWorker Pods are created).

Installation

Clone the MR3 release repository containing the executable scripts.

$ git clone https://github.com/mr3project/mr3-run-k8s.git
$ cd mr3-run-k8s/kubernetes/

Configuring Metastore in env.sh

Open env.sh and set environment variables.

$ vi env.sh

HIVE_DATABASE_HOST=192.168.10.1
HIVE_DATABASE_NAME=hive3mr3
HIVE_WAREHOUSE_DIR=s3a://hivemr3/warehouse
HIVE_METASTORE_DB_TYPE=mysql
  • Set HIVE_DATABASE_HOST to the IP address of the MySQL server for Metastore. In our example, the MySQL server is running at the IP address 192.168.10.1.
  • Set HIVE_DATABASE_NAME to the database name for Metastore in the MySQL server.
  • Set HIVE_WAREHOUSE_DIR to the HDFS directory or the S3 bucket storing the warehouse (e.g., hdfs://hdfs.server:8020/hive/warehouse or s3a://hivemr3/hive/warehouse). Note that for using S3, we should use prefix s3a, not s3.
  • Set HIVE_METASTORE_DB_TYPE to the database type for Metastore, which is used as an argument to schematool (mysql for MySQL, postgres for Postgres, mssql for MS SQL).
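Before going further, it may help to confirm that the database server is reachable from the machine where you run the scripts. Below is a quick check, assuming the mysql client is installed locally and using the user name root configured later in conf/hive-site.xml:

$ mysql -h 192.168.10.1 -u root -p -e 'SELECT VERSION();'   # should print the MySQL server version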

Setting host aliases (optional)

To use host names (instead of IP addresses) when configuring Hive on MR3, the user should update:

  1. spec/template/spec/hostAliases field in yaml/metastore.yaml and yaml/hive.yaml
  2. configuration key mr3.k8s.host.aliases in conf/mr3-site.xml

For example, if the environment variables HIVE_DATABASE_HOST and HIVE_WAREHOUSE_DIR in env.sh use the host names orange0 and orange1 with IP addresses 192.168.10.100 and 192.168.10.1 respectively, update as follows:

$ vi yaml/metastore.yaml

spec:
  template:
    spec:
      hostAliases:
      - ip: "192.168.10.100"
        hostnames:
        - "orange0"
      - ip: "192.168.10.1"
        hostnames:
        - "orange1"
$ vi yaml/hive.yaml

spec:
  template:
    spec:
      hostAliases:
      - ip: "192.168.10.100"
        hostnames:
        - "orange0"
      - ip: "192.168.10.1"
        hostnames:
        - "orange1"
$ vi conf/mr3-site.xml

<property>
  <name>mr3.k8s.host.aliases</name>
  <value>orange0=192.168.10.100,orange1=192.168.10.1</value>
</property>

Directory for storing transient data

The user can use either a PersistentVolume or HDFS/S3 to store transient data. To use a PersistentVolume, follow the instructions in 1. Creating and mounting PersistentVolume. To use HDFS/S3, follow the instructions in 2. Using HDFS/S3.

For running Hive on MR3 in production, using PersistentVolume is recommended.

1. Creating and mounting PersistentVolume

yaml/workdir-pv.yaml creates a PersistentVolume. The user should update it to use a desired type of PersistentVolume. In our example, we create a PersistentVolume using NFS, where the NFS server runs at 192.168.10.1 and exports the directory /home/nfs/hivemr3. The PersistentVolume should be writable by the user with UID 1000.

$ vi yaml/workdir-pv.yaml

spec:
  persistentVolumeReclaimPolicy: Delete
  nfs:
    server: "192.168.10.1"
    path: "/home/nfs/hivemr3"

yaml/workdir-pvc.yaml creates a PersistentVolumeClaim which references the PersistentVolume created by yaml/workdir-pv.yaml. The user should specify the size of the storage to be used by Hive on MR3:

$ vi yaml/workdir-pvc.yaml

spec:
  resources: 
    requests:
      storage: 10Gi

2. Using HDFS/S3

Set the configuration keys hive.exec.scratchdir and hive.query.results.cache.directory in conf/hive-site.xml to point to the directory on HDFS or S3 for storing transient data. Note that for using S3, we should use prefix s3a, not s3.

$ vi conf/hive-site.xml

<property>
  <name>hive.exec.scratchdir</name>
  <value>s3a://hivemr3/workdir</value>
</property>

<property>
  <name>hive.query.results.cache.directory</name>
  <value>s3a://hivemr3/workdir/_resultscache_</value>
</property>

Open env.sh and update the following two environment variables to empty values because we do not use PersistentVolumes.

$ vi env.sh

WORK_DIR_PERSISTENT_VOLUME_CLAIM=
WORK_DIR_PERSISTENT_VOLUME_CLAIM_MOUNT_DIR=

Set METASTORE_USE_PERSISTENT_VOLUME to false in env.sh.

METASTORE_USE_PERSISTENT_VOLUME=false

Open yaml/metastore.yaml and comment out the following lines:

$ vi yaml/metastore.yaml

# - name: work-dir-volume
#   mountPath: /opt/mr3-run/work-dir/

# - name: work-dir-volume
#   persistentVolumeClaim:
#     claimName: workdir-pvc

Open yaml/hive.yaml and comment out the following lines:

$ vi yaml/hive.yaml

# - name: work-dir-volume
#   mountPath: /opt/mr3-run/work-dir

# - name: work-dir-volume
#   persistentVolumeClaim:
#     claimName: workdir-pvc

Configuring S3 (optional)

For accessing S3-compatible storage, additional configuration keys should be set in conf/core-site.xml. Open conf/core-site.xml and set configuration keys for S3. The configuration key fs.s3a.endpoint should be set to point to the storage server.

$ vi conf/core-site.xml

<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>com.amazonaws.auth.EnvironmentVariableCredentialsProvider</value>
</property>

<property>
  <name>fs.s3a.endpoint</name>
  <value>http://orange0:9000</value>
</property>

<property>
  <name>fs.s3a.path.style.access</name>
  <value>true</value>
</property>

The user may need to change the parameters for accessing S3 to avoid SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool. For more details, see Troubleshooting.
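For example, one common mitigation is to enlarge the S3A connection pool in conf/core-site.xml. The value 2000 below is only an illustration; consult Troubleshooting for values appropriate to your cluster:

$ vi conf/core-site.xml

<property>
  <name>fs.s3a.connection.maximum</name>
  <value>2000</value>
</property>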

When using the class EnvironmentVariableCredentialsProvider to read AWS credentials, add two environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in env.sh.

$ vi env.sh

export AWS_ACCESS_KEY_ID=_your_aws_access_key_id_
export AWS_SECRET_ACCESS_KEY=_your_aws_secret_access_key_
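To quickly verify that the credentials and the endpoint are valid, one can list the bucket outside Kubernetes, e.g. with the AWS CLI (a sketch assuming the AWS CLI is installed; the --endpoint-url flag is only needed for S3-compatible storage):

$ export AWS_ACCESS_KEY_ID=_your_aws_access_key_id_
$ export AWS_SECRET_ACCESS_KEY=_your_aws_secret_access_key_
$ aws s3 ls s3://hivemr3/ --endpoint-url http://orange0:9000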

Note that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are already appended to the values of the configuration keys mr3.am.launch.env and mr3.container.launch.env in conf/mr3-site.xml. For security reasons, the user should NOT write the AWS access key ID and secret access key in conf/mr3-site.xml.

$ vi conf/mr3-site.xml

<property>
  <name>mr3.am.launch.env</name>
  <value>LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/,HADOOP_CREDSTORE_PASSWORD,AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY,USE_JAVA_17</value>
</property>

<property>
  <name>mr3.container.launch.env</name>
  <value>LD_LIBRARY_PATH=/opt/mr3-run/hadoop/apache-hadoop/lib/native,HADOOP_CREDSTORE_PASSWORD,AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY,USE_JAVA_17</value>
</property>

Public IP address for HiveServer2

The file yaml/hiveserver2-service.yaml creates a Service for exposing HiveServer2 to the outside of the Kubernetes cluster. The user should specify a public IP address for HiveServer2 so that clients can connect to it from the outside of the Kubernetes cluster. By default, HiveServer2 uses port 9852 for Thrift transport and port 10001 for HTTP transport. In our example, we expose HiveServer2 at 192.168.10.1.

$ vi yaml/hiveserver2-service.yaml

spec:
  ports:
  - protocol: TCP
    port: 9852
    targetPort: 9852
    name: thrift
  - protocol: TCP
    port: 10001
    targetPort: 10001
    name: http
  externalIPs:
  - 192.168.10.1

A valid host name for the IP address is necessary for secure communication using SSL.
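With this Service in place, a client outside the Kubernetes cluster can connect to the external IP address, e.g. with a local installation of Beeline. Below is a minimal sketch assuming SSL is not enabled; for the HTTP transport, the path cliservice is the Hive default:

$ beeline -u "jdbc:hive2://192.168.10.1:9852/" -n hive
$ beeline -u "jdbc:hive2://192.168.10.1:10001/;transportMode=http;httpPath=cliservice" -n hive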

Resources for Metastore and HiveServer2 Pods

By default, we allocate 16GB of memory and 2 CPUs to Metastore and HiveServer2 Pods. To adjust resources, update env.sh, yaml/hive.yaml, and yaml/metastore.yaml.

$ vi env.sh

HIVE_SERVER2_HEAPSIZE=16384
HIVE_METASTORE_HEAPSIZE=16384
$ vi yaml/hive.yaml

spec:
  template:
    spec:
      containers:
        resources:
          requests:
            cpu: 2
            memory: 16Gi
          limits:
            cpu: 2
            memory: 16Gi
$ vi yaml/metastore.yaml

spec:
  template:
    spec:
      containers:
        resources:
          requests:
            cpu: 2
            memory: 16Gi
          limits:
            cpu: 2
            memory: 16Gi

Resources for DAGAppMaster Pod

By default, we allocate 16GB of memory and 4 CPUs to a DAGAppMaster Pod. To adjust resources, update conf/mr3-site.xml.

$ vi conf/mr3-site.xml

<property>
  <name>mr3.am.resource.memory.mb</name>
  <value>16384</value>
</property>

<property>
  <name>mr3.am.resource.cpu.cores</name>
  <value>4</value>
</property>
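After the DAGAppMaster Pod starts (see Running Metastore and HiveServer2 below), one way to confirm that the new values took effect is to inspect the Pod, e.g. (the Pod name varies per run):

$ kubectl -n hivemr3 describe pod mr3master-1161-0-7c97fb9bd9-2zzlc | grep -A 2 Requests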

Resources for mappers, reducers, and ContainerWorker Pods

Change the resources to be allocated to each mapper, reducer, and ContainerWorker by updating conf/hive-site.xml. In particular, the configuration keys hive.mr3.all-in-one.containergroup.memory.mb and hive.mr3.all-in-one.containergroup.vcores should be adjusted so that a ContainerWorker fits in a worker node. In our example, we allocate 4GB of memory and 1 vcore to each mapper/reducer, and 16GB of memory and 4 vcores to each ContainerWorker Pod, so each ContainerWorker can run up to four mappers/reducers concurrently (16384 MB / 4096 MB = 4).

$ vi conf/hive-site.xml

<property>
  <name>hive.mr3.map.task.memory.mb</name>
  <value>4096</value>
</property>

<property>
  <name>hive.mr3.map.task.vcores</name>
  <value>1</value>
</property>

<property>
  <name>hive.mr3.reduce.task.memory.mb</name>
  <value>4096</value>
</property>

<property>
  <name>hive.mr3.reduce.task.vcores</name>
  <value>1</value>
</property>

<property>
  <name>hive.mr3.all-in-one.containergroup.memory.mb</name>
  <value>16384</value>
</property>
  
<property>
  <name>hive.mr3.all-in-one.containergroup.vcores</name>
  <value>4</value>
</property>

For more details, see Performance Tuning.

Mounting hostPath volumes

As a prerequisite, every worker node where ContainerWorker Pods may run should have an identical set of local directories for storing intermediate data. These directories are mapped to hostPath volumes in each ContainerWorker Pod. Set the configuration key mr3.k8s.pod.worker.hostpaths to the list of local directories in conf/mr3-site.xml.

$ vi conf/mr3-site.xml

<property>
  <name>mr3.k8s.pod.worker.hostpaths</name>
  <value>/data1/k8s,/data2/k8s,/data3/k8s</value>
</property>
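If these directories do not yet exist, a minimal sketch for preparing them (to be run on every worker node; the paths are from our example) is:

$ sudo mkdir -p /data1/k8s /data2/k8s /data3/k8s
$ sudo chown 1000:1000 /data1/k8s /data2/k8s /data3/k8s   # writable by UID 1000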

Docker image

By default, we use the pre-built Docker image mr3project/hive3:1.10. If the user wants to use a different Docker image, check the environment variables DOCKER_HIVE_IMG and DOCKER_HIVE_WORKER_IMG in env.sh and the field spec/template/spec/containers/image in yaml/metastore.yaml and yaml/hive.yaml.

$ vi env.sh

DOCKER_HIVE_IMG=mr3project/hive3:1.10
DOCKER_HIVE_WORKER_IMG=mr3project/hive3:1.10
$ vi yaml/metastore.yaml

spec:
  template:
    spec:
      containers:
      - image: mr3project/hive3:1.10
$ vi yaml/hive.yaml

spec:
  template:
    spec:
      containers:
      - image: mr3project/hive3:1.10

Configuring Metastore

Open conf/hive-site.xml and update the configuration for Metastore as necessary. Below we list some of the configuration keys that the user should check. The two configuration keys javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword should match the user name and password of the MySQL server for Metastore. For simplicity, we disable security on the Metastore side.

$ vi conf/hive-site.xml

<property>
  <name>hive.metastore.db.type</name>
  <value>MYSQL</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.cj.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>passwd</value>
</property>

<property>
  <name>hive.metastore.pre.event.listeners</name>
  <value></value>
</property>
<property>
  <name>metastore.pre.event.listeners</name>
  <value></value>
</property>

For testing purposes, we do not need initiator and cleaner threads in Metastore.

$ vi conf/hive-site.xml

<property>
  <name>hive.compactor.initiator.on</name>
  <value>false</value>
</property>

By default, Metastore does not initialize the schema. To initialize the schema when starting Metastore, add --init-schema as an argument in yaml/metastore.yaml:

$ vi yaml/metastore.yaml

spec:
  template:
    spec:
      containers:
        args: ["start", "--kubernetes", "--init-schema"]

Comment out (or remove) the nodeAffinity section, as we do not use node affinity rules.

$ vi yaml/metastore.yaml 

      affinity:
      # nodeAffinity:
      #   requiredDuringSchedulingIgnoredDuringExecution:
      #     nodeSelectorTerms:
      #     - matchExpressions:
      #       - key: roles
      #         operator: In
      #         values:
      #         - "masters"

Configuring HiveServer2

Check the configuration for authentication and authorization:

$ vi conf/hive-site.xml

<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator</value>
</property>

<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory</value>
</property>

Running Metastore and HiveServer2

Before running Metastore and HiveServer2, the user should make sure that no ConfigMaps or Services exist in the namespace hivemr3. For example, the user may find ConfigMaps and Services left over from a previous run.

$ kubectl get configmaps -n hivemr3
NAME                       DATA   AGE
mr3conf-configmap-master   1      16m
mr3conf-configmap-worker   1      16m

$ kubectl get svc -n hivemr3
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service-master-1237-0   ClusterIP   10.105.238.21   <none>        80/TCP    11m
service-worker          ClusterIP   None            <none>        <none>    11m

In such a case, manually delete these ConfigMaps and Services.

$ kubectl delete configmap -n hivemr3 mr3conf-configmap-master mr3conf-configmap-worker
$ kubectl delete svc -n hivemr3 service-master-1237-0 service-worker

The user can start Metastore by executing the script run-metastore.sh. Before running Metastore, the script automatically downloads a MySQL connector from https://cdn.mysql.com/Downloads/Connector-J/mysql-connector-java-8.0.28.tar.gz. After a Metastore Pod is created, the user can start HiveServer2 by executing the script run-hive.sh.

$ ./run-metastore.sh
namespace/hivemr3 created
persistentvolume/workdir-pv created
persistentvolumeclaim/workdir-pvc created
...
statefulset.apps/hivemr3-metastore created
service/metastore created

$ ./run-hive.sh 
...
CLIENT_TO_AM_TOKEN_KEY=8674e6e1-3621-4565-9fc6-39f20e66e88e
MR3_APPLICATION_ID_TIMESTAMP=31905
MR3_SHARED_SESSION_ID=e35dc8a7-e2a3-4c26-8d08-0505083f4430
ATS_SECRET_KEY=0031e7b5-1ad3-410a-ae17-98df2e6c1654
Error from server (AlreadyExists): configmaps "client-am-config" already exists
deployment.apps/hivemr3-hiveserver2 created
service/hiveserver2 created

These scripts mount the following files inside the Metastore and HiveServer2 Pods:

  • env.sh
  • conf/*

In this way, the user can completely specify the behavior of Metastore and HiveServer2 as well as DAGAppMaster and ContainerWorkers.

In order to make any changes to these files effective, the user should restart Metastore and HiveServer2 after deleting existing ConfigMaps and Services in the namespace hivemr3.
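For example, a minimal restart sequence (using the deletion commands described under Stopping Hive on MR3 below; the Deployment and Service names vary per run) might look like this:

$ kubectl -n hivemr3 delete deployment hivemr3-hiveserver2
$ kubectl -n hivemr3 delete deployment mr3master-1161-0        # name varies per run
$ kubectl -n hivemr3 delete statefulset hivemr3-metastore
$ kubectl -n hivemr3 delete configmap mr3conf-configmap-master mr3conf-configmap-worker
$ kubectl -n hivemr3 delete svc service-master-1161-0 service-worker   # names vary per run
$ ./run-metastore.sh
$ ./run-hive.sh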

HiveServer2 creates a DAGAppMaster Pod. Note, however, that no ContainerWorker Pods are created until queries are submitted. Depending on the configuration of the readiness probe, HiveServer2 may restart once before running normally. In our example, HiveServer2 becomes ready in 80 seconds.

$ kubectl get pods -n hivemr3
NAME                                   READY   STATUS    RESTARTS   AGE
hivemr3-hiveserver2-74bf7fdbdd-9dh5j   1/1     Running   0          82s
hivemr3-metastore-0                    1/1     Running   0          98s
mr3master-1161-0-7c97fb9bd9-2zzlc      1/1     Running   0          69s

The user can check the log of the DAGAppMaster Pod to make sure that it has started properly.

$ kubectl logs -f -n hivemr3 mr3master-1161-0-7c97fb9bd9-2zzlc
...
2022-07-29T14:22:07,407  INFO [DAGAppMaster-1-15] HeartbeatHandler$: Timeout check in HeartbeatHandler:Container
2022-07-29T14:22:07,489  INFO [K8sContainerLauncher-3-1] K8sContainerLauncher: Resynchronizing Pod states for appattempt_1161_0000_000000: 0

Running Beeline

The user may use any client program to connect to HiveServer2. In our example, we run Beeline inside the HiveServer2 Pod.

$ kubectl exec -n hivemr3 -it hivemr3-hiveserver2-74bf7fdbdd-9dh5j -- /bin/bash
hive@hivemr3-hiveserver2-74bf7fdbdd-9dh5j:/opt/mr3-run/hive$ export USER=hive
hive@hivemr3-hiveserver2-74bf7fdbdd-9dh5j:/opt/mr3-run/hive$ /opt/mr3-run/hive/run-beeline.sh
...
Connecting to jdbc:hive2://hivemr3-hiveserver2-74bf7fdbdd-9dh5j:9852/;;;
Connected to: Apache Hive (version 3.1.3)
Driver: Hive JDBC (version 3.1.3)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.3 by Apache Hive
0: jdbc:hive2://hivemr3-hiveserver2-74bf7fdbd> 
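For a quick smoke test, the user can run a few simple queries at the Beeline prompt, e.g. (the table name test_mr3 is only an illustration):

CREATE TABLE test_mr3 (id INT, name STRING);
INSERT INTO test_mr3 VALUES (1, 'hello'), (2, 'mr3');
SELECT name, COUNT(*) FROM test_mr3 GROUP BY name;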

After running a few queries, ContainerWorker Pods are created.

$ kubectl get pods -n hivemr3
NAME                                   READY   STATUS    RESTARTS   AGE
hivemr3-hiveserver2-74bf7fdbdd-9dh5j   1/1     Running   0          5m42s
hivemr3-metastore-0                    1/1     Running   0          5m58s
mr3master-1161-0-7c97fb9bd9-2zzlc      1/1     Running   0          5m29s
mr3worker-2dad-1                       1/1     Running   0          77s
mr3worker-2dad-2                       1/1     Running   0          54s
mr3worker-2dad-3                       1/1     Running   0          46s
mr3worker-2dad-4                       1/1     Running   0          37s
mr3worker-2dad-5                       1/1     Running   0          37s
mr3worker-2dad-6                       1/1     Running   0          37s
mr3worker-2dad-7                       1/1     Running   0          37s
mr3worker-2dad-8                       1/1     Running   0          22s

Stopping Hive on MR3

Delete Deployment for HiveServer2.

$ kubectl -n hivemr3 delete deployment hivemr3-hiveserver2
deployment.apps "hivemr3-hiveserver2" deleted

Deleting the Deployment for HiveServer2 does not automatically terminate the DAGAppMaster Pod. This is a feature, not a bug: it is due to the support for high availability in Hive on MR3. (After setting the environment variable MR3_APPLICATION_ID_TIMESTAMP properly, running run-hive.sh attaches the existing DAGAppMaster Pod to a new HiveServer2 Pod.)

Delete Deployment for DAGAppMaster.

$ kubectl delete deployment -n hivemr3 mr3master-1161-0
deployment.apps "mr3master-1161-0" deleted

Deleting the DAGAppMaster Pod automatically deletes all ContainerWorker Pods as well.

Delete StatefulSet for Metastore.

$ kubectl -n hivemr3 delete statefulset hivemr3-metastore
statefulset.apps "hivemr3-metastore" deleted

After a while, no Pods should be running in the namespace hivemr3. To delete all remaining resources, execute the following commands:

$ kubectl -n hivemr3 delete configmap --all
$ kubectl -n hivemr3 delete svc --all
$ kubectl -n hivemr3 delete secret --all
$ kubectl -n hivemr3 delete serviceaccount --all
$ kubectl -n hivemr3 delete role --all
$ kubectl -n hivemr3 delete rolebinding --all
$ kubectl delete clusterrole node-reader
$ kubectl delete clusterrolebinding hive-clusterrole-binding
$ kubectl -n hivemr3 delete persistentvolumeclaims workdir-pvc
$ kubectl delete persistentvolumes workdir-pv