Hive on MR3 allows multiple HiveServer2 instances to share a common MR3 DAGAppMaster. As an application, we can build a Kubernetes cluster which runs multiple HiveServer2 instances each with its own Metastore instance, as shown in the following diagram:
All HiveServer2 instances share common services such as Timeline Server, Ranger, KDC, and KMS, and send their queries to the same DAGAppMaster, which manages a common pool of ContainerWorker Pods. It is also easy to add a new HiveServer2 instance for another Metastore instance, or to remove an existing HiveServer2 instance, without affecting the other HiveServer2 instances. In this way, we can simulate a serverless environment that achieves resource pooling (by sharing ContainerWorker Pods) and rapid elasticity (by adding and removing HiveServer2 instances).
Adding a new HiveServer2 instance
To run multiple HiveServer2 instances sharing a common MR3 DAGAppMaster,
the user should keep the value of MR3_SHARED_SESSION_ID
generated by the first HiveServer2 instance.
$ kubernetes/run-hive.sh
...
MR3_SHARED_SESSION_ID=59eb6c06-655e-48ad-b881-28516fd8d13c
...
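One way to keep the value is to save the script output and extract the relevant line from it. The following is a minimal sketch, assuming the output contains a line of the form MR3_SHARED_SESSION_ID=<value> starting at the beginning of the line, as in the sample above. The function name extract_session_id and the file names run-hive.log and session-id.txt are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read text on standard input and print the value from
# the first line of the form MR3_SHARED_SESSION_ID=<value>.
extract_session_id() {
  sed -n 's/^MR3_SHARED_SESSION_ID=//p' | head -n 1
}

# Example usage (commented out because it starts HiveServer2):
#   kubernetes/run-hive.sh | tee run-hive.log
#   extract_session_id < run-hive.log > session-id.txt
```

Saving the value to a file makes it available later, when additional HiveServer2 instances are started from a different shell.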
To add a new HiveServer2 instance, the user should check the following list:
- In kubernetes/yaml/metastore.yaml and kubernetes/yaml/metastore-service.yaml, use a new name for StatefulSet, Service, and labels:

  $ vi kubernetes/yaml/metastore.yaml
  metadata:
    name: hivemr3-metastore2
  spec:
    serviceName: metastore2
    selector:
      matchLabels:
        hivemr3_app: metastore2
    template:
      metadata:
        name: hivemr3-metastore2

  $ vi kubernetes/yaml/metastore-service.yaml
  metadata:
    name: metastore2
  spec:
    selector:
      hivemr3_app: metastore2
- In kubernetes/env.sh, change HIVE_DATABASE_HOST to specify the address of the MySQL database for the new HiveServer2 instance. Update HIVE_METASTORE_HOST to use the new Service and label.

  $ vi kubernetes/env.sh
  HIVE_DATABASE_HOST=indigo1
  HIVE_METASTORE_HOST=hivemr3-metastore2-0.metastore2.hivemr3.svc.cluster.local
- In kubernetes/yaml/hive.yaml and kubernetes/yaml/hiveserver2-service.yaml, use a new name for Deployment, Service, and labels:

  $ vi kubernetes/yaml/hive.yaml
  metadata:
    name: hivemr3-hiveserver2-2
  spec:
    selector:
      hivemr3_app: hiveserver2-2
    template:
      metadata:
        labels:
          hivemr3_app: hiveserver2-2

  $ vi kubernetes/yaml/hiveserver2-service.yaml
  metadata:
    name: hiveserver2-2
  spec:
    selector:
      hivemr3_app: hiveserver2-2
- In kubernetes/yaml/hiveserver2-service.yaml, use a new port for HiveServer2.

  $ vi kubernetes/yaml/hiveserver2-service.yaml
  port: 9853
- Update kubernetes/env.sh, kubernetes/run-metastore.sh, kubernetes/run-hive.sh, kubernetes/yaml/metastore.yaml, and kubernetes/yaml/hive.yaml to use new names for the ConfigMap and the Secret:

  $ vi kubernetes/env.sh
  CONF_DIR_CONFIGMAP=hivemr3-conf-configmap-2

  $ vi kubernetes/run-metastore.sh
  kubectl create -n $MR3_NAMESPACE secret generic env-secret-2 --from-file=$BASE_DIR/env.sh

  $ vi kubernetes/run-hive.sh
  kubectl create -n $MR3_NAMESPACE secret generic env-secret-2 --from-file=$BASE_DIR/env.sh

  $ vi kubernetes/yaml/metastore.yaml
  spec:
    template:
      spec:
        volumes:
        - name: env-k8s-volume
          secret:
            secretName: env-secret-2
        - name: conf-k8s-volume
          configMap:
            name: hivemr3-conf-configmap-2

  $ vi kubernetes/yaml/hive.yaml
  spec:
    template:
      spec:
        volumes:
        - name: env-k8s-volume
          secret:
            secretName: env-secret-2
        - name: conf-k8s-volume
          configMap:
            name: hivemr3-conf-configmap-2
- If necessary, update kubernetes/conf/ranger-hive-security.xml to use a different Hive service for Ranger.

  $ vi kubernetes/conf/ranger-hive-security.xml
  <property>
    <name>ranger.plugin.hive.service.name</name>
    <value>INDIGO_hive2</value>
  </property>
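The value of HIVE_METASTORE_HOST in the list above follows the standard Kubernetes DNS pattern for a Pod of a StatefulSet behind its governing Service: <pod name>.<service name>.<namespace>.svc.cluster.local. The following sketch shows how the address is derived from the new names. It assumes the default cluster domain cluster.local and the namespace hivemr3 (inferred from the sample address); the function name metastore_host is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the DNS name of Pod 0 of a StatefulSet.
#   $1 = StatefulSet name, $2 = Service name, $3 = namespace
# Assumes the default cluster domain cluster.local.
metastore_host() {
  printf '%s-0.%s.%s.svc.cluster.local\n' "$1" "$2" "$3"
}

# Names taken from the list above; prints the address used for HIVE_METASTORE_HOST.
metastore_host hivemr3-metastore2 metastore2 hivemr3
```

If the StatefulSet name or the Service name is changed again for a third instance, the address in kubernetes/env.sh should change accordingly.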
Now the user can execute the script kubernetes/run-hive.sh
to start a new HiveServer2 instance.
Be sure to set MR3_SHARED_SESSION_ID
to the value generated by the first HiveServer2 instance.
$ export MR3_SHARED_SESSION_ID=59eb6c06-655e-48ad-b881-28516fd8d13c
$ kubernetes/run-hive.sh
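Because starting HiveServer2 without the shared value would not attach it to the existing DAGAppMaster, it can help to check the variable before running the script. The following is a minimal sketch; the function name is_session_id is hypothetical, and the check assumes that the ID always has the 8-4-4-4-12 lowercase hexadecimal shape of the sample value, which the text above does not guarantee.

```shell
#!/bin/sh
# Hypothetical guard: succeed only if the argument has the same shape as the
# sample ID (8-4-4-4-12 lowercase hexadecimal groups). This format is an
# assumption based on the example value.
is_session_id() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Example usage (commented out because it starts HiveServer2):
#   is_session_id "$MR3_SHARED_SESSION_ID" && kubernetes/run-hive.sh
```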