Running Multiple HiveServer2 Instances
Overview
Hive on MR3 allows multiple HiveServer2 instances to share a common MR3 DAGAppMaster. For example, we can build a Kubernetes cluster that runs multiple HiveServer2 instances, each with its own Metastore instance.
All HiveServer2 instances share common services such as Ranger, and send their queries to the same DAGAppMaster, which manages a common pool of ContainerWorker Pods. It is also easy to add a new HiveServer2 instance for another Metastore instance and to remove an existing HiveServer2 instance without affecting other HiveServer2 instances. In this way, we can simulate a serverless environment that achieves resource pooling and rapid elasticity by sharing ContainerWorker Pods and adding/removing HiveServer2 instances.
Adding a new HiveServer2 instance
In order to run multiple HiveServer2 instances sharing a common MR3 DAGAppMaster,
the user should keep the value of MR3_SHARED_SESSION_ID generated by the first HiveServer2 instance.
$ ./run-hive.sh
...
MR3_SHARED_SESSION_ID=59eb6c06-655e-48ad-b881-28516fd8d13c
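Since the session ID is needed again whenever another HiveServer2 instance is added, it can help to save it rather than copy it by hand. The following is a minimal sketch, assuming that run-hive.sh prints the MR3_SHARED_SESSION_ID=... line at the start of a line on standard output as shown above. The function name extract_session_id and the file names run-hive.log and mr3-session-id.txt are hypothetical and not part of Hive on MR3.

```shell
# Hypothetical helper (not part of Hive on MR3): read text on stdin and
# print the value of the last line of the form MR3_SHARED_SESSION_ID=<value>.
extract_session_id() {
  sed -n 's/^MR3_SHARED_SESSION_ID=//p' | tail -n 1
}

# Possible usage when starting the first HiveServer2 instance
# (file names are illustrative):
#   ./run-hive.sh | tee run-hive.log
#   extract_session_id < run-hive.log > mr3-session-id.txt
```

If the line is not found, the helper prints nothing, so an empty result should be treated as a failure to capture the ID.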
To add a new HiveServer2 instance, the user should make the following changes.
- In yaml/metastore.yaml and yaml/metastore-service.yaml, use a new name for StatefulSet, Service, and labels:

    vi yaml/metastore.yaml
    metadata:
      name: hivemr3-metastore2
    spec:
      serviceName: metastore2
      selector:
        matchLabels:
          hivemr3_app: metastore2
      template:
        metadata:
          name: hivemr3-metastore2

    vi yaml/metastore-service.yaml
    metadata:
      name: metastore2
    spec:
      selector:
        hivemr3_app: metastore2

- In env.sh, change HIVE_DATABASE_HOST to specify the address of the MySQL database for the new HiveServer2 instance. Update HIVE_METASTORE_HOST to match the new StatefulSet and Service names.

    vi env.sh
    HIVE_DATABASE_HOST=indigo1
    HIVE_METASTORE_HOST=hivemr3-metastore2-0.metastore2.hivemr3.svc.cluster.local

- In yaml/hive.yaml and yaml/hiveserver2-service.yaml, use a new name for Deployment, Service, and labels:

    vi yaml/hive.yaml
    metadata:
      name: hivemr3-hiveserver2-2
    spec:
      selector:
        hivemr3_app: hiveserver2-2
      template:
        metadata:
          labels:
            hivemr3_app: hiveserver2-2

    vi yaml/hiveserver2-service.yaml
    metadata:
      name: hiveserver2-2
    spec:
      selector:
        hivemr3_app: hiveserver2-2

- In yaml/hiveserver2-service.yaml, use a new port for HiveServer2.

    vi yaml/hiveserver2-service.yaml
    port: 9853

- Update env.sh, run-metastore.sh, run-hive.sh, yaml/metastore.yaml, and yaml/hive.yaml to use new names for the ConfigMap and Secret:

    vi env.sh
    CONF_DIR_CONFIGMAP=hivemr3-conf-configmap-2

    vi run-metastore.sh
    kubectl create -n $MR3_NAMESPACE secret generic env-secret-2 --from-file=$BASE_DIR/env.sh

    vi run-hive.sh
    kubectl create -n $MR3_NAMESPACE secret generic env-secret-2 --from-file=$BASE_DIR/env.sh

    vi yaml/metastore.yaml
    spec:
      template:
        spec:
          volumes:
          - name: env-k8s-volume
            secret:
              secretName: env-secret-2
          - name: conf-k8s-volume
            configMap:
              name: hivemr3-conf-configmap-2

    vi yaml/hive.yaml
    spec:
      template:
        spec:
          volumes:
          - name: env-k8s-volume
            secret:
              secretName: env-secret-2
          - name: conf-k8s-volume
            configMap:
              name: hivemr3-conf-configmap-2

- If necessary, update conf/ranger-hive-security.xml to use a different Hive service for Ranger.

    vi conf/ranger-hive-security.xml
    <property>
      <name>ranger.plugin.hive.service.name</name>
      <value>INDIGO_hive2</value>
    </property>
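The value of HIVE_METASTORE_HOST above follows the usual Kubernetes DNS pattern for a Pod managed by a StatefulSet behind a headless Service: the Pod name (StatefulSet name plus an ordinal), the Service name, and the namespace. The sketch below derives the host name from those three pieces so that it stays consistent with the names chosen in yaml/metastore.yaml and yaml/metastore-service.yaml. The function name metastore_host is hypothetical, and the sketch assumes a single Metastore replica (ordinal 0) and the default cluster domain cluster.local.

```shell
# Hypothetical helper (not part of Hive on MR3): build the DNS name of the
# first Pod (ordinal 0) of a StatefulSet.
#   $1 = StatefulSet name, $2 = Service name, $3 = namespace
# Assumes the default cluster domain "cluster.local".
metastore_host() {
  printf '%s-0.%s.%s.svc.cluster.local\n' "$1" "$2" "$3"
}

# Names taken from the example above; prints
# hivemr3-metastore2-0.metastore2.hivemr3.svc.cluster.local
metastore_host hivemr3-metastore2 metastore2 hivemr3
```

If the printed name differs from HIVE_METASTORE_HOST in env.sh, one of the renamed fields was probably missed.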
Now the user can start a new HiveServer2 instance by executing the script run-hive.sh
with MR3_SHARED_SESSION_ID set to the value generated by the first HiveServer2 instance.
export MR3_SHARED_SESSION_ID=59eb6c06-655e-48ad-b881-28516fd8d13c
./run-hive.sh
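Because the new instance is meant to join the existing DAGAppMaster, it may be worth refusing to start when the variable is missing. The following is a minimal sketch of such a guard. The function name require_session_id is hypothetical, and the check only tests that the variable is non-empty. It does not verify that the ID belongs to a running DAGAppMaster.

```shell
# Hypothetical guard (not part of Hive on MR3): succeed only if
# MR3_SHARED_SESSION_ID is set to a non-empty value.
require_session_id() {
  if [ -z "${MR3_SHARED_SESSION_ID:-}" ]; then
    echo "MR3_SHARED_SESSION_ID is not set" >&2
    return 1
  fi
  return 0
}

# Possible usage:
#   export MR3_SHARED_SESSION_ID=59eb6c06-655e-48ad-b881-28516fd8d13c
#   require_session_id && ./run-hive.sh
```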