Running Multiple HiveServer2 Instances
Overview
Hive on MR3 allows multiple HiveServer2 instances to share a common MR3 DAGAppMaster. As an application, we can build a Kubernetes cluster which runs multiple HiveServer2 instances each with its own Metastore instance.
All HiveServer2 instances share common services such as Ranger and Timeline Server, and send their queries to the same DAGAppMaster managing a common pool of ContainerWorker Pods. It is also easy to add a new HiveServer2 instance for another Metastore instance and to remove an existing HiveServer2 instance without affecting other HiveServer2 instances. In this way, we can simulate a serverless environment which achieves resource pooling and rapid elasticity through sharing ContainerWorker Pods and adding/removing HiveServer2 instances.
Adding a new HiveServer2 instance
In order to run multiple HiveServer2 instances sharing a common MR3 DAGAppMaster, the user should keep the value of MR3_SHARED_SESSION_ID generated by the first HiveServer2 instance.
$ ./run-hive.sh
...
MR3_SHARED_SESSION_ID=59eb6c06-655e-48ad-b881-28516fd8d13c
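Since this value is needed again when starting further instances, it may help to record it at the time the first instance starts. The following is a minimal sketch, assuming that run-hive.sh prints the MR3_SHARED_SESSION_ID=... line on standard output in the form shown above. The function name extract_session_id and the file names run-hive.log and session-id.txt are hypothetical and not part of Hive on MR3.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive on MR3): read text on stdin and print
# the value of the first line of the form MR3_SHARED_SESSION_ID=<value>.
extract_session_id() {
  sed -n 's/^MR3_SHARED_SESSION_ID=//p' | head -n 1
}

# Example usage (assumes run-hive.sh prints the line on standard output):
#   ./run-hive.sh | tee run-hive.log
#   extract_session_id < run-hive.log > session-id.txt
```

If the line is printed with leading whitespace or on standard error, the pattern and the redirection would need to be adjusted accordingly.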
To add a new HiveServer2 instance, the user should check the following list.
- In yaml/metastore.yaml and yaml/metastore-service.yaml, use a new name for StatefulSet, Service, and labels:

  vi yaml/metastore.yaml
  metadata:
    name: hivemr3-metastore2
  spec:
    serviceName: metastore2
    selector:
      matchLabels:
        hivemr3_app: metastore2
    template:
      metadata:
        name: hivemr3-metastore2

  vi yaml/metastore-service.yaml
  metadata:
    name: metastore2
  spec:
    selector:
      hivemr3_app: metastore2

- In env.sh, change HIVE_DATABASE_HOST to specify the address of the MySQL database for the new HiveServer2 instance. Update HIVE_METASTORE_HOST to use the new Service and label.

  vi env.sh
  HIVE_DATABASE_HOST=indigo1
  HIVE_METASTORE_HOST=hivemr3-metastore2-0.metastore2.hivemr3.svc.cluster.local

- In yaml/hive.yaml and yaml/hiveserver2-service.yaml, use a new name for Deployment, Service, and labels:

  vi yaml/hive.yaml
  metadata:
    name: hivemr3-hiveserver2-2
  spec:
    selector:
      hivemr3_app: hiveserver2-2
    template:
      metadata:
        labels:
          hivemr3_app: hiveserver2-2

  vi yaml/hiveserver2-service.yaml
  metadata:
    name: hiveserver2-2
  spec:
    selector:
      hivemr3_app: hiveserver2-2

- In yaml/hiveserver2-service.yaml, use a new port for HiveServer2.

  vi yaml/hiveserver2-service.yaml
  port: 9853

- Update env.sh, run-metastore.sh, run-hive.sh, yaml/metastore.yaml, and yaml/hive.yaml to use new names for the ConfigMap and the Secret:

  vi env.sh
  CONF_DIR_CONFIGMAP=hivemr3-conf-configmap-2

  vi run-metastore.sh
  kubectl create -n $MR3_NAMESPACE secret generic env-secret-2 --from-file=$BASE_DIR/env.sh

  vi run-hive.sh
  kubectl create -n $MR3_NAMESPACE secret generic env-secret-2 --from-file=$BASE_DIR/env.sh

  vi yaml/metastore.yaml
  spec:
    template:
      spec:
        volumes:
        - name: env-k8s-volume
          secret:
            secretName: env-secret-2
        - name: conf-k8s-volume
          configMap:
            name: hivemr3-conf-configmap-2

  vi yaml/hive.yaml
  spec:
    template:
      spec:
        volumes:
        - name: env-k8s-volume
          secret:
            secretName: env-secret-2
        - name: conf-k8s-volume
          configMap:
            name: hivemr3-conf-configmap-2

- If necessary, update conf/ranger-hive-security.xml to use a different Hive service for Ranger.

  vi conf/ranger-hive-security.xml
  <property>
    <name>ranger.plugin.hive.service.name</name>
    <value>INDIGO_hive2</value>
  </property>
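Because the same new names must appear in several files, a quick search can catch a missed reference before launching. The following is a minimal sketch, assuming the example names used in the list (env-secret-2 and hivemr3-conf-configmap-2). The function name check_refs is hypothetical and not part of Hive on MR3.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive on MR3): check_refs NAME FILE...
# Reports every FILE that does not contain NAME as a fixed string and
# returns non-zero if at least one file is missing it.
check_refs() {
  name=$1
  shift
  status=0
  for f in "$@"; do
    if ! grep -qF -- "$name" "$f"; then
      echo "missing '$name' in $f" >&2
      status=1
    fi
  done
  return $status
}

# Example usage with the names from the list above:
#   check_refs env-secret-2 run-metastore.sh run-hive.sh yaml/metastore.yaml yaml/hive.yaml
#   check_refs hivemr3-conf-configmap-2 env.sh yaml/metastore.yaml yaml/hive.yaml
```

The check is a plain substring match, so it only confirms that each file mentions the new name somewhere, not that every old reference has been replaced.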
Now the user can execute the script run-hive.sh to start a new HiveServer2 instance. Before executing it, set MR3_SHARED_SESSION_ID to the value generated by the first HiveServer2 instance.
export MR3_SHARED_SESSION_ID=59eb6c06-655e-48ad-b881-28516fd8d13c
./run-hive.sh
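If MR3_SHARED_SESSION_ID is unset or mistyped, the new instance would presumably not join the existing session, so a small guard before launching may be useful. The following is a minimal sketch, assuming that the session ID always has the UUID-like shape of the example value. The function name is_session_id is hypothetical and not part of Hive on MR3.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive on MR3): succeed only if the argument
# looks like the example session ID, i.e. 8-4-4-4-12 lowercase hexadecimal
# digits. The assumption that all session IDs have this shape is ours.
is_session_id() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'
}

# Example usage:
#   export MR3_SHARED_SESSION_ID=59eb6c06-655e-48ad-b881-28516fd8d13c
#   is_session_id "$MR3_SHARED_SESSION_ID" && ./run-hive.sh
```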