This page shows how to operate Hive on MR3 on Amazon EKS with autoscaling. By following the instruction, the user will learn:

  1. how to create and configure an EKS cluster for running Hive on MR3 with autoscaling
  2. how to run Hive on MR3 in the EKS cluster

This scenario has the following prerequisites:

  1. The user can create an EKS cluster with the command eksctl.
  2. The user can create IAM policies and EFS.
  3. The user can update IAM policies and configure LoadBalancers.
  4. A MySQL server for Metastore is running and accessible from the EKS cluster.
  5. The MySQL server for Metastore is already populated with Hive databases.
  6. The user has access to an S3 bucket storing the warehouse and all S3 buckets containing datasets.
  7. The user can download a MySQL connector.
  8. The user can run Beeline to connect to HiveServer2 running at a particular address.

Although this page is self-contained, we recommend that the user read the page on Autoscaling and the guide on running Hive on MR3 on Amazon EKS before proceeding. Since Amazon EKS is a particular case of Kubernetes, it also helps to try running Hive on MR3 on Minikube as explained in the page On Minikube with a Pre-built Docker Image.

We use a pre-built Docker image mr3project/hive3:1.0 available at DockerHub, so the user does not have to build a new Docker image. This scenario should take one to two hours to complete, including the time for creating an EKS cluster. This page has been tested with MR3 release 1.0.

Installation

Download an MR3 release containing the executable scripts.

$ wget https://github.com/mr3project/mr3-release/releases/download/v1.0/mr3-1.0-run.tar.gz
$ gunzip -c mr3-1.0-run.tar.gz | tar xvf -;
$ cd mr3-1.0-run/kubernetes

Restore those files created for Amazon EKS.

$ mv -f env.sh.eks env.sh
$ mv -f conf/core-site.xml.eks conf/core-site.xml
$ mv -f conf/mr3-site.xml.eks conf/mr3-site.xml
$ mv -f conf/hive-site.xml.eks conf/hive-site.xml
$ mv -f yaml/metastore.yaml.eks yaml/metastore.yaml
$ mv -f yaml/hive.yaml.eks yaml/hive.yaml
$ mv -f yaml/ranger.yaml.eks yaml/ranger.yaml
$ mv -f yaml/ats.yaml.eks yaml/ats.yaml

IAM policy for autoscaling

Create an IAM policy EKSAutoScalingWorkerPolicy.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup",
                "ec2:DescribeLaunchTemplateVersions"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}
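
For reference, the policy can also be created with the AWS CLI. The following is a sketch which assumes that the policy document above has been saved in a file EKSAutoScalingWorkerPolicy.json (the file name is arbitrary); the command prints the ARN of the new policy, which is used below.

$ aws iam create-policy \
    --policy-name EKSAutoScalingWorkerPolicy \
    --policy-document file://EKSAutoScalingWorkerPolicy.json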

Configuring the EKS cluster (eks/cluster.yaml)

Open eks/cluster.yaml. By default, we create an EKS cluster in the region ap-northeast-2. Change the region if necessary.

metadata:
  region: ap-northeast-2

We create an EKS cluster with two node groups: mr3-master and mr3-worker. The mr3-master node group is intended for HiveServer2, DAGAppMaster, and Metastore Pods, and uses a single on-demand instance of type m5.xlarge for the master node. The mr3-worker node group is intended for ContainerWorker Pods, and uses up to three spot instances of type m5d.xlarge or m5d.2xlarge for worker nodes. Note that worker nodes have instance storage.

nodeGroups:
  - name: mr3-master
    instanceType: m5.xlarge
    desiredCapacity: 1
  - name: mr3-worker
    desiredCapacity: 0
    minSize: 0
    maxSize: 3
    instancesDistribution:
      instanceTypes: ["m5d.xlarge", "m5d.2xlarge"]

The following diagram shows an example of the EKS cluster after launch:

[Figure: example EKS cluster after launch, with the mr3-master and mr3-worker node groups]

In the preBootstrapCommands field of the node group mr3-master, replace your.address with the address for downloading a MySQL connector jar file. In our example, we use mysql-connector-java-8.0.12.jar. In the iam/attachPolicyARNs field of the node group mr3-worker, use the ARN (Amazon Resource Name) of the IAM policy created in the previous step.

nodeGroups:
  - name: mr3-master
    preBootstrapCommands:
      - "wget http://your.address/mysql-connector-java-8.0.12.jar"
      - "mkdir -p /home/ec2-user/lib"
      - "mv mysql-connector-java-8.0.12.jar /home/ec2-user/lib"
  - name: mr3-worker
    iam:
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::111111111111:policy/EKSAutoScalingWorkerPolicy
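
If the ARN of the IAM policy is not at hand, the following sketch shows one way to look it up with the AWS CLI (the query expression assumes the policy name used above):

$ aws iam list-policies \
    --query "Policies[?PolicyName=='EKSAutoScalingWorkerPolicy'].Arn" \
    --output text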

Creating the EKS cluster

Create an EKS cluster by executing the command eksctl. It can take 15 minutes or longer.

$ eksctl create cluster -f eks/cluster.yaml 
[ℹ]  eksctl version 0.10.2
[ℹ]  using region ap-northeast-2
[ℹ]  setting availability zones to [ap-northeast-2b ap-northeast-2c ap-northeast-2a]
...
[✔]  EKS cluster "hive-mr3" in "ap-northeast-2" region is ready

By default, the command eksctl uses all Availability Zones (AZs) from the region specified by the field metadata/region in eks/cluster.yaml. The user should be aware that Amazon charges for data transfer between different AZs ($0.01/GB to $0.02/GB) and that intermediate data exchanged by ContainerWorkers can cross the AZ boundary.
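
One way to reduce such cross-AZ traffic is to pin the mr3-worker node group to a single AZ in eks/cluster.yaml before creating the cluster. The following is only a sketch: it assumes the availabilityZones field of the eksctl node group schema, and the AZ shown is an example.

nodeGroups:
  - name: mr3-worker
    availabilityZones: ["ap-northeast-2a"]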

The user can find that two Auto Scaling groups are created.

[Figure: Auto Scaling groups for the mr3-master and mr3-worker node groups]

In our example, the mr3-master node group starts with a single master node whereas the mr3-worker node group starts with no node and can attach up to three nodes.
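
The two Auto Scaling groups can also be inspected with the AWS CLI, e.g., with the following sketch:

$ aws autoscaling describe-auto-scaling-groups \
    --query "AutoScalingGroups[].[AutoScalingGroupName,MinSize,MaxSize,DesiredCapacity]" \
    --output table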

The user can verify that only the master node is available in the EKS cluster.

$ kubectl get nodes
NAME                                              STATUS   ROLES    AGE   VERSION
ip-192-168-51-1.ap-northeast-2.compute.internal   Ready    <none>   78s   v1.14.7-eks-1861c5

The user can get the public IP address of the master node.

$ kubectl describe node ip-192-168-51-1.ap-northeast-2.compute.internal
...
Addresses:
  InternalIP:   192.168.51.1
  ExternalIP:   54.180.106.76
  Hostname:     ip-192-168-51-1.ap-northeast-2.compute.internal
  InternalDNS:  ip-192-168-51-1.ap-northeast-2.compute.internal
  ExternalDNS:  ec2-54-180-106-76.ap-northeast-2.compute.amazonaws.com
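
Alternatively, the external IP addresses of all nodes appear in the EXTERNAL-IP column of the following command:

$ kubectl get nodes -o wide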

Configuring Kubernetes Autoscaler (eks/cluster-autoscaler-autodiscover.yaml)

Open eks/cluster-autoscaler-autodiscover.yaml and change the configuration for autoscaling if necessary. By default, the Kubernetes Autoscaler removes nodes that stay idle for 1 minute (as specified by --scale-down-unneeded-time).

spec:
  template:
    spec:
      containers:
          command:
            - --scale-down-delay-after-add=5m
            - --scale-down-unneeded-time=1m

Start the Kubernetes Autoscaler.

$ kubectl apply -f eks/cluster-autoscaler-autodiscover.yaml 
serviceaccount/cluster-autoscaler created
clusterrole.rbac.authorization.k8s.io/cluster-autoscaler created
role.rbac.authorization.k8s.io/cluster-autoscaler created
clusterrolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
rolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
deployment.apps/cluster-autoscaler created

The user can check that the Kubernetes Autoscaler has started properly.

$ kubectl logs -f deployment/cluster-autoscaler -n kube-system
...
I0203 13:45:25.523246       1 utils.go:543] Skipping ip-192-168-51-1.ap-northeast-2.compute.internal - no node group config
I0203 13:45:25.523286       1 static_autoscaler.go:393] Scale down status: unneededOnly=true lastScaleUpTime=2020-02-03 13:45:05.494189339 +0000 UTC m=+2.412521091 lastScaleDownDeleteTime=2020-02-03 13:45:05.494189428 +0000 UTC m=+2.412521183 lastScaleDownFailTime=2020-02-03 13:45:05.494189513 +0000 UTC m=+2.412521267 scaleDownForbidden=false isDeleteInProgress=false

Creating and mounting EFS (efs/manifest.yaml)

The user can create EFS on the AWS Console. When creating EFS, choose the VPC of the EKS cluster. Make sure that a mount target is created for each Availability Zone.

Identify two security groups:

  1. the security group for the EC2 instances constituting the EKS cluster;
  2. the security group associated with the EFS mount targets.

To the second security group, add a rule that allows inbound NFS access from the first security group so that the EKS cluster can access EFS, as shown below.

[Figure: inbound rule allowing NFS access from the security group of the EKS cluster]

Here sg-096f3c3dff95ad6ae is the first security group.
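
The same inbound rule can also be added with the AWS CLI. The following is a sketch in which sg-0123456789abcdef0 is a hypothetical ID for the security group of the EFS mount targets; NFS uses TCP port 2049.

$ aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 2049 \
    --source-group sg-096f3c3dff95ad6ae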

Get the file system ID of EFS (e.g., fs-138ff772) and the address (e.g., fs-138ff772.efs.ap-northeast-2.amazonaws.com). Open efs/manifest.yaml and update the following fields.

$ vi efs/manifest.yaml

data:
  file.system.id: fs-138ff772
  aws.region: ap-northeast-2
spec:
  template:
    spec:
      volumes:
      - name: pv-volume
        nfs:
          server: fs-138ff772.efs.ap-northeast-2.amazonaws.com

Execute the script mount-efs.sh to create a PersistentVolume.

$ ./mount-efs.sh 
namespace/hivemr3 created
serviceaccount/efs-provisioner created
clusterrole.rbac.authorization.k8s.io/efs-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/run-efs-provisioner created
role.rbac.authorization.k8s.io/leader-locking-efs-provisioner created
rolebinding.rbac.authorization.k8s.io/leader-locking-efs-provisioner created
configmap/efs-provisioner created
deployment.extensions/efs-provisioner created
storageclass.storage.k8s.io/aws-efs created
persistentvolumeclaim/workdir-pvc created

The user can find a new StorageClass aws-efs.

$ kubectl get sc
NAME            PROVISIONER             AGE
aws-efs         example.com/aws-efs     11s
gp2 (default)   kubernetes.io/aws-ebs   21m

Access to the MySQL server for Metastore

Check if the MySQL server for Metastore is accessible from the master node. If the MySQL server is running on Amazon AWS, the user may have to update its security group or VPC configuration.
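
As a quick check, the user may run a temporary MySQL client Pod inside the EKS cluster. The following is a sketch only: it assumes the public mysql:8.0 image and the root user of the MySQL server at 15.165.19.52.

$ kubectl run mysql-client -it --rm --restart=Never --image=mysql:8.0 \
    -- mysql -h 15.165.19.52 -P 3306 -u root -p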

Configuring Metastore (env.sh)

Open env.sh and set the environment variable HIVE_DATABASE_HOST to the address of the MySQL server for Metastore. In our example, the MySQL server is running at the IP address 15.165.19.52. Set the environment variable HIVE_WAREHOUSE_DIR to the S3 bucket storing the warehouse. Note that the prefix should be s3a, not s3.

HIVE_DATABASE_HOST=15.165.19.52

HIVE_WAREHOUSE_DIR=s3a://hivemr3-warehouse-dir/warehouse

If the user sets the environment variable HIVE_DATABASE_HOST to the host name of the MySQL server, the user may have to update: 1) the spec/template/spec/hostAliases field in yaml/metastore.yaml and yaml/hive.yaml, and 2) the configuration key mr3.k8s.host.aliases in conf/mr3-site.xml.
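
For example, if the MySQL server is known only by a host name, the hostAliases field in yaml/metastore.yaml and yaml/hive.yaml might look like the following sketch, where mysql.server.hivemr3 is a hypothetical host name:

spec:
  template:
    spec:
      hostAliases:
      - ip: "15.165.19.52"
        hostnames:
        - "mysql.server.hivemr3"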

Checking configurations for MR3

For conf/core-site.xml and conf/mr3-site.xml, the user can use the default values for all configuration keys. Still, there are two configuration keys that the user may want to check and update.

  • mr3.k8s.pod.worker.hostpaths in conf/mr3-site.xml is set to /ephemeral1 because the instance types m5d.xlarge and m5d.2xlarge for worker nodes are equipped with a single local disk mounted on the directory /ephemeral1. If the user uses different instance types with multiple local disks, the preBootstrapCommands field of the node group mr3-worker should be expanded to mount all local disks and the configuration key mr3.k8s.pod.worker.hostpaths should include additional directories.

    nodeGroups:
      - name: mr3-worker
        preBootstrapCommands:
          - "mkfs -t ext4 /dev/nvme1n1"
          - "mkdir -p /ephemeral1"
          - "mount /dev/nvme1n1 /ephemeral1"
          - "chown ec2-user:ec2-user /ephemeral1"
    
  • Since the Kubernetes Autoscaler is configured to remove nodes that remain idle for 1 minute for fast scale-in, mr3.auto.scale.in.grace.period.secs in conf/mr3-site.xml is set to 90 seconds (60 seconds of idle time plus an extra 30 seconds to account for delays). If the user wants to increase the value of --scale-down-unneeded-time in eks/cluster-autoscaler-autodiscover.yaml, the configuration key mr3.auto.scale.in.grace.period.secs should be adjusted accordingly (see the sketch after this list).

    spec:
      template:
        spec:
          containers:
              command:
                - --scale-down-unneeded-time=1m

    spec:
      template:
        spec:
          containers:
              command:
                - --scale-down-unneeded-time=1m
    
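For instance, if --scale-down-unneeded-time is raised to 2 minutes, the grace period might be extended to 150 seconds. The following sketch uses the same XML property format as the other configuration files under conf; the value is only an illustration.

<property>
  <name>mr3.auto.scale.in.grace.period.secs</name>
  <value>150</value>
</property>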

Configuring Metastore (conf/hive-site.xml)

Open conf/hive-site.xml and update the configurations for Metastore as necessary. Below we list some of the configuration keys that the user should check. The two configuration keys javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword should match the user name and password of the MySQL server for Metastore.

<property>
  <name>hive.metastore.db.type</name>
  <value>MYSQL</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.cj.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>passwd</value>
</property>

<property>
  <name>hive.security.metastore.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
</property>

<property>
  <name>hive.security.metastore.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.DefaultHiveMetastoreAuthorizationProvider</value>
</property>

<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator</value>
</property>

By default, Metastore does not initialize the schema. In order to initialize the schema when starting Metastore, update yaml/metastore.yaml as follows:

spec:
  template:
    spec:
      containers:
        args: ["start", "--kubernetes", "--init-schema"]

Docker image

By default, we use the pre-built Docker image mr3project/hive3:1.0. If the user wants to use a different Docker image, check the environment variable DOCKER_HIVE_IMG in env.sh and the field spec/template/spec/containers/image in yaml/metastore.yaml and yaml/hive.yaml.

DOCKER_HIVE_IMG=${DOCKER_HIVE_IMG:-mr3project/hive3:1.0}
spec:
  template:
    spec:
      containers:
      - image: mr3project/hive3:1.0

Accessing S3 buckets

Find the IAM (Identity and Access Management) roles for the mr3-master and mr3-worker node groups (which typically look like eksctl-hive-mr3-nodegroup-mr3-mas-NodeInstanceRole-448MRIYIQ3F8 and eksctl-hive-mr3-nodegroup-mr3-wor-NodeInstanceRole-E19NHT8X0UJ7). For both IAM roles, add the following inline policy or its variant so that every Pod can access S3 buckets storing the warehouse and containing datasets. Adjust the Action field to restrict the set of operations permitted to Pods.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::hivemr3-warehouse-dir",
                "arn:aws:s3:::hivemr3-warehouse-dir/*",
                "arn:aws:s3:::hivemr3-partitioned-10-orc",
                "arn:aws:s3:::hivemr3-partitioned-10-orc/*"
            ]
        }
    ]
}
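
For reference, the inline policy can also be attached with the AWS CLI. The following is a sketch: the policy document above is assumed to be saved as s3-access-policy.json, hive-mr3-s3-access is an arbitrary policy name, and the role name is the example given above. Repeat the command for the IAM role of the mr3-worker node group.

$ aws iam put-role-policy \
    --role-name eksctl-hive-mr3-nodegroup-mr3-mas-NodeInstanceRole-448MRIYIQ3F8 \
    --policy-name hive-mr3-s3-access \
    --policy-document file://s3-access-policy.json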

Running Metastore and HiveServer2

The user can start Metastore by executing the script run-metastore.sh, and then start HiveServer2 by executing the script run-hive.sh.

$ ./run-metastore.sh
Error from server (AlreadyExists): namespaces "hivemr3" already exists
...
configmap/client-am-config created
statefulset.apps/hivemr3-metastore created
service/metastore created
$ ./run-hive.sh
Error from server (AlreadyExists): namespaces "hivemr3" already exists
assume that mount-efs.sh has been executed
...
replicationcontroller/hivemr3-hiveserver2 created
service/hiveserver2 created

HiveServer2 creates a DAGAppMaster Pod. Note, however, that no ContainerWorker Pods are created until queries are submitted.

$ kubectl get pods -n hivemr3
NAME                             READY   STATUS    RESTARTS   AGE
efs-provisioner-77bf56cc-nw799   1/1     Running   0          11m
hivemr3-hiveserver2-8fmt9        1/1     Running   0          20s
hivemr3-metastore-0              1/1     Running   0          51s
mr3master-3746-0                 1/1     Running   0          2s

The user can check the log of the DAGAppMaster Pod to make sure that it has started properly.

$ kubectl logs -f -n hivemr3 mr3master-3746-0
...
2020-02-03T14:05:23,045  INFO [DAGAppMaster-1-13] HeartbeatHandler$: Timeout check in HeartbeatHandler:Container
2020-02-03T14:05:23,123  INFO [K8sContainerLauncher-3-1] K8sContainerLauncher: Resynchronizing Pod states for appattempt_13746_0000_000000: 0

Configuring the LoadBalancer

Executing the script kubernetes/run-hive.sh creates a new LoadBalancer for HiveServer2. Get the security group associated with the LoadBalancer. If necessary, edit the inbound rule in order to restrict the source IP addresses, e.g., by changing the source from 0.0.0.0/0 to (IP address)/32.

[Figure: inbound rule of the security group associated with the LoadBalancer]

The LoadBalancer disconnects Beeline if it shows no activity for the idle timeout period, which is 60 seconds by default. The user may want to increase the idle timeout period, e.g., to 600 seconds; otherwise Beeline loses the connection to HiveServer2 even after a brief period of inactivity.
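
The idle timeout can also be changed with the AWS CLI. The following is a sketch for a Classic Load Balancer; the load balancer name shown is a hypothetical value, which typically corresponds to the first component of the LoadBalancer Ingress address obtained in the next section.

$ aws elb modify-load-balancer-attributes \
    --load-balancer-name a49f76720137a11ea8d310a2a674d2f9 \
    --load-balancer-attributes "{\"ConnectionSettings\":{\"IdleTimeout\":600}}"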

Running Beeline

To run Beeline, get the LoadBalancer Ingress of the Service hiveserver2.

$ kubectl describe service -n hivemr3 hiveserver2
Name:                     hiveserver2
Namespace:                hivemr3
Labels:                   <none>
Annotations:              <none>
Selector:                 hivemr3_app=hiveserver2
Type:                     LoadBalancer
IP:                       10.100.50.12
External IPs:             10.1.91.41
LoadBalancer Ingress:     a49f76720137a11ea8d310a2a674d2f9-1493289788.ap-northeast-2.elb.amazonaws.com
...

Get the IP address of the LoadBalancer Ingress.

$ nslookup a49f76720137a11ea8d310a2a674d2f9-1493289788.ap-northeast-2.elb.amazonaws.com
...
Non-authoritative answer:
Name: a49f76720137a11ea8d310a2a674d2f9-1493289788.ap-northeast-2.elb.amazonaws.com
Address: 52.78.104.197
Name: a49f76720137a11ea8d310a2a674d2f9-1493289788.ap-northeast-2.elb.amazonaws.com
Address: 13.124.95.129

In this example, the user can use 52.78.104.197 or 13.124.95.129 as the IP address of HiveServer2 when running Beeline. This is because Beeline connects first to the LoadBalancer, not directly to HiveServer2.

Here is an example of running Beeline using the address 52.78.104.197.

gla@gold7:~/mr3-run/hive$ ./run-beeline.sh --tpcds --hivesrc3
...
Connecting to jdbc:hive2://52.78.104.197:9852/;;
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.2 by Apache Hive
0: jdbc:hive2://52.78.104.197:9852/> 
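
For example, the user may run a query against a TPC-DS database. The database name tpcds_bin_partitioned_orc_10 below is hypothetical, and store_sales is a table from the TPC-DS benchmark.

0: jdbc:hive2://52.78.104.197:9852/> use tpcds_bin_partitioned_orc_10;
0: jdbc:hive2://52.78.104.197:9852/> select count(*) from store_sales;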

After running a few queries, new worker nodes are attached and ContainerWorker Pods are created. In our example, the EKS cluster ends up with three worker nodes, consisting of one m5d.2xlarge instance and two m5d.xlarge instances, all running in different AZs.

[Figure: worker nodes attached to the EKS cluster after autoscaling]

The last ContainerWorker Pod stays in the Pending state because the number of worker nodes has reached its maximum (three in our example) and no more worker nodes can be attached.

$ kubectl get pods -n hivemr3
NAME                               READY   STATUS    RESTARTS   AGE
efs-provisioner-76f55c5687-vspvj   1/1     Running   0          40m
hivemr3-hiveserver2-dsxh9          1/1     Running   0          22m
hivemr3-metastore-0                1/1     Running   0          22m
mr3master-6986-0                   1/1     Running   0          22m
mr3worker-1b92-1                   1/1     Running   0          21m
mr3worker-1b92-2                   1/1     Running   0          18m
mr3worker-1b92-3                   1/1     Running   0          6m13s
mr3worker-1b92-4                   1/1     Running   0          4m3s
mr3worker-1b92-5                   0/1     Pending   0          2m13s

Here is the progress of scale-out operations:

[Figure: progress of scale-out operations]

Here is the progress of scale-in operations:

[Figure: progress of scale-in operations]

Note that the EKS cluster does not remove all worker nodes because the configuration key mr3.auto.scale.in.min.hosts in mr3-site.xml is set to 1, which means that no scale-in operation is performed if the number of worker nodes is 1.

The user can check the progress of autoscaling from the log of the DAGAppMaster Pod.

$ kubectl logs -n hivemr3 mr3master-6986-0 -f | grep -e Scale -e Scaling -e average

Deleting the EKS cluster

Because of the additional components configured manually, it takes a few extra steps to delete the EKS cluster. In order to delete the EKS cluster (created with eksctl), proceed in the following order.

  1. Remove the new inbound rule (using NFS) in the security group for EFS.
  2. Delete EFS on the AWS console.
  3. Remove inline policies from the IAM roles for the mr3-master and mr3-worker node groups.
  4. Execute the command eksctl delete cluster -f eks/cluster.yaml.
$ eksctl delete cluster -f eks/cluster.yaml 

If the last command fails, the user should delete the EKS cluster manually. Proceed in the following order on the AWS console.

  1. Delete security groups manually.
  2. Delete the NAT gateway created for the EKS cluster, delete the VPC, and then delete the Elastic IP address.
  3. Delete the LoadBalancer.
  4. Delete IAM roles.
  5. Delete CloudFormations.