Running Timeline Server in a Kubernetes cluster enables DAGAppMaster to report the status of every DAG. The user can then check the progress of running DAGs and the history of completed DAGs with MR3-UI in a web browser:

[Figure: hive.k8s.timeline]

Building a Docker image for Timeline Server

To build a Docker image for Timeline Server, download the source code from the repository Timeline Server for MR3 on Kubernetes (using the master-mr3 branch). It is a variant of the Timeline Server included in the Hadoop 2.7.7 distribution. Then set the following environment variable in env.sh to specify the directory of the source code:

# Timeline-MR3 source directory
ATS_MR3_SRC=~/timeline-mr3

To compile Timeline Server, install protoc for Protobuf 2.5 and execute ats/compile-ats.sh with the following option:

<mvn option>              # Add a Maven option; may be repeated at the end.

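Before compiling, it can help to confirm that the protoc found on the PATH is the expected version. The following is a minimal sketch and not part of the MR3 scripts: check_protoc is a hypothetical helper, and it assumes that protoc --version prints a string of the form "libprotoc 2.5.0".

```shell
# Hypothetical helper (not part of MR3): succeed only if protoc reports 2.5.x.
check_protoc() {
  # Fail if protoc is missing or cannot report its version.
  version=$(protoc --version 2>/dev/null) || return 1
  case "$version" in
    "libprotoc 2.5."*) return 0 ;;
    *) echo "unexpected protoc version: $version" >&2; return 1 ;;
  esac
}
```

For example, `check_protoc && ats/compile-ats.sh -DskipTests` would start the compilation only when the version matches. Here -DskipTests is just one example of a Maven option that could be passed.
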
Then collect all necessary files in the directory kubernetes/ats by executing the script build-k8s-ats.sh.

$ ./build-k8s-ats.sh 
$ du -hs kubernetes/ats/
78M kubernetes/ats/
$ ls kubernetes/ats/
Dockerfile           hadoop/              timeline-service.sh  
$ ls kubernetes/ats/hadoop/apache-hadoop/
bin  etc  libexec  share

As when building a Docker image for running Hive on MR3, the user should set the following environment variable in kubernetes/env.sh:

DOCKER_ATS_IMG=10.1.91.17:5000/ats-2.7.7:latest
  • DOCKER_ATS_IMG specifies the full name of the Docker image (including a tag) for running Timeline Server, which may include the address of a running Docker server.
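
To illustrate how the parts of such an image name fit together, the sketch below splits the sample value with plain shell parameter expansion. It assumes that the name contains both a server address and a tag, as in the sample. The variable names registry, repository, and tag are chosen for illustration only.

```shell
DOCKER_ATS_IMG=10.1.91.17:5000/ats-2.7.7:latest

# Everything before the first "/" is the address of the Docker server.
registry=${DOCKER_ATS_IMG%%/*}
# Everything after the first "/" is the repository name followed by the tag.
name_and_tag=${DOCKER_ATS_IMG#*/}
repository=${name_and_tag%%:*}
tag=${name_and_tag##*:}

echo "registry=$registry repository=$repository tag=$tag"
# prints: registry=10.1.91.17:5000 repository=ats-2.7.7 tag=latest
```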

The last step is to build a Docker image from Dockerfile in the directory kubernetes/ats by executing kubernetes/build-ats.sh. The script builds a Docker image (which contains everything for running Timeline Server) and registers it with the Docker server specified in kubernetes/env.sh. If successful, the user can pull the Docker image on another node:

$ docker pull 10.1.91.17:5000/ats-2.7.7
Using default tag: latest
Trying to pull repository 10.1.91.17:5000/ats-2.7.7 ... 
latest: Pulling from 10.1.91.17:5000/ats-2.7.7
...
2f0e7389970d: Pull complete 
Digest: sha256:9d34e72b6841bb1ee3818c06a556d32799a5837cea31d41923384c6a2f4690a2
Status: Downloaded newer image for 10.1.91.17:5000/ats-2.7.7:latest

Configuring the Pod for Timeline Server

The following files specify how to configure Kubernetes objects for Timeline Server:

└── kubernetes
    ├── env.sh
    ├── ats-key
    └── yaml
        ├── ats-service.yaml
        ├── ats.yaml
        ├── namespace.yaml
        ├── workdir-pvc-ats.yaml
        └── workdir-pv-ats.yaml

We assume that Timeline Server belongs to the same namespace as HiveServer2, and reuse namespace.yaml. Timeline Server uses workdir-pvc-ats.yaml and workdir-pv-ats.yaml, which can be configured in the same way as workdir-pvc.yaml and workdir-pv.yaml.

The user should set the following environment variable in kubernetes/env.sh:

CREATE_ATS_SECRET=true
  • CREATE_ATS_SECRET specifies whether or not to create a Secret from keytab files in the directory kubernetes/ats-key. It should be set to true if Kerberos is used for authentication.
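
As a rough illustration of what creating a Secret from the keytab directory involves, the sketch below wraps the standard kubectl create secret generic command. It is not the code of the MR3 scripts. The function name create_ats_secret is hypothetical, and the Secret name hivemr3-ats-secret and the namespace hivemr3 are taken from the sample output shown later in this page.

```shell
# Hypothetical helper (not the MR3 implementation): create a Secret that
# contains every file in kubernetes/ats-key when CREATE_ATS_SECRET is true.
create_ats_secret() {
  if [ "${CREATE_ATS_SECRET:-false}" = "true" ]; then
    kubectl create secret generic hivemr3-ats-secret \
      --namespace hivemr3 \
      --from-file=kubernetes/ats-key
  else
    echo "CREATE_ATS_SECRET is not true; not creating a Secret from keytab files"
  fi
}
```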

ats-service.yaml

This file creates a Service for exposing Timeline Server to the outside of the Kubernetes cluster. The user should specify a public IP address with a valid host name and two port numbers for Timeline Server so that both clients from the outside and DAGAppMaster from the inside can connect to it using the host name.

  ports:
  - name: ats-http
    protocol: TCP
    port: 8188
    targetPort: 8188
  - name: ats-https
    protocol: TCP
    port: 8190
    targetPort: 8190
  externalIPs:
  - 10.1.91.41

The sample file in the MR3 release uses 10.1.91.41:8188 as the HTTP address and 10.1.91.41:8190 as the HTTPS address of Timeline Server. The user should make sure that the IP address exists with a valid host name and is not already taken.
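
One way to check that the IP address has a host name is to look it up with getent, which is available on most Linux systems. The helper resolve_host below is a hypothetical sketch. It assumes that the lookup output has the address in the first column and the host name in the second.

```shell
# Hypothetical helper: print the host name for an IP address,
# or return non-zero if the address does not resolve.
resolve_host() {
  name=$(getent hosts "$1" | awk '{ print $2; exit }')
  [ -n "$name" ] && echo "$name"
}
```

For the sample file, `resolve_host 10.1.91.41` should print the host name that clients and DAGAppMaster will use to reach Timeline Server.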

ats.yaml

This file creates a Pod for running Timeline Server. Most of its sections work with the default settings, except for the spec/containers section, which should be updated according to the Kubernetes cluster settings.

  • The image field should match the Docker image specified by DOCKER_ATS_IMG in kubernetes/env.sh.
  • The resources/requests and resources/limits fields specify the resources to be allocated to a Timeline Server Pod.
  • The ports/containerPort fields should match the port numbers specified in ats-service.yaml.
spec:
  containers:
  - image: 10.1.91.17:5000/ats-2.7.7
    resources:
      requests:
        cpu: 1
        memory: 4Gi
      limits:
        cpu: 1
        memory: 4Gi
    ports:
    - containerPort: 8188
      protocol: TCP
    - containerPort: 8190
      protocol: TCP
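
Since the port numbers appear in two files, a quick consistency check can catch a mismatch before the Pod is created. The sketch below is hypothetical and deliberately simple: it scans for the keys targetPort and containerPort with awk rather than parsing YAML properly, so it assumes that each key and its value sit on the same line, as in the sample files.

```shell
# Hypothetical helper: print the sorted, unique values of a YAML key.
# usage: yaml_ports KEY FILE
yaml_ports() {
  awk -v key="$1:" '{ for (i = 1; i < NF; i++) if ($i == key) print $(i + 1) }' "$2" | sort -un
}

# Succeed only if the Service's targetPort values equal the Pod's containerPort values.
# usage: ports_match SERVICE_YAML POD_YAML
ports_match() {
  [ "$(yaml_ports targetPort "$1")" = "$(yaml_ports containerPort "$2")" ]
}
```

For example, `ports_match kubernetes/yaml/ats-service.yaml kubernetes/yaml/ats.yaml` returns success when both files agree on the port numbers.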

Configuring Timeline Server

The following files specify how to configure Timeline Server:

└── kubernetes
    ├── env.sh
    └── ats-conf
        ├── core-site.xml
        ├── krb5.conf
        ├── log4j.properties
        ├── ssl-server.xml
        └── yarn-site.xml

The following code shows part of kubernetes/env.sh for specifying the heap size (in MB) for Timeline Server:

ATS_HEAPSIZE=2048
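
Because a JVM needs memory beyond its heap, the heap size should stay below the memory limit of the Pod. The arithmetic can be sketched as follows, assuming the 4Gi limit from the sample ats.yaml. The variable MEMORY_LIMIT_GI is introduced here for illustration and does not exist in env.sh.

```shell
ATS_HEAPSIZE=2048      # heap size in MB, as in kubernetes/env.sh
MEMORY_LIMIT_GI=4      # assumed memory limit in Gi, as in the sample ats.yaml

# Convert the limit to MB and compare it with the heap size.
limit_mb=$((MEMORY_LIMIT_GI * 1024))
if [ "$ATS_HEAPSIZE" -lt "$limit_mb" ]; then
  echo "heap ${ATS_HEAPSIZE}MB fits within limit ${limit_mb}MB"
else
  echo "heap ${ATS_HEAPSIZE}MB does not fit within limit ${limit_mb}MB"
fi
# prints: heap 2048MB fits within limit 4096MB
```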

The configuration files in the directory kubernetes/ats-conf should work without changes on a typical Kubernetes cluster. Below we describe the sections that are specific to each Kubernetes cluster.

yarn-site.xml

The user should update the following configuration keys if the port numbers specified in ats-service.yaml differ from the default values of 8188 and 8190:

  • yarn.timeline-service.webapp.address for the HTTP address
  • yarn.timeline-service.webapp.https.address for the HTTPS address

The default configuration in kubernetes/ats-conf/yarn-site.xml assumes that a service keytab file spnego.service.keytab for service principal HTTP/indigo20@RED is ready in the directory kubernetes/ats-key.

<property>
  <name>yarn.timeline-service.http-authentication.kerberos.principal</name>
  <value>HTTP/indigo20@RED</value>
</property>

<property>
  <name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
  <value>/opt/mr3-run/ats/key/spnego.service.keytab</value>
</property>

The user should update these two configuration keys in order to use a different service keytab file or a different service principal. In particular, the service principal should use the host name of the Service for Timeline Server (e.g., indigo20 in HTTP/indigo20@RED, which is the host name of the address listed under externalIPs in ats-service.yaml).
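
To make the structure of the service principal concrete, the sketch below splits the sample principal into its host name and realm with shell parameter expansion. The variable names are chosen for illustration only.

```shell
principal=HTTP/indigo20@RED

# Strip the service name "HTTP/" to leave host@REALM.
host_and_realm=${principal#*/}
# The host name is the part before "@", the realm the part after it.
host=${host_and_realm%%@*}
realm=${principal##*@}

echo "host=$host realm=$realm"
# prints: host=indigo20 realm=RED
```

The extracted host name should match the host name of the Service. With MIT Kerberos installed, `klist -kt kubernetes/ats-key/spnego.service.keytab` lists the principals actually stored in the keytab file.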

krb5.conf

This file should contain the Kerberos configuration. Usually it suffices to use a copy of kubernetes/conf/krb5.conf.

Running Timeline Server

Before running Timeline Server, the user should make sure that HiveServer2 is not running. In order to run Timeline Server, the user can execute the script kubernetes/run-ats.sh:

$ kubernetes/run-ats.sh 
...
ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
configmap/client-ats-config created
secret/hivemr3-ats-secret created
pod/hivemr3-ats created
service/timelineserver created

The user should record the value of the environment variable ATS_SECRET_KEY, which is generated automatically unless the user sets it explicitly before executing kubernetes/run-ats.sh. The string is later passed to DAGAppMaster so that Timeline Server and DAGAppMaster share a common secret. (Otherwise DAGAppMaster would not be able to communicate with Timeline Server.)
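
Setting the key explicitly avoids having to copy it from the script output. The helper below is a hypothetical sketch: ensure_ats_secret_key keeps an existing value and otherwise generates a random UUID. It assumes that any UUID is acceptable as a key, which matches the format of the sample but is not confirmed here.

```shell
# Hypothetical helper: keep ATS_SECRET_KEY if it is already set,
# otherwise generate a random UUID, then export it for child processes.
ensure_ats_secret_key() {
  if [ -z "${ATS_SECRET_KEY:-}" ]; then
    # Fall back to the Linux kernel's UUID source if uuidgen is unavailable.
    ATS_SECRET_KEY=$(uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid)
  fi
  export ATS_SECRET_KEY
}
```

Calling ensure_ats_secret_key before kubernetes/run-ats.sh, and keeping the same shell session for the later steps, makes the same key available to every script started from that session.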

Executing the script kubernetes/run-ats.sh starts a Timeline Server Pod in a moment:

$ kubectl get -n hivemr3 pods
NAME          READY   STATUS    RESTARTS   AGE
hivemr3-ats   1/1     Running   0          52s
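
Instead of polling by hand, the standard kubectl wait command can block until the Pod is ready. The wrapper below is a hypothetical sketch that uses the Pod name and namespace from the sample output. The default timeout of 120s is an arbitrary choice.

```shell
# Hypothetical helper: block until the Timeline Server Pod reports Ready.
# usage: wait_for_ats [TIMEOUT]   (default: 120s)
wait_for_ats() {
  kubectl wait --namespace hivemr3 --for=condition=Ready pod/hivemr3-ats --timeout="${1:-120s}"
}
```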

Running HiveServer2 after starting Timeline Server

Now that Timeline Server is running, the user should set the environment variable ATS_SECRET_KEY to the string reported in the previous step. Then executing the script kubernetes/run-hive.sh starts a HiveServer2 Pod, which in turn creates a DAGAppMaster Pod ready to communicate with Timeline Server.

$ export ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
$ kubernetes/run-hive.sh 
...
CLIENT_TO_AM_TOKEN_KEY=76a62e5a-ace3-44ee-9252-01baafbe8dd6
MR3_APPLICATION_ID_TIMESTAMP=17893
MR3_SHARED_SESSION_ID=0f52f3b5-563a-448d-9848-3f54c27f53b6
ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
configmap/client-am-config created
replicationcontroller/hivemr3-hiveserver2 created
service/hiveserver2 created
$ kubectl get -n hivemr3 pods
NAME                        READY   STATUS    RESTARTS   AGE
hivemr3-ats                 1/1     Running   0          2m30s
hivemr3-hiveserver2-x45dp   1/1     Running   0          99s
mr3master-7893-0            1/1     Running   0          88s
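
To confirm at a glance that all three Pods are up, the listing can be filtered for any Pod whose status is not Running. The helper pods_not_running is a hypothetical sketch. It assumes the column layout shown above, with the status in the third column.

```shell
# Hypothetical helper: print the names of Pods in the hivemr3 namespace
# whose STATUS column is anything other than Running.
pods_not_running() {
  kubectl get --namespace hivemr3 pods --no-headers | awk '$3 != "Running" { print $1 }'
}
```

An empty result means that the Timeline Server, HiveServer2, and DAGAppMaster Pods are all running.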