A Timeline Server running in a Kubernetes cluster enables DAGAppMaster to report the status of every DAG. The user can then check the progress of running DAGs and the history of completed DAGs by opening MR3-UI in a web browser.

Building a Docker image for Timeline Server (Optional)

The user can use a pre-built Docker image from DockerHub (mr3project/ats-2.7.7).

To build a Docker image for Timeline Server, download the source code from Timeline Server for MR3 on Kubernetes (using the master-mr3 branch). It is a variant of the Timeline Server included in the Hadoop 2.7.7 distribution. Then set the following environment variable in env.sh to specify the directory of the source code:

$ vi env.sh

# Timeline-MR3 source directory
ATS_MR3_SRC=~/timeline-mr3

To compile Timeline Server, install protoc for Protobuf 2.5 and execute ats/compile-ats.sh with the following option:

<mvn option>              # Add a Maven option; may be repeated at the end.

Then collect all necessary files in the directory kubernetes/ats by executing the script build-k8s-ats.sh.

$ ./build-k8s-ats.sh 
$ du -hs kubernetes/ats/
78M kubernetes/ats/
$ ls kubernetes/ats/
Dockerfile           hadoop/              timeline-service.sh  
$ ls kubernetes/ats/hadoop/apache-hadoop/
bin  etc  libexec  share

As with building a Docker image for running Hive on MR3, the user should set the following environment variable in kubernetes/env.sh:

DOCKER_ATS_IMG=10.1.91.17:5000/ats-2.7.7:latest
  • DOCKER_ATS_IMG specifies the full name of the Docker image (including a tag) for running Timeline Server. The name may include the address of a running Docker server.
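
As a rough illustration of how such an image name breaks down, the following shell sketch splits an image reference into server address, repository, and tag. The function name split_image is hypothetical and is not part of the MR3 scripts. It assumes the usual Docker convention that the first path component is a server address only if it contains '.' or ':' or equals localhost, and that a missing tag means latest.

```shell
#!/bin/sh
# Hypothetical helper (not part of MR3): split an image reference of the form
# [server/]repository[:tag] into its parts.
split_image() {
  img=$1
  last=${img##*/}                 # final path component, e.g. ats-2.7.7:latest
  case $last in
    *:*) tag=${last##*:}; name=${img%:*} ;;   # explicit tag
    *)   tag=latest;      name=$img ;;        # no tag given: Docker assumes latest
  esac
  first=${name%%/*}               # candidate server address
  registry=""
  repo=$name
  case $name in
    */*)
      case $first in
        # Treat the first component as a server only if it looks like a host
        *.*|*:*|localhost) registry=$first; repo=${name#*/} ;;
      esac
      ;;
  esac
  echo "registry=$registry repository=$repo tag=$tag"
}

# Falls back to the example value used in this document
split_image "${DOCKER_ATS_IMG:-10.1.91.17:5000/ats-2.7.7:latest}"
```

For the pre-built DockerHub image mr3project/ats-2.7.7, the same function reports an empty server address because mr3project is a DockerHub namespace rather than a host.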

The last step is to build a Docker image from the Dockerfile in the directory kubernetes/ats by executing kubernetes/build-ats.sh. The script builds a Docker image (which contains everything for running Timeline Server) and registers it with the Docker server specified in kubernetes/env.sh. If successful, the user can pull the Docker image on another node:

$ docker pull 10.1.91.17:5000/ats-2.7.7
Using default tag: latest
Trying to pull repository 10.1.91.17:5000/ats-2.7.7 ... 
latest: Pulling from 10.1.91.17:5000/ats-2.7.7
...
2f0e7389970d: Pull complete 
Digest: sha256:9d34e72b6841bb1ee3818c06a556d32799a5837cea31d41923384c6a2f4690a2
Status: Downloaded newer image for 10.1.91.17:5000/ats-2.7.7:latest

Configuring the Pod for Timeline Server

The following files specify how to configure Kubernetes objects for Timeline Server:

└── kubernetes
    ├── env.sh
    ├── ats-key
    └── yaml
        ├── ats-service.yaml
        ├── ats.yaml
        ├── namespace.yaml
        ├── workdir-pvc-ats.yaml
        └── workdir-pv-ats.yaml

We assume that Timeline Server belongs to the same namespace as HiveServer2 and reuse namespace.yaml. Timeline Server uses workdir-pvc-ats.yaml and workdir-pv-ats.yaml, which can be configured in the same way as workdir-pvc.yaml and workdir-pv.yaml.

The user should set the following environment variable in kubernetes/env.sh.

$ vi kubernetes/env.sh

CREATE_ATS_SECRET=true
  • CREATE_ATS_SECRET specifies whether or not to create a Secret from keytab files in the directory kubernetes/ats-key. It should be set to true if Kerberos is used for authentication.

ats-service.yaml

This file creates a Service for exposing Timeline Server to the outside of the Kubernetes cluster. The user should specify a public IP address with a valid host name and two port numbers for Timeline Server so that both clients from the outside and DAGAppMaster from the inside can connect to it using the host name.

$ vi kubernetes/yaml/ats-service.yaml

  ports:
  - name: ats-http
    protocol: TCP
    port: 8188
    targetPort: 8188
  - name: ats-https
    protocol: TCP
    port: 8190
    targetPort: 8190
  externalIPs:
  - 10.1.91.41

In our example, we use 10.1.91.41:8188 as the HTTP address and 10.1.91.41:8190 as the HTTPS address of Timeline Server. The user should make sure that the IP address exists with a valid host name and is not already taken.

ats.yaml

This file creates a Pod for running Timeline Server. Most of its sections work with default settings, except for the spec/containers section, which should be updated according to the Kubernetes cluster settings.

  • The image field should match the Docker image specified by DOCKER_ATS_IMG in kubernetes/env.sh.
  • The resources/requests and resources/limits specify the resources to be allocated to a Timeline Server Pod.
  • The ports/containerPort fields should match the port numbers specified in ats-service.yaml.

$ vi kubernetes/yaml/ats.yaml

spec:
  containers:
  - image: 10.1.91.17:5000/ats-2.7.7
    resources:
      requests:
        cpu: 1
        memory: 4Gi
      limits:
        cpu: 1
        memory: 4Gi
    ports:
    - containerPort: 8188
      protocol: TCP
    - containerPort: 8190
      protocol: TCP
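
Because the containerPort values in ats.yaml must agree with the ports in ats-service.yaml, a quick consistency check can catch a mismatch before the Pod is created. The sketch below is a minimal example under stated assumptions: the functions ports_in and check_ports are hypothetical names, and the check uses plain text matching on targetPort and containerPort lines rather than a real YAML parser, so it only works for manifests laid out like the examples above.

```shell
#!/bin/sh
# Hypothetical check (not part of MR3): compare targetPort values in the
# Service manifest with containerPort values in the Pod manifest.

ports_in() {
  # $1 = key to look for (targetPort or containerPort), $2 = manifest file
  # Prints the matching port numbers, sorted and joined by commas.
  sed -n "s/^[[:space:]-]*$1:[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$2" |
    sort -n | paste -sd, -
}

check_ports() {
  # $1 = Service manifest, $2 = Pod manifest
  svc=$(ports_in targetPort "$1")
  pod=$(ports_in containerPort "$2")
  if [ "$svc" = "$pod" ]; then
    echo "OK: ports match ($svc)"
  else
    echo "MISMATCH: service=[$svc] pod=[$pod]"
    return 1
  fi
}

# Run the check only when the manifests are present in the current directory tree
if [ -f kubernetes/yaml/ats-service.yaml ] && [ -f kubernetes/yaml/ats.yaml ]; then
  check_ports kubernetes/yaml/ats-service.yaml kubernetes/yaml/ats.yaml
fi
```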

Configuring Timeline Server

The following files specify how to configure Timeline Server:

└── kubernetes
    ├── env.sh
    └── ats-conf
        ├── core-site.xml
        ├── krb5.conf
        ├── log4j.properties
        ├── ssl-server.xml
        └── yarn-site.xml

The configuration files in the directory kubernetes/ats-conf should work okay on a typical Kubernetes cluster. Below we describe those sections that are specific to each Kubernetes cluster.

yarn-site.xml

The user should update the following configuration keys if the port numbers specified in ats-service.yaml differ from the default values of 8188 and 8190:

  • yarn.timeline-service.webapp.address for the HTTP address
  • yarn.timeline-service.webapp.https.address for the HTTPS address

To use Kerberos-based authentication, set the configuration key yarn.timeline-service.http-authentication.type to kerberos and use a Kerberos keytab file as shown below. The service principal should use the host name of the Service for Timeline Server (e.g., indigo20 in HTTP/indigo20@RED, where indigo20 is the host name of the address listed under externalIPs in ats-service.yaml).

$ vi kubernetes/ats-conf/yarn-site.xml

<property>
  <name>yarn.timeline-service.http-authentication.type</name>
  <value>kerberos</value>
</property>

<property>
  <name>yarn.timeline-service.http-authentication.kerberos.principal</name>
  <value>HTTP/indigo20@RED</value>
</property>

<property>
  <name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
  <value>/opt/mr3-run/ats/key/spnego.service.keytab</value>
</property>

If Kerberos-based authentication is not used, set the configuration key yarn.timeline-service.http-authentication.type to simple.
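
The two authentication modes can be sanity-checked with a small script before starting Timeline Server. The following is only a sketch: get_prop and check_ats_auth are hypothetical helper names, and the parsing assumes that each <name> element is followed by its <value> element on a later line, as in the listing above. It is not a general XML parser.

```shell
#!/bin/sh
# Hypothetical helpers (not part of MR3) for inspecting yarn-site.xml.

get_prop() {
  # $1 = property name, $2 = Hadoop-style XML file
  # Prints the <value> that follows the matching <name> line.
  awk -v key="$1" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, ""); print; exit
    }
  ' "$2"
}

check_ats_auth() {
  # $1 = path to yarn-site.xml
  prefix=yarn.timeline-service.http-authentication
  auth_type=$(get_prop "$prefix.type" "$1")
  case $auth_type in
    kerberos)
      principal=$(get_prop "$prefix.kerberos.principal" "$1")
      keytab=$(get_prop "$prefix.kerberos.keytab" "$1")
      if [ -z "$principal" ] || [ -z "$keytab" ]; then
        echo "kerberos: principal or keytab missing"
        return 1
      fi
      # HTTP/indigo20@RED -> indigo20
      host=${principal#*/}
      host=${host%@*}
      echo "kerberos: service host=$host keytab=$keytab"
      ;;
    simple)
      echo "simple: no Kerberos settings needed"
      ;;
    *)
      echo "unexpected authentication type: '$auth_type'"
      return 1
      ;;
  esac
}

# Run the check only when the configuration file is present
if [ -f kubernetes/ats-conf/yarn-site.xml ]; then
  check_ats_auth kubernetes/ats-conf/yarn-site.xml
fi
```

The reported service host should be the host name that resolves to the address listed under externalIPs in ats-service.yaml.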

krb5.conf

If Kerberos-based authentication is used, this file should contain the Kerberos configuration. Usually it suffices to use a copy of kubernetes/conf/krb5.conf.

Running Timeline Server

Before running Timeline Server, the user should make sure that HiveServer2 is not running. To run Timeline Server, execute the script kubernetes/run-ats.sh:

$ kubernetes/run-ats.sh 
...
ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
configmap/client-ats-config created
secret/hivemr3-ats-secret created
pod/hivemr3-ats created
service/timelineserver created

The user should record the value of the environment variable ATS_SECRET_KEY. It is generated automatically unless the user sets it explicitly before executing kubernetes/run-ats.sh. The string is later passed to DAGAppMaster so that Timeline Server and DAGAppMaster share a common secret. (Otherwise DAGAppMaster would not be able to communicate with Timeline Server.)
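
The "use the existing value, otherwise generate one" behavior can be pictured with the sketch below. It is an illustration under assumptions, not the code of kubernetes/run-ats.sh: the function names new_uuid and ensure_ats_secret_key are hypothetical, and the use of uuidgen (or the Linux kernel's random UUID file as a fallback) is only one plausible way to produce a string of the form shown above.

```shell
#!/bin/sh
# Sketch only: keep a caller-supplied ATS_SECRET_KEY, otherwise generate a
# random UUID. The real run-ats.sh may generate the key differently.

new_uuid() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen | tr 'A-Z' 'a-z'
  elif [ -r /proc/sys/kernel/random/uuid ]; then
    cat /proc/sys/kernel/random/uuid        # Linux-only fallback
  else
    echo "no UUID source available" >&2
    return 1
  fi
}

ensure_ats_secret_key() {
  # Assign a fresh UUID only if the variable is unset or empty
  if [ -z "${ATS_SECRET_KEY:-}" ]; then
    ATS_SECRET_KEY=$(new_uuid) || return 1
  fi
  export ATS_SECRET_KEY
  echo "ATS_SECRET_KEY=$ATS_SECRET_KEY"
}

ensure_ats_secret_key
```

Setting ATS_SECRET_KEY explicitly before the call leaves the value untouched, which is how the same secret can be reused when HiveServer2 is started later.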

Executing the script kubernetes/run-ats.sh starts a Timeline Server Pod within a few moments:

$ kubectl get -n hivemr3 pods
NAME          READY   STATUS    RESTARTS   AGE
hivemr3-ats   1/1     Running   0          52s

Running HiveServer2 after starting Timeline Server

Now that Timeline Server is running, the user should set the environment variable ATS_SECRET_KEY to the string reported in the previous step. Then executing the script kubernetes/run-hive.sh starts a HiveServer2 Pod, which in turn creates a DAGAppMaster Pod ready to communicate with Timeline Server.

$ export ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
$ kubernetes/run-hive.sh 
...
CLIENT_TO_AM_TOKEN_KEY=76a62e5a-ace3-44ee-9252-01baafbe8dd6
MR3_APPLICATION_ID_TIMESTAMP=17893
MR3_SHARED_SESSION_ID=0f52f3b5-563a-448d-9848-3f54c27f53b6
ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
configmap/client-am-config created
deployment/hivemr3-hiveserver2 created
service/hiveserver2 created
$ kubectl get -n hivemr3 pods
NAME                        READY   STATUS    RESTARTS   AGE
hivemr3-ats                 1/1     Running   0          2m30s
hivemr3-hiveserver2-x45dp   1/1     Running   0          99s
mr3master-7893-0            1/1     Running   0          88s
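
To confirm from a script that every Pod has reached the Running state, the table printed by kubectl get pods can be filtered. The sketch below is a minimal example: all_running is a hypothetical function name, and it assumes the default column layout shown above, with STATUS in the third column. It reads the table from standard input so that it can be tried without a cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of MR3): succeed only if the table on stdin
# lists at least one Pod and every Pod has STATUS Running.
all_running() {
  awk '
    NR > 1 { n++; if ($3 != "Running") bad++ }   # skip the header line
    END    { exit (n > 0 && bad == 0) ? 0 : 1 }
  '
}

# Demonstration on the sample output from this document
all_running <<'EOF' && echo "all Pods are Running"
NAME                        READY   STATUS    RESTARTS   AGE
hivemr3-ats                 1/1     Running   0          2m30s
hivemr3-hiveserver2-x45dp   1/1     Running   0          99s
mr3master-7893-0            1/1     Running   0          88s
EOF
```

On a live cluster the same function can be fed directly, for example with kubectl get -n hivemr3 pods | all_running.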