A Timeline Server in a Kubernetes cluster enables DAGAppMaster to report the status of every DAG. The user can then check the progress of running DAGs and the history of completed DAGs by running MR3-UI in a web browser.
Building a Docker image for Timeline Server (Optional)
The user can use a pre-built Docker image from DockerHub (mr3project/ats-2.7.7).
To build a Docker image for Timeline Server, download its source code from Timeline Server for MR3 on Kubernetes (using the master-mr3 branch), which is a variant of the Timeline Server included in the Hadoop 2.7.7 distribution.
Then set the following environment variable in env.sh to specify the directory of the source code:
$ vi env.sh
# Timeline-MR3 source directory
ATS_MR3_SRC=~/timeline-mr3
To compile Timeline Server, install protoc for Protobuf 2.5 and execute ats/compile-ats.sh with the following option:
<mvn option> # Add a Maven option; may be repeated at the end.
Then collect all necessary files in the directory kubernetes/ats by executing the script build-k8s-ats.sh.
$ ./build-k8s-ats.sh
$ du -hs kubernetes/ats/
78M kubernetes/ats/
$ ls kubernetes/ats/
Dockerfile hadoop/ timeline-service.sh
$ ls kubernetes/ats/hadoop/apache-hadoop/
bin etc libexec share
Similarly to building a Docker image for running Hive on MR3, the user should set the following environment variable in kubernetes/env.sh:
DOCKER_ATS_IMG=10.1.91.17:5000/ats-2.7.7:latest
DOCKER_ATS_IMG specifies the full name of the Docker image (including a tag) for running Timeline Server, which may include the address of a running Docker server.
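To illustrate how the parts of DOCKER_ATS_IMG fit together, the following sketch splits the value into the Docker server address, the image name, and the tag using POSIX parameter expansion. The function name split_docker_img is hypothetical and not part of the MR3 scripts. The sketch assumes that the value always starts with a server address and that a missing tag means latest, which is the default Docker applies.

```shell
#!/bin/sh
# Hypothetical helper (not part of the MR3 scripts): split a full image name
# of the form <server>/<name>[:<tag>] into its parts.
split_docker_img() {
  img=$1
  registry=${img%%/*}   # text before the first '/', e.g. 10.1.91.17:5000
  rest=${img#*/}        # text after the first '/', e.g. ats-2.7.7:latest
  case $rest in
    *:*) name=${rest%:*}; tag=${rest##*:} ;;
    *)   name=$rest; tag=latest ;;   # Docker assumes 'latest' when no tag is given
  esac
  echo "registry=$registry name=$name tag=$tag"
}

split_docker_img "10.1.91.17:5000/ats-2.7.7:latest"
# prints: registry=10.1.91.17:5000 name=ats-2.7.7 tag=latest
```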
The last step is to build a Docker image from Dockerfile in the directory kubernetes/ats by executing kubernetes/build-ats.sh.
The script builds a Docker image (which contains everything for running Timeline Server) and registers it to the Docker server specified in kubernetes/env.sh.
If successful, the user can pull the Docker image on another node:
$ docker pull 10.1.91.17:5000/ats-2.7.7
Using default tag: latest
Trying to pull repository 10.1.91.17:5000/ats-2.7.7 ...
latest: Pulling from 10.1.91.17:5000/ats-2.7.7
...
2f0e7389970d: Pull complete
Digest: sha256:9d34e72b6841bb1ee3818c06a556d32799a5837cea31d41923384c6a2f4690a2
Status: Downloaded newer image for 10.1.91.17:5000/ats-2.7.7:latest
Configuring the Pod for Timeline Server
The following files specify how to configure Kubernetes objects for Timeline Server:
└── kubernetes
    ├── env.sh
    ├── ats-key
    └── yaml
        ├── ats-service.yaml
        ├── ats.yaml
        ├── namespace.yaml
        ├── workdir-pvc-ats.yaml
        └── workdir-pv-ats.yaml
We assume that Timeline Server belongs to the same namespace as HiveServer2, and reuse namespace.yaml.
Timeline Server uses workdir-pvc-ats.yaml and workdir-pv-ats.yaml, which can be configured similarly to workdir-pvc.yaml and workdir-pv.yaml.
The user should set the following environment variable in kubernetes/env.sh.
$ vi kubernetes/env.sh
CREATE_ATS_SECRET=true
CREATE_ATS_SECRET specifies whether or not to create a Secret from keytab files in the directory kubernetes/ats-key. It should be set to true if Kerberos is used for authentication.
ats-service.yaml
This file creates a Service for exposing Timeline Server to the outside of the Kubernetes cluster. The user should specify a public IP address with a valid host name and two port numbers for Timeline Server so that both clients from the outside and DAGAppMaster from the inside can connect to it using the host name.
$ vi kubernetes/yaml/ats-service.yaml
  ports:
  - name: ats-http
    protocol: TCP
    port: 8188
    targetPort: 8188
  - name: ats-https
    protocol: TCP
    port: 8190
    targetPort: 8190
  externalIPs:
  - 10.1.91.41
In our example, we use 10.1.91.41:8188 as the HTTP address and 10.1.91.41:8190 as the HTTPS address of Timeline Server. The user should make sure that the IP address exists with a valid host name and is not already taken.
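As a sketch of how these addresses are used by clients, the helper below builds the base URL of the Timeline Server REST API from a scheme, a host, and a port. The function name ats_url is hypothetical, and the path /ws/v1/timeline is an assumption based on the REST endpoint of the Timeline Server in Hadoop 2.7; the commented curl call only makes sense once Timeline Server is running and, as written, assumes simple authentication.

```shell
#!/bin/sh
# Hypothetical helper: build the base URL of the Timeline Server REST API.
# Assumes the standard Hadoop Timeline Server endpoint /ws/v1/timeline.
ats_url() {   # $1 = http|https, $2 = host name or IP address, $3 = port
  echo "$1://$2:$3/ws/v1/timeline"
}

ats_url http 10.1.91.41 8188    # prints: http://10.1.91.41:8188/ws/v1/timeline
ats_url https 10.1.91.41 8190   # prints: https://10.1.91.41:8190/ws/v1/timeline

# Once Timeline Server is running, the HTTP address could be probed with:
#   curl -s "$(ats_url http 10.1.91.41 8188)"
```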
ats.yaml
This file creates a Pod for running Timeline Server.
Most of the sections in it work okay with default settings, except for the spec/containers section, which should be updated according to Kubernetes cluster settings.
- The image field should match the Docker image specified by DOCKER_ATS_IMG in kubernetes/env.sh.
- The resources/requests and resources/limits fields specify the resources to be allocated to a Timeline Server Pod.
- The ports/containerPort fields should match the port numbers specified in ats-service.yaml.
$ vi kubernetes/yaml/ats.yaml
spec:
  containers:
  - image: 10.1.91.17:5000/ats-2.7.7
    resources:
      requests:
        cpu: 1
        memory: 4Gi
      limits:
        cpu: 1
        memory: 4Gi
    ports:
    - containerPort: 8188
      protocol: TCP
    - containerPort: 8190
      protocol: TCP
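The requirement that the ports/containerPort fields in ats.yaml match the port numbers in ats-service.yaml can be checked mechanically. The following is a minimal sketch and not part of the MR3 distribution: yaml_ports and check_ats_ports are hypothetical helper names, and the sed-based extraction assumes that every targetPort and containerPort entry sits on its own line with a numeric value, as in the examples in this guide.

```shell
#!/bin/sh
# Hypothetical consistency check (not part of the MR3 scripts).

# Print the numeric values of a YAML key, one per line, sorted.
yaml_ports() {   # $1 = key name, $2 = YAML file
  sed -n "s/^[[:space:]-]*$1:[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$2" | sort -n
}

# Succeed only if both files list the same, non-empty set of ports.
check_ats_ports() {   # $1 = ats-service.yaml, $2 = ats.yaml
  svc_ports=$(yaml_ports targetPort "$1")
  pod_ports=$(yaml_ports containerPort "$2")
  if [ -n "$svc_ports" ] && [ "$svc_ports" = "$pod_ports" ]; then
    echo "ports match"
  else
    echo "ports differ"
    return 1
  fi
}

# Usage (paths as in this guide):
#   check_ats_ports kubernetes/yaml/ats-service.yaml kubernetes/yaml/ats.yaml
```

The check compares targetPort rather than port because targetPort is the value that a Service forwards to the container.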
Configuring Timeline Server
The following files specify how to configure Timeline Server:
└── kubernetes
    ├── env.sh
    └── ats-conf
        ├── core-site.xml
        ├── krb5.conf
        ├── log4j.properties
        ├── ssl-server.xml
        └── yarn-site.xml
The configuration files in the directory kubernetes/ats-conf should work okay on a typical Kubernetes cluster.
Below we describe those sections that are specific to each Kubernetes cluster.
yarn-site.xml
The user should update the following configurations if the port numbers specified in ats-service.yaml are different from the default values of 8188 and 8190:
- yarn.timeline-service.webapp.address for the HTTP address
- yarn.timeline-service.webapp.https.address for the HTTPS address
To use Kerberos-based authentication, set the configuration key yarn.timeline-service.http-authentication.type to kerberos and use a Kerberos keytab file as shown below.
The service principal should use the host name for the Service for Timeline Server (e.g., indigo20 in HTTP/indigo20@RED, which corresponds to externalIPs in ats-service.yaml).
$ vi kubernetes/ats-conf/yarn-site.xml
<property>
  <name>yarn.timeline-service.http-authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>yarn.timeline-service.http-authentication.kerberos.principal</name>
  <value>HTTP/indigo20@RED</value>
</property>
<property>
  <name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
  <value>/opt/mr3-run/ats/key/spnego.service.keytab</value>
</property>
If Kerberos-based authentication is not used, set the configuration key yarn.timeline-service.http-authentication.type to simple.
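For the Kerberos case, the host component of the service principal is the part that has to agree with the host name of the Service. As a small sketch, the hypothetical helper principal_host below (not part of the MR3 scripts) extracts that component from a principal of the form service/host@REALM, so that it can be compared with the host name assigned to the address in externalIPs.

```shell
#!/bin/sh
# Hypothetical helper: print the host component of a Kerberos service
# principal of the form <service>/<host>@<REALM>.
principal_host() {
  p=${1#*/}        # drop the service name and '/', e.g. "HTTP/"
  echo "${p%@*}"   # drop '@' and the realm, e.g. "@RED"
}

principal_host "HTTP/indigo20@RED"   # prints: indigo20
```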
krb5.conf
If Kerberos-based authentication is used, this file should contain the information for Kerberos configuration.
Usually it suffices to use a copy of kubernetes/conf/krb5.conf.
Running Timeline Server
Before running Timeline Server, the user should make sure that HiveServer2 is not running.
In order to run Timeline Server, the user can execute the script kubernetes/run-ats.sh:
$ kubernetes/run-ats.sh
...
ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
configmap/client-ats-config created
secret/hivemr3-ats-secret created
pod/hivemr3-ats created
service/timelineserver created
The user should keep the string set in the environment variable ATS_SECRET_KEY, which is automatically generated unless the user explicitly sets it before executing kubernetes/run-ats.sh.
The string is later passed to DAGAppMaster so that both Timeline Server and DAGAppMaster can share a common secret.
(Otherwise DAGAppMaster would not be able to communicate with Timeline Server.)
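One way to avoid copying the key by hand is to capture it from the output of the script. The sketch below assumes that kubernetes/run-ats.sh prints a line of the form ATS_SECRET_KEY=<value>, as in the example output above; the function name extract_ats_secret_key and the log file name run-ats.log are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read the output of kubernetes/run-ats.sh on stdin and
# print the value from the first ATS_SECRET_KEY=... line.
extract_ats_secret_key() {
  sed -n 's/^ATS_SECRET_KEY=//p' | head -n 1
}

# Usage, assuming the script prints the key as in the example above:
#   kubernetes/run-ats.sh | tee run-ats.log
#   export ATS_SECRET_KEY=$(extract_ats_secret_key < run-ats.log)
```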
Executing the script kubernetes/run-ats.sh starts a Timeline Server Pod in a moment:
$ kubectl get -n hivemr3 pods
NAME READY STATUS RESTARTS AGE
hivemr3-ats 1/1 Running 0 52s
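Instead of polling kubectl get by hand, a script can block until the Pod reports the Ready condition. The following sketch wraps kubectl wait in a hypothetical function wait_for_ats; the Pod name hivemr3-ats and the namespace hivemr3 are taken from the output above, while the default timeout of 120 seconds is an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper: block until the Timeline Server Pod is Ready.
# $1 = namespace (default: hivemr3), $2 = timeout (default: 120s, an assumption)
wait_for_ats() {
  kubectl wait --namespace "${1:-hivemr3}" \
    --for=condition=Ready pod/hivemr3-ats \
    --timeout="${2:-120s}"
}

# Usage:
#   kubernetes/run-ats.sh && wait_for_ats
```

The function returns a non-zero status if the Pod does not become Ready within the timeout, so it can guard the later step of starting HiveServer2.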
Running HiveServer2 after starting Timeline Server
Now that Timeline Server is running, the user should set the environment variable ATS_SECRET_KEY to the string reported in the previous step.
Then executing the script kubernetes/run-hive.sh starts a HiveServer2 Pod, which in turn creates a DAGAppMaster Pod ready to communicate with Timeline Server.
$ export ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
$ kubernetes/run-hive.sh
...
CLIENT_TO_AM_TOKEN_KEY=76a62e5a-ace3-44ee-9252-01baafbe8dd6
MR3_APPLICATION_ID_TIMESTAMP=17893
MR3_SHARED_SESSION_ID=0f52f3b5-563a-448d-9848-3f54c27f53b6
ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
configmap/client-am-config created
deployment/hivemr3-hiveserver2 created
service/hiveserver2 created
$ kubectl get -n hivemr3 pods
NAME READY STATUS RESTARTS AGE
hivemr3-ats 1/1 Running 0 2m30s
hivemr3-hiveserver2-x45dp 1/1 Running 0 99s
mr3master-7893-0 1/1 Running 0 88s