A Timeline Server in a Kubernetes cluster enables DAGAppMaster to report the status of every DAG. The user can then check the progress of running DAGs and the history of completed DAGs with MR3-UI in a web browser.
Building a Docker image for Timeline Server
To build a Docker image for Timeline Server, download the source code of Timeline Server from Timeline Server for MR3 on Kubernetes (using the master-mr3 branch), which is a variant of Timeline Server included in the Hadoop 2.7.7 distribution.
Then set the following environment variable in env.sh
to specify the directory of the source code:
# Timeline-MR3 source directory
ATS_MR3_SRC=~/timeline-mr3
To compile Timeline Server,
install protoc
for Protobuf 2.5 and execute ats/compile-ats.sh
with the following option:
<mvn option> # Add a Maven option; may be repeated at the end.
Then collect all necessary files in the directory kubernetes/ats by executing the script build-k8s-ats.sh:
$ ./build-k8s-ats.sh
$ du -hs kubernetes/ats/
78M kubernetes/ats/
$ ls kubernetes/ats/
Dockerfile hadoop/ timeline-service.sh
$ ls kubernetes/ats/hadoop/apache-hadoop/
bin etc libexec share
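Before moving on to the Docker build, it can help to confirm that the directory has the layout shown above. The following is a minimal sketch; check_ats_build_dir is a hypothetical helper that is not part of the MR3 release, and it checks only the entries visible in the listing above.

```shell
# Hypothetical helper (not part of MR3): verify that a directory contains the
# entries listed above after running build-k8s-ats.sh.
check_ats_build_dir() {
  cab_dir="$1"
  cab_missing=0
  for cab_file in Dockerfile timeline-service.sh; do
    if [ ! -f "$cab_dir/$cab_file" ]; then
      echo "missing file: $cab_dir/$cab_file" >&2
      cab_missing=1
    fi
  done
  if [ ! -d "$cab_dir/hadoop/apache-hadoop" ]; then
    echo "missing directory: $cab_dir/hadoop/apache-hadoop" >&2
    cab_missing=1
  fi
  # 0 if everything was found, 1 otherwise
  return "$cab_missing"
}

# Usage, from the directory that contains kubernetes/:
#   check_ats_build_dir kubernetes/ats && echo "kubernetes/ats looks complete"
```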
Similarly to building a Docker image for running Hive on MR3, the user should set the following environment variable in kubernetes/env.sh:
DOCKER_ATS_IMG=10.1.91.17:5000/ats-2.7.7:latest
DOCKER_ATS_IMG specifies the full name of the Docker image (including a tag) for running Timeline Server, which may include the address of a running Docker server.
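To make the structure of the value of DOCKER_ATS_IMG concrete, the sketch below splits an image name into the server address, the image name, and the tag using plain shell parameter expansion. split_image_name is a hypothetical helper; it treats everything before the last slash as the server address and falls back to the tag latest when none is given, which matches the "Using default tag: latest" behavior of docker pull.

```shell
# Hypothetical helper: split "server/name:tag" into its three parts.
split_image_name() {
  sin_img="$1"
  # The part after the last '/' holds "name[:tag]".
  sin_last="${sin_img##*/}"
  case "$sin_last" in
    *:*) sin_tag="${sin_last##*:}"; sin_name="${sin_last%:*}" ;;
    *)   sin_tag="latest";          sin_name="$sin_last" ;;
  esac
  # Everything before the last '/' is treated as the server address.
  case "$sin_img" in
    */*) sin_server="${sin_img%/*}" ;;
    *)   sin_server="" ;;
  esac
  printf '%s %s %s\n' "$sin_server" "$sin_name" "$sin_tag"
}

split_image_name "10.1.91.17:5000/ats-2.7.7:latest"
# prints: 10.1.91.17:5000 ats-2.7.7 latest
```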
The last step is to build a Docker image from Dockerfile in the directory kubernetes/ats by executing kubernetes/build-ats.sh. The script builds a Docker image (which contains everything for running Timeline Server) and registers it with the Docker server specified in kubernetes/env.sh.
If successful, the user can pull the Docker image on another node:
$ docker pull 10.1.91.17:5000/ats-2.7.7
Using default tag: latest
Trying to pull repository 10.1.91.17:5000/ats-2.7.7 ...
latest: Pulling from 10.1.91.17:5000/ats-2.7.7
...
2f0e7389970d: Pull complete
Digest: sha256:9d34e72b6841bb1ee3818c06a556d32799a5837cea31d41923384c6a2f4690a2
Status: Downloaded newer image for 10.1.91.17:5000/ats-2.7.7:latest
Configuring the Pod for Timeline Server
The following files specify how to configure Kubernetes objects for Timeline Server:
└── kubernetes
    ├── env.sh
    ├── ats-key
    └── yaml
        ├── ats-service.yaml
        ├── ats.yaml
        ├── namespace.yaml
        ├── workdir-pvc-ats.yaml
        └── workdir-pv-ats.yaml
We assume that Timeline Server belongs to the same namespace as HiveServer2, and reuse namespace.yaml. Timeline Server uses workdir-pvc-ats.yaml and workdir-pv-ats.yaml, which can be configured similarly to workdir-pvc.yaml and workdir-pv.yaml.
The user should set the following environment variable in kubernetes/env.sh:
CREATE_ATS_SECRET=true
CREATE_ATS_SECRET specifies whether or not to create a Secret from keytab files in the directory kubernetes/ats-key. It should be set to true if Kerberos is used for authentication.
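A setting of CREATE_ATS_SECRET=true is only useful if the keytab directory actually contains keytab files. The following sketch checks this before Timeline Server is started. check_ats_keytabs is a hypothetical helper, and it assumes that keytab files use the suffix .keytab (as spnego.service.keytab does); adjust the pattern if your files are named differently.

```shell
# Hypothetical helper: when the first argument is "true", require at least one
# *.keytab file in the directory given as the second argument.
check_ats_keytabs() {
  cak_create="$1"
  cak_dir="$2"
  # Nothing to check if no Secret is going to be created.
  [ "$cak_create" = "true" ] || return 0
  for cak_file in "$cak_dir"/*.keytab; do
    # If the glob matches nothing, the pattern itself is returned and -f fails.
    if [ -f "$cak_file" ]; then
      return 0
    fi
  done
  echo "CREATE_ATS_SECRET=true but no *.keytab file found in $cak_dir" >&2
  return 1
}

# Usage:
#   check_ats_keytabs "$CREATE_ATS_SECRET" kubernetes/ats-key
```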
ats-service.yaml
This file creates a Service for exposing Timeline Server to the outside of the Kubernetes cluster. The user should specify a public IP address with a valid host name and two port numbers for Timeline Server so that both clients from the outside and DAGAppMaster from the inside can connect to it using the host name.
  ports:
  - name: ats-http
    protocol: TCP
    port: 8188
    targetPort: 8188
  - name: ats-https
    protocol: TCP
    port: 8190
    targetPort: 8190
  externalIPs:
  - 10.1.91.41
The sample file in the MR3 release uses 10.1.91.41:8188 as the HTTP address and 10.1.91.41:8190 as the HTTPS address of Timeline Server. The user should make sure that the IP address exists with a valid host name and is not already taken.
ats.yaml
This file creates a Pod for running Timeline Server.
Most of the sections in it work with their default settings, except for the spec/containers section, which should be updated according to the settings of the Kubernetes cluster.
- The image field should match the Docker image specified by DOCKER_ATS_IMG in kubernetes/env.sh.
- The resources/requests and resources/limits fields specify the resources to be allocated to a Timeline Server Pod.
- The ports/containerPort fields should match the port numbers specified in ats-service.yaml.
spec:
  containers:
  - image: 10.1.91.17:5000/ats-2.7.7
    resources:
      requests:
        cpu: 1
        memory: 4Gi
      limits:
        cpu: 1
        memory: 4Gi
    ports:
    - containerPort: 8188
      protocol: TCP
    - containerPort: 8190
      protocol: TCP
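Since the port numbers appear in two files, a quick consistency check can catch a mismatch before the Pod is created. The sketch below is one possible approach, not part of the MR3 scripts: yaml_int_values and ats_ports_match are hypothetical helpers, and they use a line-based sed filter rather than a real YAML parser, so they assume one key per line as in the sample files. The check compares the targetPort values of the Service with the containerPort values of the Pod, because targetPort is the port that a Kubernetes Service forwards to on the container.

```shell
# Hypothetical helper: print the integer values of one YAML key, sorted,
# one per line. Line-based only; this is not a YAML parser.
yaml_int_values() {
  yiv_key="$1"
  yiv_file="$2"
  sed -n "s/^[[:space:]-]*${yiv_key}:[[:space:]]*\([0-9][0-9]*\)[[:space:]]*\$/\1/p" "$yiv_file" | sort -n
}

# Hypothetical helper: succeed only if the Service's targetPort values are
# non-empty and equal to the Pod's containerPort values.
ats_ports_match() {
  apm_svc=$(yaml_int_values targetPort "$1")
  apm_pod=$(yaml_int_values containerPort "$2")
  [ -n "$apm_svc" ] && [ "$apm_svc" = "$apm_pod" ]
}

# Usage:
#   ats_ports_match kubernetes/yaml/ats-service.yaml kubernetes/yaml/ats.yaml \
#     || echo "port numbers in ats-service.yaml and ats.yaml differ" >&2
```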
Configuring Timeline Server
The following files specify how to configure Timeline Server:
└── kubernetes
    ├── env.sh
    └── ats-conf
        ├── core-site.xml
        ├── krb5.conf
        ├── log4j.properties
        ├── ssl-server.xml
        └── yarn-site.xml
The following code shows part of kubernetes/env.sh
for specifying the heap size (in MB) for Timeline Server:
ATS_HEAPSIZE=2048
The configuration files in the directory kubernetes/ats-conf
should work okay on a typical Kubernetes cluster.
Below we describe those sections that are specific to each Kubernetes cluster.
yarn-site.xml
The user should update the following configuration keys if the port numbers specified in ats-service.yaml are different from the default values of 8188 and 8190:
- yarn.timeline-service.webapp.address for the HTTP address
- yarn.timeline-service.webapp.https.address for the HTTPS address
The default configuration in kubernetes/ats-conf/yarn-site.xml assumes that a service keytab file spnego.service.keytab for service principal HTTP/indigo20@RED is ready in the directory kubernetes/ats-key.
<property>
  <name>yarn.timeline-service.http-authentication.kerberos.principal</name>
  <value>HTTP/indigo20@RED</value>
</property>
<property>
  <name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
  <value>/opt/mr3-run/ats/key/spnego.service.keytab</value>
</property>
The user should update these two configuration keys in order to use a different service keytab file or a different service principal. In particular, the service principal should use the host name for the Service for Timeline Server (e.g., indigo20 in HTTP/indigo20@RED, which corresponds to externalIPs in ats-service.yaml).
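The relationship between the service principal and the host name can be checked mechanically. The sketch below extracts the host part from a principal of the form service/host@REALM using shell parameter expansion; principal_host is a hypothetical helper, and the comparison against an expected host name is only an illustration.

```shell
# Hypothetical helper: print the host part of "service/host@REALM".
principal_host() {
  ph_rest="${1#*/}"              # drop the leading "service/"
  printf '%s\n' "${ph_rest%@*}"  # drop the trailing "@REALM"
}

principal_host "HTTP/indigo20@RED"
# prints: indigo20

# Usage: compare with the host name assigned to the address in externalIPs.
#   [ "$(principal_host "HTTP/indigo20@RED")" = "indigo20" ] \
#     || echo "service principal does not match the Service host name" >&2
```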
krb5.conf
This file should contain the Kerberos configuration. Usually it suffices to use a copy of kubernetes/conf/krb5.conf.
Running Timeline Server
Before running Timeline Server, the user should make sure that HiveServer2 is not running.
In order to run Timeline Server, the user can execute the script kubernetes/run-ats.sh:
$ kubernetes/run-ats.sh
...
ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
configmap/client-ats-config created
secret/hivemr3-ats-secret created
pod/hivemr3-ats created
service/timelineserver created
The user should record the string set in the environment variable ATS_SECRET_KEY, which is generated automatically unless the user explicitly sets it before executing kubernetes/run-ats.sh.
The string is later passed to DAGAppMaster so that both Timeline Server and DAGAppMaster can share a common secret.
(Otherwise DAGAppMaster would not be able to communicate with Timeline Server.)
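For users who prefer to set ATS_SECRET_KEY explicitly, the following sketch generates a key only when the variable is not already set and then exports it. gen_ats_secret_key is a hypothetical helper, and the sketch assumes that a random string in the style of the key shown in the output above is acceptable; the source does not state any format requirement.

```shell
# Hypothetical helper: print a random key, preferring a UUID when available.
gen_ats_secret_key() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen
  elif [ -r /proc/sys/kernel/random/uuid ]; then
    cat /proc/sys/kernel/random/uuid
  else
    # Fallback: 16 random bytes printed as 32 hex digits.
    od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
    echo
  fi
}

# Keep an existing value; generate one only if the variable is unset or empty.
: "${ATS_SECRET_KEY:=$(gen_ats_secret_key)}"
export ATS_SECRET_KEY
echo "ATS_SECRET_KEY=$ATS_SECRET_KEY"

# Next step (not executed here):
#   kubernetes/run-ats.sh
```

Exporting the variable in the same shell session also makes it available to kubernetes/run-hive.sh later.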
Executing the script kubernetes/run-ats.sh
starts a Timeline Server Pod in a moment:
$ kubectl get -n hivemr3 pods
NAME READY STATUS RESTARTS AGE
hivemr3-ats 1/1 Running 0 52s
Running HiveServer2 after starting Timeline Server
Now that Timeline Server is running, the user should set the environment variable ATS_SECRET_KEY to the string reported in the previous step. Then executing the script kubernetes/run-hive.sh starts a HiveServer2 Pod, which in turn creates a DAGAppMaster Pod ready to communicate with Timeline Server.
$ export ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
$ kubernetes/run-hive.sh
...
CLIENT_TO_AM_TOKEN_KEY=76a62e5a-ace3-44ee-9252-01baafbe8dd6
MR3_APPLICATION_ID_TIMESTAMP=17893
MR3_SHARED_SESSION_ID=0f52f3b5-563a-448d-9848-3f54c27f53b6
ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba
configmap/client-am-config created
replicationcontroller/hivemr3-hiveserver2 created
service/hiveserver2 created
$ kubectl get -n hivemr3 pods
NAME READY STATUS RESTARTS AGE
hivemr3-ats 1/1 Running 0 2m30s
hivemr3-hiveserver2-x45dp 1/1 Running 0 99s
mr3master-7893-0 1/1 Running 0 88s
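Copying ATS_SECRET_KEY by hand between the two scripts is easy to get wrong, so the hand-off can also be scripted. The sketch below is one way to do it under the assumption that kubernetes/run-ats.sh prints a line of the form ATS_SECRET_KEY=... to standard output, as in the output shown earlier. extract_ats_secret_key is a hypothetical helper.

```shell
# Hypothetical helper: read script output on stdin and print the value of the
# last "ATS_SECRET_KEY=..." line.
extract_ats_secret_key() {
  sed -n 's/^ATS_SECRET_KEY=//p' | tail -n 1
}

# Demonstration with lines taken from the sample output above.
printf '%s\n' \
  'ATS_SECRET_KEY=22f767f8-7c56-421d-ac36-f2cf2392c1ba' \
  'configmap/client-ats-config created' \
  'pod/hivemr3-ats created' | extract_ats_secret_key
# prints: 22f767f8-7c56-421d-ac36-f2cf2392c1ba

# Usage (not executed here):
#   ATS_SECRET_KEY=$(kubernetes/run-ats.sh | tee run-ats.log | extract_ats_secret_key)
#   export ATS_SECRET_KEY
#   kubernetes/run-hive.sh
```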