To ask questions about MR3, please visit the MR3 Google Group.

1. When running a query, ContainerWorker Pods never get launched and Beeline gets stuck.

Try adjusting the resources for the DAGAppMaster and ContainerWorker Pods. In kubernetes/conf/mr3-site.xml, the user can adjust the resources for the DAGAppMaster Pod:

<property>
  <name>mr3.am.resource.memory.mb</name>
  <value>16384</value>
</property>

<property>
  <name>mr3.am.resource.cpu.cores</name>
  <value>2</value>
</property>

In kubernetes/conf/hive-site.xml, the user can adjust the resources for ContainerWorker Pods (assuming that the configuration key hive.mr3.containergroup.scheme is set to all-in-one):

<property>
  <name>hive.mr3.map.task.memory.mb</name>
  <value>8192</value>
</property>

<property>
  <name>hive.mr3.map.task.vcores</name>
  <value>1</value>
</property>

<property>
  <name>hive.mr3.reduce.task.memory.mb</name>
  <value>8192</value>
</property>

<property>
  <name>hive.mr3.reduce.task.vcores</name>
  <value>1</value>
</property>

<property>
  <name>hive.mr3.all-in-one.containergroup.memory.mb</name>
  <value>16384</value>
</property>

<property>
  <name>hive.mr3.all-in-one.containergroup.vcores</name>
  <value>2</value>
</property>
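If ContainerWorker Pods still fail to launch, the cluster may not have enough resources to schedule them. As a quick check (a sketch only; the namespace hivemr3 and the Pod name are assumptions), one can ask Kubernetes why a ContainerWorker Pod remains pending:

kubectl get pods -n hivemr3
kubectl describe pod <ContainerWorker Pod name> -n hivemr3

If scheduling fails, the Events section at the end of the output typically reports insufficient CPU or memory.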

2. A query fails with the message "No space available in any of the local directories."

A query may fail with the following error from Beeline:

ERROR : Terminating unsuccessfully: Vertex failed, vertex_2134_0000_1_01, Some(Task unsuccessful: Map 1, task_2134_0000_1_01_000000, java.lang.RuntimeException: org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:370)
...
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.

In such a case, check if the configuration key mr3.k8s.pod.worker.hostpaths in kubernetes/conf/mr3-site.xml is properly set, e.g.:

<property>
  <name>mr3.k8s.pod.worker.hostpaths</name>
  <value>/data1/k8s,/data2/k8s,/data3/k8s,/data4/k8s,/data5/k8s,/data6/k8s</value>
</property>

In addition, check that the directories listed in mr3.k8s.pod.worker.hostpaths are writable by the user running the Pods.
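As a quick sanity check (a sketch assuming the directories listed above), one can verify on each node that the directories exist and are writable:

# on each Kubernetes node, run as the user that ContainerWorker Pods run as
ls -ld /data1/k8s
touch /data1/k8s/.write-test && rm /data1/k8s/.write-test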

3. A query accessing S3 fails with AMInputInitializerException from Beeline.

Depending on the settings for S3 buckets and the properties of datasets, the user may have to increase the value for the configuration key fs.s3a.connection.maximum (e.g., to 2000) and set the configuration key fs.s3a.connection.ssl.enabled to false in kubernetes/conf/core-site.xml. For example, with too small a value for fs.s3a.connection.maximum, Beeline and the DAGAppMaster Pod may generate the following errors when accessing S3:

### from Beeline 
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Terminating unsuccessfully: Vertex failed, vertex_22169_0000_1_02, Some(RootInput web_sales failed on Vertex Map 1: com.datamonad.mr3.api.common.AMInputInitializerException: web_sales)
Map 1            1 task           2922266 milliseconds: Failed
### from the DAGAppMaster Pod
Caused by: java.lang.RuntimeException: ORC split generation failed with exception: java.io.InterruptedIOException: Failed to open s3a://hivemr3-partitioned-2-orc/web_sales/ws_sold_date_sk=2451932/000001_0 at 14083 on s3a://hivemr3-partitioned-2-orc/web_sales/ws_sold_date_sk=2451932/000001_0: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
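Both configuration keys mentioned above go in kubernetes/conf/core-site.xml; the value 2000 below is only an example starting point:

<property>
  <name>fs.s3a.connection.maximum</name>
  <value>2000</value>
</property>

<property>
  <name>fs.s3a.connection.ssl.enabled</name>
  <value>false</value>
</property>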

4. A query accessing S3 makes no progress because Map vertexes remain stuck in the state Initializing.

If DAGAppMaster fails to resolve host names, the execution of a query may get stuck, with every Map vertex remaining in the state Initializing.

In such a case, check whether the configuration key mr3.k8s.host.aliases is set properly in kubernetes/conf/mr3-site.xml. For example, if the user sets the environment variable HIVE_DATABASE_HOST in env.sh to the host name (instead of the IP address) of the MySQL server, its address should be specified in mr3.k8s.host.aliases:

HIVE_DATABASE_HOST=orange0

<property>
  <name>mr3.k8s.host.aliases</name>
  <value>orange0=11.11.11.11</value>
</property>

Internally the class AmazonS3Client (running inside an InputInitializer of MR3) throws java.net.UnknownHostException, but the exception is swallowed and never propagated to DAGAppMaster. As a consequence, no error is reported to Beeline and the query gets stuck.
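If mr3.k8s.host.aliases is applied as Pod host aliases (an assumption here), one can check that the alias actually takes effect by inspecting /etc/hosts inside the DAGAppMaster Pod (the namespace hivemr3 and the Pod name are assumptions):

kubectl exec -n hivemr3 <DAGAppMaster Pod name> -- cat /etc/hosts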

5. DAGAppMaster Pod does not start because mr3-conf.properties does not exist.

MR3 generates a properties file mr3-conf.properties from the ConfigMap mr3conf-configmap-master and mounts it inside the DAGAppMaster Pod. If the DAGAppMaster Pod fails with the following message, either the ConfigMap mr3conf-configmap-master is corrupt or mr3-conf.properties has not been generated.

2020-05-15T10:35:10,255 ERROR [main] DAGAppMaster: Error in starting DAGAppMaster
java.lang.IllegalArgumentException: requirement failed: Properties file mr3-conf.properties does not exist

In such a case, manually delete the ConfigMap mr3conf-configmap-master and try again, so that Hive on MR3 on Kubernetes can start without a pre-existing ConfigMap of the same name.
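A minimal sketch with kubectl (the namespace hivemr3 is an assumption):

# check whether the ConfigMap exists
kubectl get configmap mr3conf-configmap-master -n hivemr3
# delete it so that it is recreated on the next run
kubectl delete configmap mr3conf-configmap-master -n hivemr3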