By default, Hive on MR3 uses an external shuffle service, such as the Hadoop/MapReduce shuffle service, to send and receive intermediate data between ContainerWorkers. Alternatively, Hive on MR3 can use the shuffle handler built into the MR3 runtime system.
Using the MR3 Shuffle Handler
In order to use the MR3 shuffle handler,
the user should set three configuration keys in
hive.mr3.use.daemon.shufflehandler to a number larger than zero in
hive-site.xml. Hive on MR3 then attaches that many DaemonTasks for MR3 shuffle handlers to ContainerGroups.
tez.shuffle.port to a port number for the shuffle handler in
hive.mr3.use.daemon.shufflehandler is set to zero
tez.am.shuffle.auxiliary-service.id is set to
ContainerWorkers fail with NullPointerException (from
Currently Hive on MR3 can use the MR3 shuffle handler only with the all-in-one ContainerGroup scheme.
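
As a rough illustration of the settings described in this section, the fragment below sketches how the two keys whose values are stated here might look in Hadoop-style property form. The value 1 for hive.mr3.use.daemon.shufflehandler and the port number 15551 are assumptions chosen only for the example; any number larger than zero and any free port should serve the same purpose. The third key, tez.am.shuffle.auxiliary-service.id, is left out because its value is not given in the text above.

```xml
<!-- hive-site.xml: a value larger than zero enables the MR3 shuffle handler.
     The value 1 is an illustrative assumption (one shuffle handler DaemonTask). -->
<property>
  <name>hive.mr3.use.daemon.shufflehandler</name>
  <value>1</value>
</property>

<!-- Goes in the configuration file where tez.shuffle.port is set for your installation.
     The port number 15551 is an illustrative assumption; pick a port that is
     free on every node running ContainerWorkers. -->
<property>
  <name>tez.shuffle.port</name>
  <value>15551</value>
</property>
```

Setting hive.mr3.use.daemon.shufflehandler back to zero returns Hive on MR3 to the external shuffle service; in that case the remaining shuffle-related keys should be reviewed together, since a mismatched combination can cause ContainerWorkers to fail as noted above.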