MR3 Shuffle Handler
MR3 provides its own shuffle handler in the runtime system. A shuffle handler is implemented as a DaemonTask, and each ContainerWorker runs its own threads for shuffle handlers. Because it has its own shuffle handler, MR3 can run in an environment where an external shuffle service is not available, most notably on Kubernetes.
Running multiple shuffle handlers
MR3 distinguishes itself from existing execution engines, such as Tez and Spark, by allowing a ContainerWorker to run multiple shuffle handlers concurrently. In the case of Tez, only a single shuffle handler can run on each node in the cluster, which implies that all Tez containers on a node share that one shuffle handler. In the case of Spark, a worker daemon can run only a single shuffle handler. In contrast, a ContainerWorker of MR3 can run multiple shuffle handlers of its own.
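To make the idea concrete, the following is a minimal conceptual sketch in Python of one worker process that owns several shuffle handler threads. It is not MR3 code: the names ShuffleHandler, Worker, publish, and fetch are hypothetical, the in-memory dictionary stands in for intermediate data on local disk, and the round-robin choice of handler is an assumption made only for illustration.

```python
import queue
import threading


class ShuffleHandler(threading.Thread):
    """Hypothetical shuffle handler: a daemon thread that serves fetch requests."""

    def __init__(self, handler_id, store):
        super().__init__(daemon=True)
        self.handler_id = handler_id
        self.store = store              # shared map: output key -> data
        self.requests = queue.Queue()   # each handler has its own request queue

    def run(self):
        while True:
            item = self.requests.get()
            if item is None:            # sentinel: stop the thread
                break
            key, reply = item
            reply.put((self.handler_id, self.store.get(key)))

    def fetch(self, key):
        """Ask this handler for the data stored under key and wait for the answer."""
        reply = queue.Queue(maxsize=1)
        self.requests.put((key, reply))
        return reply.get(timeout=5)

    def stop(self):
        self.requests.put(None)
        self.join()


class Worker:
    """Hypothetical stand-in for a worker that runs several handlers of its own."""

    def __init__(self, num_handlers):
        self.store = {}
        self.handlers = [ShuffleHandler(i, self.store) for i in range(num_handlers)]
        for handler in self.handlers:
            handler.start()
        self._next = 0

    def publish(self, key, data):
        # In this sketch, outputs are published before any fetch is issued.
        self.store[key] = data

    def fetch(self, key):
        # Assumption: requests are spread over the handlers in round-robin order.
        handler = self.handlers[self._next % len(self.handlers)]
        self._next += 1
        return handler.fetch(key)

    def shutdown(self):
        for handler in self.handlers:
            handler.stop()


if __name__ == "__main__":
    worker = Worker(num_handlers=3)
    worker.publish("map_0/partition_1", b"intermediate data")
    for _ in range(4):
        print(worker.fetch("map_0/partition_1"))
    worker.shutdown()
```

In this sketch, a slow request occupies only one of the handler threads, so the remaining handlers in the same worker can keep serving other requests. A single shared handler per node or per worker daemon has no such alternative path.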
The support for multiple shuffle handlers in a single ContainerWorker is an important feature which, in conjunction with Speculative Execution, enables MR3 to circumvent fetch delays. For more details, see Eliminating Fetch Delays.