Hive can run on top of MR3. To exploit new features in MR3, such as running concurrent DAGs in the same ApplicationMaster and sharing containers among DAGs, Hive on MR3 is built on a modified backend of Hive. (The modified backend of Hive is not compatible with Tez.) Hive on MR3 calls a Tez runtime to execute Hive queries and relies on MR3 for everything else, including scheduling DAGs, creating containers, messaging, and authenticating and authorizing users.
There are two versions of Hive that run on MR3:
- Hive 3.1.3
- Hive 4.0.0-SNAPSHOT
Hive 1.2 and Hive 2.3 are no longer supported as of MR3 1.1. Hive on MR3 uses the Tez 0.9.1 runtime with additional patches applied.
Compared with Hive on Tez, Hive on MR3 generally runs faster for sequential queries by virtue of the simple architectural design of MR3. For concurrent queries, it makes much better use of computing resources and thus yields higher throughput, because MR3 allows concurrent DAGs to share containers.
For the results of evaluating the performance of Hive on MR3 with the TPC-DS benchmark, see our blog article.
If you have any questions, please visit the MR3 Google Group or join MR3 Slack.