Hive can run on top of MR3 on Hadoop. To exploit new features in MR3, such as running concurrent DAGs in the same ApplicationMaster and sharing containers among DAGs, Hive on MR3 is built on a modified backend of Hive. (The modified backend of Hive is not compatible with Tez.) Hive on MR3 calls a Tez runtime to execute Hive queries and relies on MR3 for everything else, such as scheduling DAGs, creating containers, messaging, and authenticating and authorizing users. For instructions on installing Hive on MR3 on Hadoop, see Installing on Hadoop.
There are three versions of Hive that run on MR3:
- Hive 2.3.6
- Hive 3.1.2
- Hive 4.0.0-SNAPSHOT
Hive 1.2.2 is no longer supported as of MR3 1.1. Hive on MR3 uses the Tez 0.9.1 runtime with additional patches applied.
In comparison with Hive on Tez, Hive on MR3 generally runs faster for sequential queries by virtue of the simple architectural design of MR3. For concurrent queries, it makes much better use of computing resources and thus yields higher throughput, because MR3 allows concurrent DAGs in the same ApplicationMaster to share containers.
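The effect of sharing containers on throughput can be illustrated with a toy model. The sketch below is not MR3's scheduler and measures nothing about a real cluster. It assumes unit-time, independent tasks and compares two hypothetical setups with the same total number of containers: one where each DAG owns a fixed, private set of containers, and one where all DAGs draw from a single pool. The function names and the workload are made up for illustration.

```python
import math


def makespan_isolated(task_counts, containers_per_dag):
    """Time steps needed when every DAG owns a private set of containers."""
    # Each DAG finishes independently; the slowest DAG determines the makespan.
    return max(math.ceil(n / containers_per_dag) for n in task_counts)


def makespan_shared(task_counts, total_containers):
    """Time steps needed when all DAGs draw from one shared container pool."""
    # Any free container can run a task from any DAG, so only the total matters.
    return math.ceil(sum(task_counts) / total_containers)


# Hypothetical workload: one large DAG (12 tasks) and one small DAG (2 tasks).
task_counts = [12, 2]

# Two DAGs with 4 private containers each (8 containers in total).
isolated = makespan_isolated(task_counts, containers_per_dag=4)
# The same 8 containers, pooled across both DAGs.
shared = makespan_shared(task_counts, total_containers=8)

print(isolated, shared)  # prints "3 2"
```

In this toy workload the small DAG leaves most of its private containers idle, so the isolated setup needs three time steps while the shared pool finishes in two. Real queries have task dependencies and uneven task durations, so actual gains depend on the workload.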