Performance Evaluation of SQL-on-Hadoop Systems using the TPC-DS Benchmark
Introduction
We often ask questions about the performance of SQL-on-Hadoop systems:
- How fast is Hive-LLAP in comparison with Presto, SparkSQL, or Hive on Tez?
- Since Presto is an MPP-style system, is it always the fastest on queries that it executes successfully?
- Since SparkSQL stores intermediate data in memory, does it generally run much faster than Hive on Tez?
- What is the best system for running concurrent queries?
- ...