5 posts tagged with "LLAP"

Hive vs SparkSQL: Hive-LLAP, Hive on MR3, SparkSQL 2.3.2

· 5 min read
Sungwoo Park
MR3 Architect and Developer

Introduction

In our previous article, published in October 2018, we used the TPC-DS benchmark to compare the performance of Hive-LLAP and SparkSQL 2.3.1 included in HDP 3.0.1 along with Hive 3.1.0 on MR3 0.4. In this article, we update the results by testing SparkSQL 2.3.2 included in HDP 3.1.4. As in the previous experiment, we use the TPC-DS benchmark.

Performance Evaluation of SQL-on-Hadoop Systems using the TPC-DS Benchmark

· 17 min read
Sungwoo Park
MR3 Architect and Developer

Introduction

We often ask questions about the performance of SQL-on-Hadoop systems:

  • How fast is Hive-LLAP in comparison with Presto, SparkSQL, or Hive on Tez?
  • As it is an MPP-style system, does Presto run the fastest if it successfully executes a query?
  • As it stores intermediate data in memory, does SparkSQL run much faster than Hive on Tez in general?
  • What is the best system for running concurrent queries?
  • ...

Hive on MR3 0.2 vs Hive-LLAP

· 11 min read
Sungwoo Park
MR3 Architect and Developer

Introduction

Hive running on top of MR3 0.2, or Hive-MR3 henceforth, supports LLAP (Low Latency Analytical Processing) I/O. Combined with the ability to execute multiple TaskAttempts concurrently inside a single ContainerWorker, the support for LLAP I/O makes Hive-MR3 functionally equivalent to Hive-LLAP. Hence, Hive-MR3 can now serve as a substitute for Hive-LLAP in typical use cases.