This page lists known issues with Hive 3 on MR3. It does not report known bugs in Apache Hive 3 itself, whether or not they are fixed in Apache Hive 4. To ask questions about MR3, please visit the MR3 Google Group.
1. LOG_LEVEL in kubernetes/env.sh
Currently the environment variable LOG_LEVEL is not used, so the following setting has no effect.
LOG_LEVEL=INFO
To change the logging level for Metastore and HiveServer2, update kubernetes/conf/hive-log4j2.properties.
2. Outer joins failing with NullPointerException (mapjoin_filter_on_outerjoin.q)
Outer joins may fail with NullPointerException if hive.auto.convert.join is set to true.
SELECT * FROM src1
RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10)
JOIN src src3 ON (src2.key = src3.key AND src3.key < 300)
SORT BY src1.key, src2.key, src3.key;
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getFilterTag(CommonJoinOperator.java:802)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:600)
In such a case, set hive.merge.nway.joins to false.
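As a minimal sketch, the workaround can be applied per session from Beeline or any other HiveQL client before running the failing query. It assumes the property is not restricted from being changed at runtime in your deployment.

```sql
-- Print the current value of the setting that triggers the problem.
SET hive.auto.convert.join;

-- Workaround from the text above: disable merging of n-way joins for this session.
SET hive.merge.nway.joins=false;
```

To apply the setting to all sessions, the same property would go in hive-site.xml instead.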
3. Computing avg() failing with ClassCastException (cbo_rp_gby_empty.q)
Computing avg() over an int column may fail with ClassCastException.
SELECT 'avg' AS key, avg(c_int) AS value FROM cbo_t3;
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector
at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DoubleColDivideLongColumn.evaluate(DoubleColDivideLongColumn.java:67)
In such a case, set hive.cbo.returnpath.hiveop to false (which is the default value).
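A minimal sketch of restoring the default for the current session, assuming the property was turned on earlier in the session or in hive-site.xml:

```sql
-- Print the current value; the problem is reported only when it is true.
SET hive.cbo.returnpath.hiveop;

-- Restore the default value for this session.
SET hive.cbo.returnpath.hiveop=false;
```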
4. Query failing with URISyntaxException (cbo_rp_auto_join1.q)
A query may fail with URISyntaxException if it involves merging.
SELECT COUNT(*) FROM
(
SELECT key, COUNT(*) FROM
(
SELECT a.key AS key, a.value AS val1, b.value AS val2 FROM tbl1_n13 a JOIN tbl2_n12 b ON a.key = b.key
) subq1
GROUP BY key
) subq2;
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: subq2:subq1:amerge.xml
at org.apache.hadoop.fs.Path.initialize(Path.java:259) ~[hadoop-common-3.1.2.jar:?]
...
at org.apache.hadoop.hive.ql.exec.Utilities.getPlanPath(Utilities.java:639) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:415) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.ql.exec.Utilities.getMergeWork(Utilities.java:379) ~[hive-exec-3.1.3.jar:3.1.3]
In such a case, set hive.cbo.returnpath.hiveop to false (which is the default value).
(Setting hive.rpc.query.plan to true may not help.)
5. Windowing and analytic functions (vector_ptf_part_simple.q)
For windowing and analytic functions, the result may not be the same as in Hive on Tez or Hive-LLAP if ORDER BY is not used in the OVER clause.
This is not a bug: without ORDER BY, the order of rows within each partition is unspecified, so the result depends on how the rows for the partitioning column happen to be arranged.
SELECT
row_number() OVER(PARTITION BY p_mfgr) AS rn,
row_number() OVER(PARTITION BY p_mfgr RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS rn,
row_number() OVER(PARTITION BY p_mfgr ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS rn,
sum(p_retailprice) OVER(PARTITION BY p_mfgr ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS s,
...
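If results need to be comparable across execution engines, one option is to add ORDER BY to each OVER clause. The sketch below is illustrative only: the table name part_example is hypothetical, and p_name is an assumed column used solely to define an ordering. p_mfgr and p_retailprice are taken from the query above.

```sql
-- Hypothetical table part_example; p_name is an assumed ordering column.
SELECT
  p_mfgr,
  p_name,
  -- ORDER BY fixes the row order inside each p_mfgr partition.
  row_number() OVER (PARTITION BY p_mfgr ORDER BY p_name) AS rn,
  sum(p_retailprice) OVER (PARTITION BY p_mfgr ORDER BY p_name
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS s
FROM part_example;
```

Rows that tie on the ordering column can still appear in any order, so the ordering key should be unique within each partition if fully reproducible output is required.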
6. Inserting into ORC tables failing with NullPointerException (orc_merge10.q)
When both hive.llap.io.enabled and hive.merge.tezfiles are set to true, inserting into partitioned ORC tables may fail with NullPointerException.
CREATE TABLE orcfile_merge1b_n1 (key int, value string)
PARTITIONED BY (ds string, part string) STORED AS orc;
INSERT OVERWRITE TABLE orcfile_merge1b_n1 PARTITION (ds='1', part)
SELECT key, value, pmod(hash(key), 2) AS part
FROM src;
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(VectorizedRowBatchCtx.java:560)
at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(VectorizedRowBatchCtx.java:388)
at org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:317)
In such a case, set hive.merge.orcfile.stripe.level to true (which is the default value).
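A minimal sketch of checking the relevant settings and restoring the default before the INSERT, assuming the properties can be changed at session level in your deployment:

```sql
-- The problem is reported only when both of these are true.
SET hive.llap.io.enabled;
SET hive.merge.tezfiles;

-- Restore the default so that ORC files are merged at stripe level.
SET hive.merge.orcfile.stripe.level=true;
```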
7. load data producing wrong results (mm_loaddata.q)
If hive.llap.io.enabled is set to true, setting tez.grouping.min-size to too small a value (e.g., 1) may produce wrong results.
CREATE TABLE load0_mm (key string, value string) STORED AS textfile TBLPROPERTIES("transactional"="true", "transactional_properties"="insert_only");
LOAD DATA LOCAL INPATH 'data/files/kv2.txt' INTO TABLE load0_mm;
In practice, tez.grouping.min-size is usually set to a large value (e.g., the default value of 50 * 1024 * 1024 = 52428800), so this is not a problem.
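If a session has lowered the value (for example, in a test script), a minimal sketch of restoring the default quoted above looks like this:

```sql
-- Print the current value for this session.
SET tez.grouping.min-size;

-- Restore the default of 50 * 1024 * 1024 bytes.
SET tez.grouping.min-size=52428800;
```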