Release 1.12: 2024-11-13
MR3
- Implement delay scheduling. Enabling delay scheduling with the configuration key mr3.taskattempt.queue.scheme.use.delay is recommended when using LLAP I/O.
- mr3.am.session.share.dag.client.rpc specifies whether or not to create a new DAGClientRPC object for each DAG (in session mode).
- Introduce mr3.yarn.priority to specify the priority of the MR3 Yarn application.
- Support fault tolerance when using Celeborn 0.5.1.
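As an illustration of the delay scheduling item above, the key might be enabled with a property entry like the one below. This is a sketch only: it assumes the key takes a boolean value and is placed in mr3-site.xml, neither of which is stated in these notes.

```xml
<!-- mr3-site.xml (assumed location) -->
<configuration>
  <property>
    <name>mr3.taskattempt.queue.scheme.use.delay</name>
    <!-- assumption: a boolean switch; "true" enables delay scheduling -->
    <value>true</value>
  </property>
</configuration>
```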
Hive on MR3
- Support Hive 4.0.1.
Release 1.11: 2024-7-21
MR3
- Introduce mr3.dag.timeout.kill.threshold.secs and mr3.dag.timeout.kill.check.ms for checking DAG timeout.
- mr3.daemon.task.message.buffer.size specifies the message queue size for DaemonTasks.
Hive on MR3
- The cache hit ratio of LLAP I/O is usually higher and more stable because LLAP I/O (with LlapInputFormat) is used only when a Task is placed on nodes matching its location hints.
- LimitOperator is correctly controlled by MR3 DAGAppMaster (which implements HIVE-24207).
- Support Hive 4.0.0.
Release 1.10: 2024-3-12
MR3
- Every ContainerWorker runs a central shuffle server which manages all Fetchers from all TaskAttempts.
- All Fetchers share a common thread pool.
- The shuffle server does not distinguish between ordered and unordered fetches.
- The shuffle server controls the maximum number of concurrent fetches for each input (with tez.runtime.shuffle.parallel.copies).
- The shuffle server controls the total number of concurrent fetches (with tez.runtime.shuffle.total.parallel.copies).
- Adjust the default configuration in tez-site.xml for shuffling:
  - tez.runtime.shuffle.parallel.copies to 10
  - tez.runtime.shuffle.total.parallel.copies to 360
  - tez.runtime.shuffle.read.timeout to 60000 (60 seconds)
- Introduce mr3.dag.create.daemon.vertex.always to control whether or not to create DaemonVertexes in DAGs (with the default value of false).
- Fix a bug in speculative execution where a Task is killed after OutOfMemoryError while TaskAttempts are still running.
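The adjusted shuffle defaults listed above correspond to entries of roughly the following shape in tez-site.xml. The values are the ones given in this release; the surrounding layout is the standard Hadoop-style configuration format and is shown only as a sketch.

```xml
<!-- tez-site.xml: shuffle defaults as of MR3 1.10 -->
<configuration>
  <property>
    <name>tez.runtime.shuffle.parallel.copies</name>
    <!-- maximum number of concurrent fetches for each input -->
    <value>10</value>
  </property>
  <property>
    <name>tez.runtime.shuffle.total.parallel.copies</name>
    <!-- total number of concurrent fetches in the shuffle server -->
    <value>360</value>
  </property>
  <property>
    <name>tez.runtime.shuffle.read.timeout</name>
    <!-- milliseconds (60 seconds) -->
    <value>60000</value>
  </property>
</configuration>
```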
Release 1.9: 2024-1-7
MR3
- Introduce tez.runtime.use.free.memory.fetched.input to use free memory for storing fetched data.
- The default value of tez.runtime.transfer.data-via-events.max-size increases from 512 to 2048.
- Tasks can be canceled if no more output records are needed (as part of incorporating HIVE-24207).
Hive on MR3
- Execute TRUNCATE using MR3 instead of MapReduce.
- hive.exec.orc.default.compress is set to SNAPPY in hive-site.xml.
- Support Ranger 2.4.0.
- Adjust the default configuration in hive-site.xml and tez-site.xml to use auto parallelism less aggressively:
  - tez.shuffle-vertex-manager.auto-parallel.min.num.tasks to 251
  - tez.shuffle-vertex-manager.auto-parallel.max.reduction.percentage to 50
- Set metastore.stats.fetch.bitvector to true in hive-site.xml.
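For reference, the two hive-site.xml defaults named above would appear as property entries along these lines. This is a sketch in the standard Hadoop-style configuration format; only the key names and values come from these notes.

```xml
<!-- hive-site.xml: defaults as of MR3 1.9 -->
<configuration>
  <property>
    <name>hive.exec.orc.default.compress</name>
    <value>SNAPPY</value>
  </property>
  <property>
    <name>metastore.stats.fetch.bitvector</name>
    <value>true</value>
  </property>
</configuration>
```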
Release 1.8: 2023-12-9
MR3
- Shuffle handlers can send multiple consecutive partitions at once.
- Fix a bug in TaskScheduler which can get stuck when the number of ContainerWorkers is smaller than the value for mr3.am.task.max.failed.attempts.
- Avoid unnecessary attempts to delete directories created by DAGs.
- mr3.taskattempt.queue.scheme can be set to spark to use a Spark-style TaskScheduler which schedules consumer Tasks after all producer Tasks are finished.
- mr3.dag.vertex.schedule.by.stage can be set to true to process Vertexes by stages similarly to Spark.
- YarnResourceScheduler does not use AMRMClient.getAvailableResources() which returns incorrect values in some cases.
- Restore TEZ_USE_MINIMAL in env.sh.
- Support Celeborn as a remote shuffle service.
- mr3.dag.include.indeterminate.vertex specifies whether a DAG contains indeterminate Vertexes or not.
- Fault tolerance in the event of disk failures works much faster.
- Use Scala 2.12.
- Support Java 17 (with USE_JAVA_17 in env.sh).
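The Spark-style scheduling options above could be combined as in the following sketch. The key names and values are taken from the list; placing them in mr3-site.xml is an assumption based on the mr3. prefix, and whether the two keys should be set together is not stated in these notes.

```xml
<!-- mr3-site.xml (assumed location) -->
<configuration>
  <property>
    <name>mr3.taskattempt.queue.scheme</name>
    <!-- schedule consumer Tasks after all producer Tasks are finished -->
    <value>spark</value>
  </property>
  <property>
    <name>mr3.dag.vertex.schedule.by.stage</name>
    <!-- process Vertexes by stages, similarly to Spark -->
    <value>true</value>
  </property>
</configuration>
```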
Hive on MR3
- Fix ConcurrentModificationException generated during the construction of DAGs.
- hive.mr3.application.name.prefix specifies the prefix of MR3 application names.
- Fix a bug that ignores CTRL-C in Beeline and stop requests from Hue.
- hive.mr3.config.remove.keys specifies configuration keys to remove from JobConf to be passed to Tez.
- hive.mr3.config.remove.prefixes specifies prefixes of configuration keys to remove from JobConf to be passed to Tez.
Release 1.7: 2023-5-15
MR3
- Support standalone mode which does not require Yarn or Kubernetes as the resource manager.
Hive on MR3
- Use Hadoop 3.3.1.
- hive.query.reexecution.stats.persist.scope can be set to hiveserver.
- HIVE_JVM_OPTION in env.sh specifies the JVM options for Metastore and HiveServer2.
- Do not use TEZ_USE_MINIMAL in env.sh.
Release 1.6: 2022-12-24
MR3
- Support capacity scheduling with mr3.dag.queue.capacity.specs and mr3.dag.queue.name.
Release 1.5: 2022-7-24
MR3
- Use liveness probes on ContainerWorker Pods running separate processes for shuffle handlers.
- When a ContainerGroup is removed, all its Prometheus metrics are removed.
- Prometheus metrics are correctly published when two DAGAppMaster Pods for Hive and Spark can run concurrently in the same namespace on Kubernetes.
- DAGAppMaster stops if it fails to contact Timeline Server during initialization.
- Introduce mr3.k8s.master.pod.cpu.limit.multiplier to specify a multiplier for the CPU resource limit of DAGAppMaster Pods.
- Using MasterControl, autoscaling parameters can be updated dynamically.
- HistoryLogger correctly sends Vertex start times to Timeline Server.
Hive on MR3
- Support Hive 3.1.3.
Spark on MR3
- Support Spark 3.2.2.
- Reduce the size of Protobuf objects when submitting DAGs to MR3.
- Spark executors can run as MR3 ContainerWorkers in local mode.
Release 1.4: 2022-2-14
MR3
- Use Deployment instead of ReplicationController on Kubernetes.
- HistoryLogger correctly sends Vertex finish times to Timeline Server.
- Add more Prometheus metrics.
- Introduce mr3.application.tags and mr3.application.scheduling.properties.map.
- The logic for speculative execution uses the average execution time of Tasks (instead of the maximum execution time).
Hive on MR3
- DistCp jobs are sent to MR3, not to Hadoop. As a result, DistCp runs okay on Kubernetes.
- org.apache.tez.common.counters.Limits is initialized in HiveServer2.
- Update Log4j2 to 2.17.1 (for CVE-2021-44228).
Release 1.3: 2021-8-18
MR3
- Separate mr3.k8s.keytab.secret and mr3.k8s.worker.secret.
- Introduce mr3.container.max.num.workers to limit the number of ContainerWorkers.
- Introduce mr3.k8s.pod.worker.node.affinity.specs to specify node affinity for ContainerWorker Pods.
- No longer use mr3.convert.container.address.host.name.
- Support ContainerWorker recycling (which is different from ContainerWorker reuse) with mr3.container.scheduler.scheme.
- Introduce mr3.am.task.no.retry.errors to specify the names of errors that prevent the re-execution of Tasks (e.g., OutOfMemoryError, MapJoinMemoryExhaustionError).
- For reporting to MR3-UI, MR3 uses System.currentTimeMillis() instead of MonotonicClock.
- DAGAppMaster correctly reports to MR3Client the time from DAG submission to DAG execution.
- Introduce mr3.container.localize.python.working.dir.unsafe to localize Python scripts in working directories of ContainerWorkers. Localizing Python scripts is an unsafe operation: 1) Python scripts are shared by all DAGs; 2) once localized, Python scripts are not deleted.
- The image pull policy specified in mr3.k8s.pod.image.pull.policy applies to init containers as well as ContainerWorker containers.
- Introduce mr3.auto.scale.out.num.initial.containers which specifies the number of new ContainerWorkers to create in a scale-out operation when no ContainerWorkers are running.
- Introduce mr3.container.runtime.auto.start.input to automatically start LogicalInputs in RuntimeTasks.
- Speculative execution works on Vertexes with a single Task.
Hive on MR3
- Metastore correctly uses MR3 for compaction on Kubernetes.
- Auto parallelism is correctly enabled or disabled according to the result of compiling queries by overriding tez.shuffle-vertex-manager.enable.auto-parallel, so tez.shuffle-vertex-manager.enable.auto-parallel can be set to false.
- Support the TRANSFORM clause with Python scripts (with mr3.container.localize.python.working.dir.unsafe set to true in mr3-site.xml).
- Introduce hive.mr3.llap.orc.memory.per.thread.mb to specify the memory allocated to each ORC manager in low-level LLAP I/O threads.
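To use the TRANSFORM clause with Python scripts as described above, the corresponding entry in mr3-site.xml would look roughly like this. It is a sketch in the standard Hadoop-style format; note that this release describes localizing Python scripts as an unsafe operation.

```xml
<!-- mr3-site.xml -->
<configuration>
  <property>
    <name>mr3.container.localize.python.working.dir.unsafe</name>
    <!-- unsafe: scripts are shared by all DAGs and are not deleted once localized -->
    <value>true</value>
  </property>
</configuration>
```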
Spark on MR3
- Initial release
Release 1.2: 2020-10-26
MR3
- Introduce mr3.k8s.pod.worker.init.container.command to execute a shell command in a privileged init container.
- Introduce mr3.k8s.pod.master.toleration.specs and mr3.k8s.pod.worker.toleration.specs to specify tolerations for DAGAppMaster and ContainerWorker Pods.
- Setting mr3.dag.queue.scheme to individual properly implements fair scheduling among concurrent DAGs.
- Introduce mr3.k8s.pod.worker.additional.hostpaths to mount additional hostPath volumes.
- mr3.k8s.worker.total.max.memory.gb and mr3.k8s.worker.total.max.cpu.cores work okay when autoscaling is enabled.
- DAGAppMaster and ContainerWorkers can publish Prometheus metrics.
- The default value of mr3.container.task.failure.num.sleeps is 0.
- Reduce the log size of DAGAppMaster and ContainerWorker.
- TaskScheduler can process about twice as many events (TaskSchedulerEventTaskAttemptFinished) per unit time as in MR3 1.1, thus doubling the maximum cluster size that MR3 can manage.
- Optimize the use of CodecPool shared by concurrent TaskAttempts.
- The getDags command of MasterControl prints both IDs and names of DAGs.
- On Kubernetes, the updateResourceLimit command of MasterControl updates the limit on the total resources for all ContainerWorker Pods. The user can further improve resource utilization when autoscaling is enabled.
Hive on MR3
- Compute the memory size of ContainerWorker correctly when hive.llap.io.allocator.mmap is set to true.
- Hive expands all system properties in configuration files (such as core-site.xml) before passing them to MR3.
- hive.server2.transport.mode can be set to all (with HIVE-5312).
- MR3 creates three ServiceAccounts: 1) for Metastore and HiveServer2 Pods; 2) for DAGAppMaster Pod; 3) for ContainerWorker Pods. The user can use IAM roles for ServiceAccounts.
- Docker containers start as root. In kubernetes/env.sh, DOCKER_USER should be set to root and the service principal name in HIVE_SERVER2_KERBEROS_PRINCIPAL should be root.
- Support Ranger 2.0.0 and 2.1.0.
Release 1.1: 2020-7-19
MR3
- Support DAG scheduling schemes (specified by mr3.dag.queue.scheme).
- Optimize DAGAppMaster by freeing memory for messages to Tasks when fault tolerance is disabled (with mr3.am.task.max.failed.attempts set to 1).
- Fix a minor memory leak in DaemonTask (which also prevents MR3 from running more than 2^30 DAGs when using the shuffle handler).
- Improve the chance of assigning TaskAttempts to ContainerWorkers that match location hints.
- TaskScheduler can use location hints produced by ONE_TO_ONE edges.
- TaskScheduler can use location hints from HDFS when assigning TaskAttempts to ContainerWorker Pods on Kubernetes (with mr3.convert.container.address.host.name).
- Introduce mr3.k8s.pod.cpu.cores.max.multiplier to specify the multiplier for the limit of CPU cores.
- Introduce mr3.k8s.pod.memory.max.multiplier to specify the multiplier for the limit of memory.
- Introduce mr3.k8s.pod.worker.security.context.sysctls to configure kernel parameters of ContainerWorker Pods using init containers.
- Support speculative execution of TaskAttempts (with mr3.am.task.concurrent.run.threshold.percent).
- A ContainerWorker can run multiple shuffle handlers each with a different port. The configuration key mr3.use.daemon.shufflehandler now specifies the number of shuffle handlers in each ContainerWorker.
- With speculative execution and the use of multiple shuffle handlers in a single ContainerWorker, fetch delays rarely occur.
- A ContainerWorker Pod can run shuffle handlers in a separate container (with mr3.k8s.shuffle.process.ports).
- On Kubernetes, DAGAppMaster uses ReplicationController instead of Pod, thus making recovery much faster.
- On Kubernetes, ConfigMaps mr3conf-configmap-master and mr3conf-configmap-worker survive MR3, so the user should delete them manually.
- Java 8u251/8u252 can be used on Kubernetes 1.17 and later.
Hive on MR3
- CrossProductHandler asks MR3 DAGAppMaster to set TEZ_CARTESIAN_PRODUCT_MAX_PARALLELISM (Cf. HIVE-16690, Hive 3/4).
- Hive 4 on MR3 is stable (currently using 4.0.0-SNAPSHOT).
- No longer support Hive 1.
- Ranger uses a local directory (emptyDir volume) for logging.
- The open file limit for Solr (in Ranger) is not limited to 1024.
- HiveServer2 and DAGAppMaster create readiness and liveness probes.
Release 1.0: 2020-2-17
MR3
- Support DAG priority schemes (specified by mr3.dag.priority.scheme) and Vertex priority schemes (specified by mr3.vertex.priority.scheme).
- Support secure shuffle (using SSL mode) without requiring separate configuration files.
- ContainerWorker tries to avoid OutOfMemoryErrors by sleeping after a TaskAttempt fails (specified by mr3.container.task.failure.num.sleeps).
- Errors from InputInitializers are properly passed to MR3Client.
- MasterControl supports two new commands for gracefully stopping DAGAppMaster and ContainerWorkers.
Hive on MR3
- Allow fractions for CPU cores (with hive.mr3.resource.vcores.divisor).
- Support rolling updates.
- Hive on MR3 can access S3 using AWS credentials (with or without Helm).
- On Amazon EKS, the user can use S3 instead of PersistentVolumes on EFS.
- Hive on MR3 can use environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to access S3 outside Amazon AWS.
Release 0.11: 2019-12-4
MR3
- Support autoscaling.
Hive on MR3
- Memory and CPU cores for Tasks can be set to zero.
- Support autoscaling on Amazon EMR.
- Support autoscaling on Amazon EKS.
Release 0.10: 2019-10-18
MR3
- TaskScheduler supports a new scheduling policy (specified by mr3.taskattempt.queue.scheme) which significantly improves the throughput for concurrent queries.
- DAGAppMaster recovers from OutOfMemoryErrors due to the exhaustion of threads.
Hive on MR3
- Compaction sends DAGs to MR3, instead of MapReduce, when hive.mr3.compaction.using.mr3 is set to true.
- LlapDecider asks MR3 DAGAppMaster for the number of Reducers.
- ConvertJoinMapJoin asks MR3 DAGAppMaster for the current number of Nodes to estimate the cost of Bucket Map Join.
- Support Hive 3.1.2 and 2.3.6.
- Support Helm charts.
- Compaction works okay on Kubernetes.
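The compaction setting mentioned above could be expressed as the following property entry. The key and value come from this release; placing it in hive-site.xml is an assumption based on the hive. prefix.

```xml
<!-- hive-site.xml (assumed location) -->
<configuration>
  <property>
    <name>hive.mr3.compaction.using.mr3</name>
    <!-- send compaction DAGs to MR3 instead of MapReduce -->
    <value>true</value>
  </property>
</configuration>
```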
Release 0.9: 2019-7-25
MR3
- Each DAG uses its own ClassLoader.
Hive on MR3
- LLAP I/O works properly on Kubernetes.
- UDFs work okay on Kubernetes.
Release 0.8: 2019-6-22
MR3
- A new DAGAppMaster properly recovers DAGs that have not been completed in the previous DAGAppMaster.
- Fault tolerance after fetch failures works much faster.
- On Kubernetes, the shutdown handler of DAGAppMaster deletes all running Pods.
- On both Yarn and Kubernetes, MR3Client automatically connects to a new DAGAppMaster after an initial DAGAppMaster is killed.
Hive on MR3
- Hive 3 for MR3 supports high availability on Yarn via ZooKeeper.
- On both Yarn and Kubernetes, multiple HiveServer2 instances can share a common MR3 DAGAppMaster (and thus all its ContainerWorkers as well).
- Support Apache Ranger on Kubernetes.
- Support Timeline Server on Kubernetes.
Release 0.7: 2019-4-26
MR3
- Resolve deadlock when Tasks fail or ContainerWorkers are killed.
- Support fault tolerance after fetch failures.
- Support node blacklisting.
Hive on MR3
- Introduce a new configuration key hive.mr3.am.task.max.failed.attempts.
- Apply HIVE-20618.
Release 0.6: 2019-3-21
MR3
- DAGAppMaster can run in its own Pod on Kubernetes.
- Support elastic execution of RuntimeTasks in ContainerWorkers.
- MR3-UI requires only Timeline Server.
Hive on MR3
- Support memory monitoring when loading hash tables for Map-side join.
Release 0.5: 2019-2-18
MR3
- Support Kubernetes.
- Support the use of the built-in shuffle handler.
Hive on MR3
- Support Hive 3.1.1 and 2.3.5.
- Initial release for Hive on MR3 on Kubernetes
Release 0.4: 2018-10-29
MR3
- Support auto parallelism for reducers with ONE_TO_ONE edges.
- Auto parallelism can use input statistics when reassigning partitions to reducers.
- Support ByteBuffer sharing among RuntimeTasks.
Hive on MR3
- Support Hive 3.1.0.
- Hive 1 uses Tez 0.9.1.
- Metastore checks the inclusion of __HIVE_DEFAULT_PARTITION__ when retrieving column statistics.
- MR3JobMonitor returns immediately from MR3 DAGAppMaster when the DAG completes.
Release 0.3: 2018-8-15
MR3
- Extend the runtime to support Hive 3.
Hive on MR3
- Support Hive 3.0.0.
- Support query re-execution.
- Support per-query cache in Hive 2 and 3.
Release 0.2: 2018-5-18
MR3
- Support asynchronous logging (with mr3.async.logging in mr3-site.xml).
- Delete DAG-local directories after each DAG is finished.
Hive on MR3
- Support LLAP I/O for Hive 2.
- Support Hive 2.2.0.
- Use Hive 2.3.3 instead of Hive 2.3.2.
Release 0.1: 2018-3-31
MR3
- Initial release
Hive on MR3
- Initial release
Patches backported in MR3 1.11
- HIVE-27600 Reduce filesystem calls in OrcFileMergeOperator
- HIVE-25561 Killed task should not commit file
Patches backported in MR3 1.9
- HIVE-27876 Incorrect query results on tables with ClusterBy & SortBy
- HIVE-27788 Exception when join has 2 Group By operators in the same branch in the same reducer
- HIVE-27777 CBO fails on multi insert overwrites with common group expression
- HIVE-24606 Multi-stage materialized CTEs can lose intermediate data
- HIVE-27494 Deduplicate the task result that generated by more branches in union all
- HIVE-26968 Wrong results when shared work optimizer merges TS operator with different DPP edges
- HIVE-25751 Ignore exceptions related to interruption when the limit is reached
- HIVE-25274 TestLimitOperator fails if default engine is Tez
- HIVE-24207 LimitOperator can leverage ObjectCache to bail out quickly
Patches backported in MR3 1.8
- HIVE-21923 Vectorized MapJoin may miss results when only the join key is selected
- HIVE-21288 Runtime rowcount calculation is incorrect in vectorized executions
- HIVE-18908 FULL OUTER JOIN to MapJoin
- HIVE-4605 Hive job fails while closing reducer output - Unable to rename
- HIVE-27344 ORC RecordReaderImpl throws NPE when close() is called from the constructor
- HIVE-27649 Support ORDER BY clause in subqueries with set operators
- HIVE-27303 Set correct output name to ReduceSink when there is a SMB join after Union
- HIVE-27437 Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch after processing
- HIVE-24485 Make the slow-start behavior tunable
- HIVE-25874 Slow filter evaluation of nest struct fields in vectorized executions
- HIVE-25960 Use RemoteIteratorWithFilter.HIDDEN_FILES_FULL_PATH_FILTER defined in org.apache.hadoop.hive.metastore.utils.FileUtils
- HIVE-25754 Fix column projection for union all queries with multiple aliases
- HIVE-25683 Close reader in AcidUtils.isRawFormatFile
- HIVE-26184 COLLECT_SET with GROUP BY is very slow when some keys are highly skewed
- HIVE-25794 CombineHiveRecordReader: log statements in a loop leads to memory pressure
- HIVE-25736 Close ORC readers
- HIVE-25685 HBaseStorageHandler: ensure that hbase properties are present in final JobConf for Tez
- HIVE-25577 unix_timestamp() is ignoring the time zone value
- HIVE-25549 Wrong results for window function with expression in PARTITION BY or ORDER BY clause
- HIVE-25449 datediff() gives wrong output when run in a tez task with some non-UTC timezone
- HIVE-25458 Unix_timestamp() with string input give wrong result
- HIVE-25403 Fix from_unixtime() to consider leap seconds
- HIVE-25058 PTF: TimestampValueBoundaryScanner can be optimised during range computation pt2 - isDistanceGreater
- HIVE-25299 Casting timestamp to numeric data types is incorrect for non-UTC timezones
- HIVE-25085 MetaStore Clients no longer shared across sessions
- HIVE-25093 date_format() UDF is returning output in UTC time zone only
- HIVE-25001 Improvement for some debug-logging guards
- HIVE-24746 PTF: TimestampValueBoundaryScanner can be optimised during range computation
- HIVE-24882 Compaction task reattempt fails with FileAlreadyExistsException for DeleteEventWriter
- HIVE-24858 UDFClassLoader leak in Configuration.CACHE_CLASSES
- HIVE-24808 Cache Parsed Dates
- HIVE-24693 Convert timestamps to zoned times without string operations
- HIVE-24353 Performance: do not throw exceptions when parsing Timestamp
- HIVE-24478 Subquery GroupBy with Distinct SemanticException: Invalid column reference
- HIVE-24691 Ban commons-logging
- HIVE-24660 Remove Commons Logger from jdbc-handler Package
- HIVE-24659 Remove Commons Logger from serde Package
- HIVE-24613 Support Values clause without Insert
- HIVE-24435 Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
- HIVE-24179 Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
- HIVE-24373 Wrong predicate is pushed down for view with constant value projection.
- HIVE-24236 Fixed possible Connection leaks in TxnHandler
- HIVE-24199 Incorrect result when subquey in exists contains limit
- HIVE-24036 Kryo Exception while serializing plan for getSplits UDF call
- HIVE-23837 Configure StorageHandlers if FileSinkOperator is child of MergeJoinWork
- HIVE-23265 Duplicate rowsets are returned with Limit and Offset set
- HIVE-22684 Run Eclipse Cleanup Against hbase-handler Module
- HIVE-23058 Compaction task reattempt fails with FileAlreadyExistsException
- HIVE-22566 Drop table involved in materialized view leaves the table in inconsistent state
- HIVE-23307 Cache ColumnIndex in HiveBaseResultSet
- HIVE-22707 MergeJoinWork should be considered while collecting DAG credentials
- HIVE-22614 Replace Base64 in hive-hbase-handler Package
- HIVE-22392 Hive JDBC Storage Handler: Support For Writing Data to JDBC Data Source
- HIVE-22227 Tez bucket pruning produces wrong result with shared work optimization
- HIVE-22107 Correlated subquery producing wrong schema
- HIVE-21862 ORC ppd produces wrong result with timestamp
- HIVE-22008 LIKE Operator should match multi-line input
- HIVE-15406 Consider vectorizing the new trunc function
- HIVE-21384 Upgrade to dbcp2 in JDBC storage handler
- HIVE-21253 Support DB2 in JDBC StorageHandler
- HIVE-20484 Disable Block Cache By Default With HBase SerDe
- HIVE-20955 Fix Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException
- HIVE-21026 Druid Vectorize Reader is not using the correct input size
- HIVE-20842 Fix logic introduced in HIVE-20660 to estimate statistics for group by
- HIVE-20660 Group by statistics estimation could be improved by bounding the total number of rows to source table
- HIVE-20684 Make compute stats work for Druid tables
Patches backported in MR3 1.7
- HIVE-20615 CachedStore: Background refresh thread bug fixes
- HIVE-21479 NPE during metastore cache update
- HIVE-27267 choose correct partition columns from bigTableRS
- HIVE-27138 Extend RSOp to compute filterTag if it has child MapJoinOp
- HIVE-27184 Add class name profiling option in ProfileServlet
- HIVE-23891 UNION ALL and multiple task attempts can cause file duplication
- HIVE-22173 Query with multiple lateral views hangs during compilation
- HIVE-26882 Allow transactional check of Table parameter before altering the Table
- HIVE-26683 Sum windowing function returns wrong value when all nulls
- HIVE-26779 UNION ALL throws SemanticException when trying to remove partition predicates: fail to find child from parent
- HIVE-27069 Incorrect results with bucket map join
- HIVE-26676 Count distinct in subquery returning wrong results
- HIVE-26235 OR Condition on binary column is returning empty result
- HIVE-25758 OOM due to recursive application of CBO rules
- HIVE-25909 Add test for ‘hive.default.nulls.last’ property for windows with ordering
- HIVE-25917 Use default value for ‘hive.default.nulls.last’ when config is not available
- HIVE-25864 Hive query optimisation creates wrong plan for predicate pushdown with windowing function
- HIVE-25822 Unexpected result rows in case of outer join contains conditions only affecting one side
- HIVE-24073 Execution exception in sort-merge semijoin
- HIVE-21935 Hive Vectorization : degraded performance with vectorize UDF
- HIVE-24827 Hive aggregation query returns incorrect results for non text files
- HIVE-14165 Remove unnecessary file listing from FetchOperator
- HIVE-24245 Vectorized PTF with count and distinct over partition producing incorrect results.
- HIVE-24113 NPE in GenericUDFToUnixTimeStamp
- HIVE-24293 Integer overflow in llap collision mask
- HIVE-24209 Incorrect search argument conversion for NOT BETWEEN operation when vectorization is enabled
- HIVE-24023 Hive parquet reader can’t read files with length=0
- HIVE-23873 Querying Hive JDBCStorageHandler table fails with NPE when CBO is off
- HIVE-23751 QTest: Override #mkdirs() method in ProxyFileSystem To Align After HADOOP-16582
- HIVE-23774 Reduce log level at aggrColStatsForPartitions in MetaStoreDirectSql.java
- HIVE-23738 DBLockManager::lock() : Move lock request to debug level
- HIVE-23706 Fix nulls first sorting behavior
- HIVE-23592 Routine makeIntPair is Not Correct
- HIVE-19653 Incorrect predicate pushdown for groupby with grouping sets
- HIVE-23435 Full outer join result is missing rows
- HIVE-22903 Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause
- HIVE-22840 Race condition in formatters of TimestampColumnVector and DateColumnVector
- HIVE-22898 CharsetDecoder race condition in OrcRecordUpdater
- HIVE-22629 Avoid Array resizing/recreation by using the construcor/ref instead of iteration/get_i
- HIVE-22321 Setting default nulls last does not take effect when order direction is specified
- HIVE-21338 Remove order by and limit for aggregates
- HIVE-20796 jdbc URL can contain sensitive information that should not be logged
- HIVE-20423 Set NULLS LAST as the default null ordering
- HIVE-20246 Configurable collecting stats by using DO_NOT_UPDATE_STATS table property
- HIVE-20331 Query with union all, lateral view and Join fails with “cannot find parent in the child operator”
- HIVE-20202 Add profiler endpoint to HS2 and LLAP
Patches backported in MR3 1.6
- HIVE-21935 Hive Vectorization : degraded performance with vectorize UDF
- HIVE-24113 NPE in GenericUDFToUnixTimeStamp
- HIVE-24293 Integer overflow in llap collision mask
- HIVE-23501 AOOB in VectorDeserializeRow when complex types are converted to primitive types
- HIVE-23688 fix Vectorization IndexArrayOutOfBoundsException when read null values in map
- HIVE-26292 GroupByOperator initialization does not clean state
- HIVE-26743 backport HIVE-24694: Early connection close to release server resources during creating
- HIVE-26532 Remove logger from critical path in VectorMapJoinInnerLongOperator::processBatch
- HIVE-25960 Fix S3a recursive listing logic
- HIVE-24391 Fix FIX TestOrcFile failures in branch-3.1
- HIVE-24316 Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
- HIVE-26447 Vectorization: wrong results when filter on repeating map key orc table
- HIVE-25446 Wrong exception thrown if capacity<=0
- HIVE-24849 Create external table socket timeout when location has large number of files
- HIVE-25505 Incorrect results with header. skip.header.line.count if first line is blank
- HIVE-18284 Fix NPE when inserting data with ‘distribute by’ clause with dynpart sort optimization
- HIVE-24381 Compressed text input returns 0 rows if skip header/footer is mentioned
- HIVE-23140 Optimise file move in CTAS
- HIVE-22814 ArrayIndexOutOfBound in the vectorization getDataTypePhysicalVariation
- HIVE-22805 Vectorization with conditional array or map is not implemented and throws an error
- HIVE-21341 Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high
- HIVE-22784 Boundary condition to check if there is nothing to truncate in StringExpr functions
- HIVE-22770 Skip interning of MapWork fields during deserialization
- HIVE-22733 After disable operation log property in hive, still HS2 saving the operation log
- HIVE-22713 Constant propagation shouldn’t be done for Join-Fil-RS structure
- HIVE-22720 Remove Log from HiveConf::getLogIdVar
- HIVE-22405 Add ColumnVector support for ProlepticCalendar
- HIVE-22033 HiveServer2: fix delegation token renewal
- HIVE-22631 Avoid deep copying partition list in listPartitionsByExpr
- HIVE-22548 Optimise Utilities.removeTempOrDuplicateFiles when moving files to final location
- HIVE-22632 Improve estimateRowSizeFromSchema
- HIVE-22599 Query results cache: 733 permissions check is not necessary
- HIVE-22625 Syntax Error in findPotentialCompactions SQL query for MySql/Postgres
- HIVE-22551 BytesColumnVector initBuffer should clean vector and length consistently
- HIVE-22523 The error handler in LlapRecordReader might block if its queue is full
- HIVE-22505 ClassCastException caused by wrong Vectorized operator selection
- HIVE-21917 COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs
- HIVE-22411 Performance degradation on single row inserts
- HIVE-22357 Schema mismatch between the Hive table definition and the ‘hive.sql.query’ parameter
- HIVE-21457 Perf optimizations in ORC split-generation
- HIVE-21390 BI split strategy does not work for blob stores
Patches backported to Hive 3.1.0 in MR3 1.5
- HIVE-24948 Enhancing performance of OrcInputFormat.getSplits with bucket pruning
- HIVE-20001 With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.
- HIVE-19748 Add appropriate null checks to DecimalColumnStatsAggregator
- HIVE-25665 Checkstyle LGPL files must not be in the release sources/binaries
- HIVE-25522 NullPointerException in TxnHandler
- HIVE-25547 Alter view as Select statement should create Authorizable events in HS2
- HIVE-25726 Upgrade velocity to 2.3 due to CVE-2020-13936
- HIVE-25839 Upgrade Log4j2 to 2.17.1 due to CVE-2021-44832
- HIVE-25600 Compaction job creates redundant base/delta folder within base/delta folder
- HIVE-24324 Remove deprecated API usage from Avro
- HIVE-24851 Fix reader leak in AvroGenericRecordReader
- HIVE-24797 Disable validate default values when parsing Avro schemas
- HIVE-24964 Backport HIVE-22453 to branch-3.1
- HIVE-24788 Backport HIVE-23338 to branch-3.1
- HIVE-23338 Bump jackson version to 2.10.0
- HIVE-24747 Backport HIVE-24569 to branch-3.1
- HIVE-24653 Race condition between compactor marker generation and get splits.
- HIVE-22981 DataFileReader is not closed in AvroGenericRecordReader#extractWriterTimezoneFromMetadata
- HIVE-24436 Fix Avro NULL_DEFAULT_VALUE compatibility issue
- HIVE-25277 fix slow partition deletion issue by removing duplicated isEmpty checks
- HIVE-25170 Fix wrong colExprMap generated by SemanticAnalyzer
- HIVE-24224 Fix skipping header/footer for Hive on Tez on compressed file
- HIVE-24093 Remove unused hive.debug.localtask
- HIVE-22412 StatsUtils throw NPE when explain
- HIVE-23509 Fixing MapJoin Capacity Assertion Error
- HIVE-23625 use html file extension for HS2 web UI query_page
- HIVE-22476 Hive datediff function provided inconsistent results when hive.fetch.task.conversion is set to none
- HIVE-22769 Incorrect query results and query failure during split generation for compressed text files
- HIVE-5312 Let HiveServer2 run simultaneously in HTTP (over thrift) and Binary (normal thrift transport) mode
- HIVE-22948 QueryCache: Treat query cache locations as temporary storage
- HIVE-22762 Leap day is incorrectly parsed during cast in Hive
- HIVE-22763 0 is accepted in 12-hour format during timestamp cast
- HIVE-22653 Remove commons-lang leftovers
- HIVE-22685 Fix TestHiveSqlDateTimeFormatter To Work With New Year 2020
- HIVE-22511 Fix case of Month token in datetime to string conversion
- HIVE-22422 Missing documentation from HiveSqlDateTimeFormatter: list of date-based patterns
- HIVE-21580 Introduce ISO 8601 week numbering SQL:2016 formats
- HIVE-21579 Introduce more complex SQL:2016 datetime formats
- HIVE-21578 Introduce SQL:2016 formats FM, FX, and nested strings
- HIVE-22945 Hive ACID Data Corruption: Update command mess the other column data and produces incorrect result
- HIVE-21660 Wrong result when union all and later view with explode is used
- HIVE-22891 Skip PartitionDesc Extraction In CombineHiveRecord For Non-LLAP Execution Mode
- HIVE-22815 reduce the unnecessary file system object creation in MROutput
- HIVE-22753 Fix gradual mem leak: Operationlog related appenders should be cleared up on errors
- HIVE-22400 UDF minute with time returns NULL
- HIVE-22700 Compactions may leak memory when unauthorized
- HIVE-22485 Cross product should set the conf in UnorderedPartitionedKVEdgeConfig
- HIVE-22532 PTFPPD may push limit incorrectly through Rank/DenseRank function
- HIVE-22507 KeyWrapper comparator create field comparator instances at every comparison
- HIVE-22435 Exception when using VectorTopNKeyOperator operator
- HIVE-22513 Constant propagation of casted column in filter ops can cause incorrect results
- HIVE-22464 Implement support for NULLS FIRST/LAST in TopNKeyOperator
- HIVE-22433 Hive JDBC Storage Handler: Incorrect results fetched from BOOLEAN and TIMESTAMP DataType From JDBC Data Source
- HIVE-22421 Improve Logging If Configuration File Not Found
- HIVE-22425 ReplChangeManager Not Debug Logging Database Name
- HIVE-22431 Hive JDBC Storage Handler: java.lang.ClassCastException on accessing TINYINT, SMALLINT Data Type From JDBC Data Source
- HIVE-22406 TRUNCATE TABLE fails due to MySQL limitations on limit value
- HIVE-22315 Support Decimal64 column division with decimal64 scalar
- HIVE-18415 Lower ‘Updating Partition Stats’ Logging Level
- HIVE-22398 Remove legacy code that can cause issue with new Yarn releases
- HIVE-22330 Maximize smallBuffer usage in BytesColumnVector
- HIVE-22360 MultiDelimitSerDe returns wrong results in last column when the loaded file has more columns than those in table schema
- HIVE-22391 NPE while checking Hive query results cache
- HIVE-22373 File Merge tasks fail when containers are reused
- HIVE-22336 Updates should be pushed to the Metastore backend DB before creating the notification event
- HIVE-22332 Hive should ensure valid schema evolution settings since ORC-540
- HIVE-21407 Parquet predicate pushdown is not working correctly for char column types
- HIVE-22331 unix_timestamp without argument returns timestamp in millisecond instead of second
- HIVE-14302 Tez: Optimized Hashtable can support DECIMAL keys of same precision
- HIVE-21924 Split text files even if header/footer exists
- HIVE-22270 Upgrade commons-io to 2.6
- HIVE-22278 Upgrade log4j to 2.12.1
- HIVE-22248 Fix statistics persisting issues
- HIVE-22275 OperationManager.queryIdOperation does not properly clean up multiple queryIds
- HIVE-22207 Tez SplitGenerator throws NumberFormatException when “dfs.blocksize” on cluster is “128m”
- HIVE-22273 Access check is failed when a temporary directory is removed
- HIVE-21987 Hive is unable to read Parquet int32 annotated with decimal
- HIVE-22208 Column name with reserved keyword is unescaped when query including join on table with mask column is re-written
- HIVE-22232 NPE when hive.order.columnalignment is set to false
- HIVE-22243 Align Apache Thrift version to 0.9.3-1 in standalone-metastore as well
- HIVE-22197 Common Merge join throwing class cast exception.
- HIVE-22079 Post order walker for iterating over expression tree
- HIVE-22231 Hive query with big size via knox fails with Broken pipe Write failed
- HIVE-22145 Avoid optimizations for analyze compute statistics
- HIVE-20113 Shuffle avoidance: Disable 1-1 edges for sorted shuffle
- HIVE-22219 Bringing a node manager down blocks restart of LLAP service
- HIVE-22201 ConvertJoinMapJoin#checkShuffleSizeForLargeTable throws ArrayIndexOutOfBoundsException if no big table is selected
- HIVE-22210 Vectorization may reuse computation output columns involved in filtering
- HIVE-22170 from_unixtime and unix_timestamp should use user session time zone
- HIVE-22059 hive-exec jar doesn’t contain (fasterxml) jackson library
- HIVE-22182 SemanticAnalyzer populates map which is not used at all
- HIVE-22169 Tez: SplitGenerator tries to look for plan files which won’t exist for Tez
- HIVE-22200 Hash collision may cause column resolution to fail
- HIVE-22055 select count gives incorrect result after loading data from text file
- HIVE-22204 Beeline option to show/not show execution report
- HIVE-15956 StackOverflowError when drop lots of partitions
- HIVE-22164 Vectorized Limit operator returns wrong number of results with offset
- HIVE-21397 BloomFilter for hive Managed [ACID] table does not work as expected
- HIVE-22178 Parquet FilterPredicate throws CastException after SchemaEvolution
- HIVE-22106 Remove cross-query synchronization for the partition-eval
- HIVE-22168 Remove very expensive logging from the llap cache hotpath
- HIVE-22099 Several date related UDFs can’t handle Julian dates properly since HIVE-20007
- HIVE-22161 UDF: FunctionRegistry synchronizes on org.apache.hadoop.hive.ql.udf.UDFType class
- HIVE-22151 Turn off hybrid grace hash join by default
- HIVE-22102 Reduce HMS call when creating HiveSession
- HIVE-22148 S3A delegation tokens are not added in the job config of the Compactor.
- HIVE-22121 Turning on hive.tez.bucket.pruning produce wrong results
- HIVE-22134 HIVE-22129: Remove glassfish.jersey and mssql-jdbc classes from jdbc-standalone jar
- HIVE-21698 TezSessionState#ensureLocalResources() causes IndexOutOfBoundsException while localizing resources
- HIVE-22132 Upgrade commons-lang3 version to 3.9
- HIVE-22114 insert query for partitioned insert only table failing when all buckets are empty
- HIVE-22092 Fetch is failing with IllegalArgumentException: No ValidTxnList when refetch is done.
- HIVE-22094 queries failing with ClassCastException: hive.ql.exec.vector.DecimalColumnVector cannot be cast to hive.ql.exec.vector.Decimal64ColumnVector
- HIVE-21241 Migrate TimeStamp Parser From Joda Time
- HIVE-16587 NPE when inserting complex types with nested null values
- HIVE-22080 Prevent implicit conversion from String/char/varchar to double/decimal
- HIVE-22040 Drop partition throws exception with ‘Failed to delete parent: File does not exist’ when the partition’s parent path does not exists
- HIVE-21970 Avoid using RegistryUtils.currentUser()
- HIVE-21828 Tez: Use a pre-parsed TezConfiguration from DagUtils - Addendum2
- HIVE-21828 Tez: Use a pre-parsed TezConfiguration from DagUtils - Addendum
- HIVE-21828 Tez: Use a pre-parsed TezConfiguration from DagUtils
- HIVE-22115 Prevent the creation of query routing appender if property is set to false
- HIVE-22120 Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions
- HIVE-22113 Prevent LLAP shutdown on AMReporter related RuntimeException
- HIVE-13457 Create HS2 REST API endpoints for monitoring information
- HIVE-22054 Avoid recursive listing to check if a directory is empty
- HIVE-22045 HIVE-21711 introduced regression in data load
- HIVE-21173 Upgrade Apache Thrift to 0.9.3-1
- HIVE-22009 CTLV with user specified location is not honoured.
- HIVE-21711 Regression caused by HIVE-21279 for blobstorage fs
- HIVE-21986 HiveServer Web UI: Setting the Strict-Transport-Security in default response header
- HIVE-21972 “show transactions” display the header twice
- HIVE-21973 SHOW LOCKS prints the headers twice
- HIVE-21224 Upgrade tests JUnit3 to JUnit4
- HIVE-21976 Offset should be null instead of zero in Calcite HiveSortLimit
- HIVE-21868 Vectorize CAST…FORMAT
- HIVE-21928 Fix for statistics annotation in nested AND expressions
- HIVE-21915 Hive with TEZ UNION ALL and UDTF results in data loss
- HIVE-19831 Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if database/table already exists
- HIVE-21927 HiveServer Web UI: Setting the HttpOnly option in the cookies
- HIVE-15177 Authentication with hive fails when kerberos auth type is set to fromSubject and principal contains _HOST
- HIVE-21905 Generics improvement around the FetchOperator class
- HIVE-14737 Problem accessing /logs in a Kerberized Hive Server 2 Web UI
- HIVE-21902 HiveServer2 UI: jetty response header needs X-Frame-Options
- HIVE-21746 ArrayIndexOutOfBoundsException during dynamically partitioned hash join, with CBO disabled
- HIVE-21576 Introduce CAST…FORMAT and limited list of SQL:2016 datetime formats
- HIVE-19661 switch Hive UDFs to use Re2J regex engine
- HIVE-21835 Unnecessary null checks in org.apache.hadoop.hive.ql.optimizer.StatsOptimizer
- HIVE-21815 Stats in ORC file are parsed twice
- HIVE-21799 NullPointerException in DynamicPartitionPruningOptimization, when join key is on aggregation column
- HIVE-21796 ArrayWritableObjectInspector.equals can take O(2^nesting_depth) time
- HIVE-21837 MapJoin is throwing exception when selected column is having completely null values
- HIVE-21742 Vectorization: CASE result type casting
- HIVE-21805 HiveServer2: Use the fast ShutdownHookManager APIs
- HIVE-21834 Avoid unnecessary calls to simplify filter conditions
- HIVE-21795 Rollup summary row might be missing when a mapjoin is happening on a partitioned table
- HIVE-21789 HiveFileFormatUtils.getRecordWriter is unnecessary
- HIVE-21768 JDBC: Strip the default union prefix for un-enclosed UNION queries
- HIVE-21686 ensure that memory allocator does not evict using brute force path.
- HIVE-21717 Rename is failing for directory in move task.
- HIVE-21681 Describe formatted shows incorrect information for multiple primary keys
- HIVE-21240 JSON SerDe Re-Write and Fixup timestamp parsing issue
- HIVE-21700 Hive incremental load going OOM while adding load task to the leaf nodes of the DAG.
- HIVE-21694 Hive driver wait time is fixed for task getting executed in parallel.
- HIVE-21685 Wrong simplification in query with multiple IN clauses
- HIVE-19353 Vectorization: ConstantVectorExpression –> RuntimeException: Unexpected column vector type LIST
- HIVE-21675 CREATE VIEW IF NOT EXISTS broken
- HIVE-21669 HS2 throws NPE when HiveStatement.getQueryId is invoked and query is closed concurrently
- HIVE-21061 CTAS query fails with IllegalStateException for empty source
- HIVE-21400 Vectorization: LazyBinarySerializeWrite allocates Field() within the loop
- HIVE-21531 Vectorization: all NULL hashcodes are not computed using Murmur3
- HIVE-21647 Disable TestReplAcidTablesWithJsonMessage and TestReplicationScenariosAcidTables
- HIVE-18702 INSERT OVERWRITE TABLE doesn’t clean the table directory before overwriting
- HIVE-21573 Binary transport shall ignore principal if auth is set to delegationToken
- HIVE-21372 Use Apache Commons IO To Read Stream To String
- HIVE-21509 LLAP may cache corrupted column vectors and return wrong query result
- HIVE-21386 Extend the fetch task enhancement done in HIVE-21279 to make it work with query result cache
- HIVE-21377 Using Oracle as HMS DB with DirectSQL
- HIVE-21499 should not remove the function from registry if create command failed with AlreadyExistsException
- HIVE-21518 GenericUDFOPNotEqualNS does not run in LLAP
- HIVE-21402 Compaction state remains ‘working’ when major compaction fails
- HIVE-21230 LEFT OUTER JOIN does not generate transitive IS NOT NULL filter on right side
- HIVE-21316 Comparison of varchar column and string literal should happen in varchar
- HIVE-21517 Fix AggregateStatsCache
- HIVE-21544 Constant propagation corrupts coalesce/case/when expressions during folding
- HIVE-21467 Remove deprecated junit.framework.Assert imports
- HIVE-21455 Too verbose logging in AvroGenericRecordReader
- HIVE-21478 Metastore cache update shall capture exception
- HIVE-21496 Automatic sizing of unordered buffer can overflow
- HIVE-21048 Remove needless org.mortbay.jetty from hadoop exclusions
- HIVE-21183 Interrupt wait time for FileCacheCleanupThread
- HIVE-21460 ACID: Load data followed by a select * query results in incorrect results
- HIVE-21468 Case sensitivity in identifier names for JDBC storage handler
- HIVE-16924 Support distinct in presence of Group By
- HIVE-21368 Vectorization: Unnecessary Decimal64 -> HiveDecimal conversion
- HIVE-21336 Creation of PCS_STATS_IDX fails Oracle when NLS_LENGTH_SEMANTICS=char
- HIVE-21421 HiveStatement.getQueryId throws NPE when query is not running
- HIVE-21371 Make NonSyncByteArrayOutputStream Overflow Conscious
- HIVE-21264 Improvements Around CharTypeInfo
- HIVE-21339 LLAP: Cache hit also initializes an FS object
- HIVE-20656 Sensible defaults: Map aggregation memory configs are too aggressive
- HIVE-19968 UDF exception is not thrown out
- HIVE-21294 Vectorization: 1-reducer Shuffle can skip the object hash functions
- HIVE-21182 Skip setting up hive scratch dir during planning
- HIVE-21279 Avoid moving/rename operation in FileSink op for SELECT queries
- HIVE-18920 CBO: Initialize the Janino providers ahead of 1st query
- HIVE-21363 Ldap auth issue: group filter match should be case insensitive
- HIVE-21270 A UDTF to show schema (column names and types) of given query
- HIVE-21329 Custom Tez runtime unordered output buffer size depending on operator pipeline
- HIVE-21297 Replace all occurrences of new Long, Boolean, Double etc with the corresponding .valueOf
- HIVE-21306 Upgrade HttpComponents to the latest versions similar to what Hadoop has done
- HIVE-21308 Negative forms of variables are not supported in HPL/SQL
- HIVE-21295 StorageHandler shall convert date to string using Hive convention
- HIVE-21296 Dropping varchar partition throws exception
- HIVE-18890 Lower Logging for “Table not found” Error
- HIVE-685 add UDFquote
- HIVE-21252 [Trivial] Use String.equals in LazyTimestamp
- HIVE-21228 Replace all occurrences of new Integer with Integer.valueOf
- HIVE-21223 CachedStore returns null partition when partition does not exist
- HIVE-21206 Bootstrap replication is slow as it opens lot of metastore connections
- HIVE-21009 Adding ability for user to set bind user
- HIVE-21199 Replace all occurrences of new Byte with Byte.valueOf
- HIVE-20295 Remove !isNumber check after failed constant interpretation
- HIVE-20894 Clean Up JDBC HiveQueryResultSet
- HIVE-21188 SemanticException for query on view with masked table
- HIVE-21171 Skip creating scratch dirs for tez if RPC is on
- HIVE-17020 Aggressive RS dedup can incorrectly remove OP tree branch
- HIVE-11708 Logical operators raises ClassCastExceptions with NULL
- HIVE-21134 Hive Build Version as UDF
- HIVE-21148 Remove Use StandardCharsets Where Possible
- HIVE-21138 Fix some of the alerts raised by lgtm.com
- HIVE-16907 “INSERT INTO” overwrite old data when destination table encapsulated by backquote
- HIVE-20419 Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
- HIVE-20879 Using null in a projection expression leads to CastException
- HIVE-21099 Do Not Print StackTraces to STDERR in ConditionalResolverMergeFiles
- HIVE-21107 “Cannot find field” error during dynamically partitioned hash join
- HIVE-20170 Improve JoinOperator “rows for join key” Logging
- HIVE-21124 HPL/SQL does not support the CREATE TABLE LIKE statement
- HIVE-21095 Show create table should not display a time zone for timestamp with local time zone
- HIVE-21104 PTF with nested structure throws ClassCastException
- HIVE-21113 For HPL/SQL that contains boolean expression with NOT, incorrect SQL may be generated
- HIVE-21082 In HPL/SQL, declare statement does not support variable of type character
- HIVE-20159 Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin
- HIVE-21085 Materialized views registry starts non-external tez session
- HIVE-21073 Remove Extra String Object
- HIVE-20160 Do Not Print StackTraces to STDERR in OperatorFactory
- HIVE-21033 Forgetting to close operation cuts off any more HiveServer2 output
- HIVE-20989 JDBC - The GetOperationStatus + log can block query progress via sleep()
- HIVE-20748 Disable materialized view rewriting when plan pattern is not allowed
- HIVE-21040 msck does unnecessary file listing at last level of directory tree
- HIVE-21041 NPE, ParseException in getting schema from logical plan
- HIVE-20785 Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method
- HIVE-21021 Scalar subquery with only aggregate in subquery (no group by) has unnecessary sq_count_check branch
- HIVE-21028 Adding a JDO fetch plan for getTableMeta get_table_meta to avoid race condition
- HIVE-20961 Retire NVL implementation
- HIVE-21005 LLAP: Reading more stripes per-split leaks ZlibCodecs
- HIVE-21018 Grouping/distinct on more than 64 columns should be possible
- HIVE-20979 Fix memory leak in hive streaming
- HIVE-21013 JdbcStorageHandler fail to find partition column in Oracle
- HIVE-20985 If select operator inputs are temporary columns vectorization may reuse some of them as output
- HIVE-20827 Inconsistent results for empty arrays
- HIVE-20953 Remove a function from function registry when it can not be added to the metastore when creating it.
- HIVE-20981 streaming/AbstractRecordWriter leaks HeapMemoryMonitor
- HIVE-18902 Lower Logging Level for Cleaning Up “local RawStore”
- HIVE-19403 Demote ‘Pattern’ Logging
- HIVE-19846 Removed Deprecated Calls From FileUtils-getJarFilesByPath
- HIVE-20161 Do Not Print StackTraces to STDERR in ParseDriver
- HIVE-20239 Do Not Print StackTraces to STDERR in MapJoinProcessor
- HIVE-20831 Add Session ID to Operation Logging
- HIVE-20978 “hive.jdbc.*” should add to sqlStdAuthSafeVarNameRegexes
- HIVE-20976 JDBC queries containing joins gives wrong results
- HIVE-20873 Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
- HIVE-20930 VectorCoalesce in FILTER mode doesn’t take effect
- HIVE-20940 Bridge cases in which Calcite’s type resolution is stricter than Hive.
- HIVE-20949 Improve PKFK cardinality estimation in physical planning
- HIVE-20952 Cleaning VectorizationContext.java
- HIVE-20818 Views created with a WHERE subquery will regard views referenced in the subquery as direct input
- HIVE-20937 Postgres jdbc query fail with “LIMIT must not be negative”
- HIVE-14557 Nullpointer When both SkewJoin and Mapjoin Enabled
- HIVE-20918 Flag to enable/disable pushdown of computation from Calcite into JDBC connection
- HIVE-20888 TxnHandler: sort() called on immutable lists
- HIVE-20910 Insert in bucketed table fails due to dynamic partition sort optimization
- HIVE-20905 querying streaming table fails with out of memory exception
- HIVE-20676 HiveServer2: PrivilegeSynchronizer is not set to daemon status
- HIVE-19701 getDelegationTokenFromMetaStore doesn’t need to be synchronized
- HIVE-20682 Async query execution can potentially fail if shared sessionHive is closed by master thread
- HIVE-20893 Fix thread safety issue for bloomK probing filter
- HIVE-20881 Constant propagation oversimplifies projections
- HIVE-20886 Fix NPE: GenericUDFLower
- HIVE-20813 udf to_epoch_milli need to support timestamp without time zone as well.
- HIVE-16839 Unbalanced calls to openTransaction/commitTransaction when alter the same partition concurrently
- HIVE-20868 SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor
- HIVE-20858 Serializer is not correctly initialized with configuration in Utilities.createEmptyBuckets
- HIVE-20486 Vectorization support for Kafka Storage Handler
- HIVE-20839 “Cannot find field” error during dynamically partitioned hash join
- HIVE-20796 jdbc URL can contain sensitive information that should not be logged
- HIVE-20805 Hive does not copy source data when importing as non-hive user
- HIVE-20817 Reading Timestamp datatype via HiveServer2 gives errors
- HIVE-20834 Hive QueryResultCache entries keeping reference to SemanticAnalyzer from cached query
- HIVE-20815 JdbcRecordReader.next shall not eat exception
- HIVE-20821 Rewrite SUM0 into SUM + COALESCE combination
- HIVE-20617 Fix type of constants in IN expressions to have correct type
- HIVE-20830 JdbcStorageHandler range query assertion failure in some cases
- HIVE-20829 JdbcStorageHandler range split throws NPE
- HIVE-20820 MV partition on clause position
- HIVE-20792 Inserting timestamp with zones truncates the data
- HIVE-20638 Upgrade version of Jetty to 9.3.25.v20180904
- HIVE-14516 OrcInputFormat.SplitGenerator.callInternal() can be optimized
- HIVE-18876 Remove Superfluous Logging in Driver
- HIVE-20490 UDAF: Add an ‘approx_distinct’ to Hive
- HIVE-20768 Adding Tumbling Window UDF
- HIVE-20763 Add google cloud storage (gs) to the exim uri schema whitelist
- HIVE-20762 NOTIFICATION_LOG cleanup interval is hardcoded as 60s and is too small
- HIVE-20477 OptimizedSql is not shown if the expression contains INs
- HIVE-20720 Add partition column option to JDBC handler
- HIVE-20735 Adding Support for Kerberos Auth, Removed start/end offset columns, remove the best effort mode and made 2pc default for EOS
- HIVE-20761 Select for update on notification_sequence table has retry interval and retries count too small
- HIVE-20731 keystore file in JdbcStorageHandler should be authorized (Add missing file)
- HIVE-20731 keystore file in JdbcStorageHandler should be authorized
- HIVE-20509 Plan: fix wasted memory in plans with large partition counts
- HIVE-20714 SHOW tblproperties for a single property returns the value in the name column
- HIVE-20649 LLAP aware memory manager for Orc writers
- HIVE-20696 msck_*.q tests are broken
- HIVE-20702 Account for overhead from datastructure aware estimations during mapjoin selection
- HIVE-20704 Extend HivePreFilteringRule to support other functions
- HIVE-20385 Date: date + int fails to add days
- HIVE-20644 Avoid exposing sensitive information through a Hive Runtime exception
- HIVE-20712 HivePointLookupOptimizer should extract deep cases
- HIVE-20705 Vectorization: Native Vector MapJoin doesn’t support Complex Big Table values
- HIVE-20678 HiveHBaseTableOutputFormat should implement HiveOutputFormat to ensure compatibility
- HIVE-20710 Constant folding may not create null constants without types
- HIVE-20639 Add ability to Write Data from Hive Table/Query to Kafka Topic
- HIVE-20648 LLAP: Vector group by operator should use memory per executor
- HIVE-20711 Race Condition when Multi-Threading in SessionState.createRootHDFSDir
- HIVE-20692 Enable folding of NOT x IS (NOT) [TRUE|FALSE] expressions
- HIVE-20623 Shared work: Extend sharing of map-join cache entries in LLAP
- HIVE-20651 JdbcStorageHandler password should be encrypted
- HIVE-14431 Recognize COALESCE as CASE
- HIVE-20646 Partition filter condition is not pushed down to metastore query if it has IS NOT NULL
- HIVE-20652 JdbcStorageHandler push join of two different datasource to jdbc driver
- HIVE-20563 Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type are different
- HIVE-20544 TOpenSessionReq logs password and username
- HIVE-20691 Fix org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cttl]
- HIVE-20338 LLAP: Force synthetic file-id for filesystems which have HDFS protocol impls with POSIX mutation semantics
- HIVE-20657 pre-allocate LLAP cache at init time
- HIVE-20609 Create SSD cache dir if it doesn’t exist already
- HIVE-20618 During join selection BucketMapJoin might be chosen for non bucketed tables
- HIVE-10296 Cast exception observed when hive runs a multi join query on metastore (postgres), since postgres pushes the filter into the join, and ignores the condition before applying cast
- HIVE-20619 Include MultiDelimitSerDe in HiveServer2 By Default
- HIVE-20637 Allow any udfs with 0 arguments or with constant arguments as part of default clause
- HIVE-20552 Get Schema from LogicalPlan faster
- HIVE-20627 Concurrent async queries intermittently fails with LockException and cause memory leak
- HIVE-19302 Logging Too Verbose For TableNotFound
- HIVE-20540 Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II
- HIVE-18871 hive on tez execution error due to set hive.aux.jars.path to hdfs://
- HIVE-20603 “Wrong FS” error when inserting to partition after changing table location filesystem
- HIVE-20620 manifest collisions when inserting into bucketed sorted MM tables with dynamic partitioning
- HIVE-20625 Regex patterns not working in SHOW MATERIALIZED VIEWS ‘’
- HIVE-20095 Fix feature to push computation to jdbc external tables
- HIVE-20621 GetOperationStatus called in resultset.next causing incremental slowness
- HIVE-20568 There is no need to convert the dbname to pattern while pulling tablemeta
- HIVE-20507 Beeline: Add a utility command to retrieve all uris from beeline-site.xml
- HIVE-20498 Support date type for column stats autogather
- HIVE-20536 Add Surrogate Keys function to Hive
- HIVE-20570 Fix plan for query with hive.optimize.union.remove set to true
- HIVE-20583 Use canonical hostname only for kerberos auth in HiveConnection
- HIVE-20561 Use the position of the Kafka Consumer to track progress instead of Consumer Records offsets
- HIVE-20494 GenericUDFRestrictInformationSchema is broken after HIVE-19440
- HIVE-20163 Simplify StringSubstrColStart Initialization
- HIVE-20558 Change default of hive.hashtable.key.count.adjustment to 0.99
- HIVE-20462 “CREATE VIEW IF NOT EXISTS” fails if view already exists
- HIVE-20524 Schema Evolution checking is broken in going from Hive version 2 to version 3 for ALTER TABLE VARCHAR to DECIMAL
- HIVE-20537 Multi-column joins estimates with uncorrelated columns different in CBO and Hive
- HIVE-20541 REPL DUMP on external table with add partition event throws NoSuchElementException
- HIVE-20503 Use datastructure aware estimations during mapjoin selection
- HIVE-20412 NPE in HiveMetaHook
- HIVE-20471 issues getting the default database path
- HIVE-18038 org.apache.hadoop.hive.ql.session.OperationLog - Review
- HIVE-20296 Improve HivePointLookupOptimizerRule to be able to extract from more sophisticated contexts
- HIVE-17921 Aggregation with struct in LLAP produces wrong result
- HIVE-20020 Hive contrib jar should not be in lib
- HIVE-20481 Add the Kafka Key record as part of the row
- HIVE-20514 Query with outer join filter is failing with dynamic partition join
- HIVE-20526 Add test case for HIVE-20489
- HIVE-20489 Recursive calls to intern path strings causes parse to hang
- HIVE-20502 Fix NPE while running skewjoin_mapjoin10.q when column stats is used.
- HIVE-20513 Vectorization: Improve Fast Vector MapJoin Bytes Hash Tables
- HIVE-20508 Hive does not support user names of type “user@realm”
- HIVE-20522 HiveFilterSetOpTransposeRule may throw assertion error due to nullability of fields
- HIVE-20510 Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer
- HIVE-20515 Empty query results when using results cache and query temp dir, results cache dir in different filesystems
- HIVE-20499 GetTablesOperation pull all the tables meta irrespective of auth.
- HIVE-18725 Improve error handling for subqueries if there is wrong column reference
- HIVE-20236 Clean up printStackTrace() in DDLTask
- HIVE-15932 Add support for: “explain ast”
- HIVE-20377 Hive Kafka Storage Handler
- HIVE-19662 Upgrade Avro to 1.8.2
- HIVE-19993 Using a table alias which also appears as a column name is not possible
- HIVE-20432 Rewrite BETWEEN to IN for integer types for stats estimation
- HIVE-20437 Handle schema evolution from Float, Double and Decimal.
- HIVE-20491 Fix mapjoin size estimations for Fast implementation
- HIVE-20395 Parallelize files move in ‘replaceFiles’ method.
- HIVE-20476 CopyUtils used by REPL LOAD and EXPORT/IMPORT operations ignore distcp error
- HIVE-20496 Vectorization: Vectorized PTF IllegalStateException
- HIVE-20466 Improve org.apache.hadoop.hive.ql.exec.FunctionTask Experience
- HIVE-20225 SerDe to support Teradata Binary Format
- HIVE-20433 Implicit String to Timestamp conversion is slow
- HIVE-20465 ProxyFileSystem.listStatusIterator function override required once migrated to Hadoop 3.2.0+
- HIVE-20467 Allow IF NOT EXISTS/IF EXISTS in Resource plan creation/drop
- HIVE-20439 addendum
- HIVE-20439 Use the inflated memory limit during join selection for llap
- HIVE-20013 Add an Implicit cast to date type for to_date function
- HIVE-20187 Incorrect query results in hive when hive.convert.join.bucket.mapjoin.tez is set to true
- HIVE-20455 Log spew from security.authorization.PrivilegeSynchronizer.run
- HIVE-20315 Vectorization: Fix more NULL / Wrong Results issues and avoid unnecessary casts/conversions
- HIVE-20339 Vectorization: Lift unneeded restriction causing some PTF with RANK not to be vectorized
- HIVE-20258 Should Synchronize getInstance in ReplChangeManager
- HIVE-20352 Vectorization: Support grouping function
- HIVE-20367 Vectorization: Support streaming for PTF AVG, MAX, MIN, SUM
- HIVE-20399 CTAS w/a custom table location that is not fully qualified fails for MM tables
- HIVE-20443 txn stats cleanup in compaction txn handler is unneeded
- HIVE-20418 LLAP IO may not handle ORC files that have row index disabled correctly for queries with no columns selected
- HIVE-20409 Hive ACID: Update/delete/merge does not clean hdfs staging directory
- HIVE-20246 Configurable collecting stats by using DO_NOT_UPDATE_STATS table property
- HIVE-20237 Do Not Print StackTraces to STDERR in HiveMetaStore
- HIVE-20366 TPC-DS query78 stats estimates are off for is null filter
- HIVE-17979 Tez: Improve ReduceRecordSource passDownKey copying
- HIVE-20406 Addendum patch
- HIVE-20406 Nested Coalesce giving incorrect results
- HIVE-20368 Remove VectorTopNKeyOperator lock
- HIVE-20410 aborted Insert Overwrite on transactional table causes “Not enough history available for…” error
- HIVE-20400 create table should always use a fully qualified path to avoid potential FS ambiguity
- HIVE-20321 Vectorization: Cut down memory size of 1 col VectorHashKeyWrapper to <1 CacheLine
- HIVE-19254 NumberFormatException in MetaStoreUtils.isFastStatsSame
- HIVE-20391 HiveAggregateReduceFunctionsRule may infer wrong return type when decomposing aggregate function
- HIVE-14898 HS2 shouldn’t log callstack for an empty auth header error
- HIVE-20389 NPE in SessionStateUserAuthenticator when authenticator=SessionStateUserAuthenticator
- HIVE-18620 Improve error message while dropping a table that is part of a materialized view
- HIVE-20345 Drop database may hang if the tables get deleted from a different call
- HIVE-20379 Rewriting with partitioned materialized views may reference wrong column
- HIVE-19316 StatsTask fails due to ClassCastException
- HIVE-20329 Long running repl load (incr/bootstrap) causing OOM error
- HIVE-19924 Tag distcp jobs run by Repl Load
- HIVE-20354 Semijoin hints don’t work with merge statements
- HIVE-20350 Unnecessary value assignment
- HIVE-20340 Druid Needs Explicit CASTs from Timestamp to STRING when the output of timestamp function is used as String
- HIVE-20344 PrivilegeSynchronizer for SBA might hit AccessControlException
- HIVE-20316 Skip external table file listing for create table event
- HIVE-20336 Masking and filtering policies for materialized views
- HIVE-20337 CachedStore: getPartitionsByExpr is not populating the partition list correctly
- HIVE-20279 HiveContextAwareRecordReader slows down Druid Scan queries.
- HIVE-20136 Code Review of ArchiveUtils Class
- HIVE-20335 Add tests for materialized view rewriting with composite aggregation functions
- HIVE-20326 Create constraints with RELY as default instead of NO RELY
- HIVE-19408 Improve show materialized views statement to show more information about invalidation
- HIVE-20118 SessionStateUserAuthenticator.getGroupNames() is always empty
- HIVE-20290 Lazy initialize ArrowColumnarBatchSerDe so it doesn’t allocate buffers during GetSplits
- HIVE-20278 Druid Scan Query avoid copying from List -> Map -> List
- HIVE-20277 Vectorization: Case expressions that return BOOLEAN are not supported for FILTER
- HIVE-19937 Intern fields in MapWork on deserialization
- HIVE-19097 related equals and in operators may cause inaccurate stats estimations
- HIVE-20162 Logging cleanup, avoid printing stacktraces to stderr
- HIVE-20314 Include partition pruning in materialized view rewriting
- HIVE-20301 Enable vectorization for materialized view rewriting tests
- HIVE-20302 LLAP: non-vectorized execution in IO ignores virtual columns, including ROW__ID
- HIVE-20294 Vectorization: Fix NULL / Wrong Results issues in COALESCE / ELT
- HIVE-20274 HiveServer2 ObjectInspectorFactory leaks for Struct and List object inspectors
- HIVE-20166 LazyBinaryStruct Warn Level Logging
- HIVE-20281 SharedWorkOptimizer fails with ‘operator cache contents and actual plan differ’
- HIVE-20169 Print Final Rows Processed in MapOperator
- HIVE-20239 Do Not Print StackTraces to STDERR in MapJoinProcessor
- HIVE-20260 NDV of a column shouldn’t be scaled when row count is changed by filter on another column
- HIVE-14493 Partitioning support for materialized views
- HIVE-18201 Disable XPROD_EDGE for sq_count_check() created for scalar subqueries
- HIVE-20130 Better logging for information schema synchronizer
- HIVE-20244 forward port HIVE-19704 to master
- HIVE-20101 BloomKFilter: Avoid using the local byte[] arrays entirely
- HIVE-19199 ACID: DbTxnManager heartbeat-service needs static sync init
- HIVE-20040 JDBC: HTTP listen queue is 50 and SYNs are lost
- HIVE-20177 Vectorization: Reduce KeyWrapper allocation in GroupBy Streaming mode
- HIVE-19694 Create Materialized View statement should check for MV name conflicts before running MV’s SQL statement.
- HIVE-20245 Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
- HIVE-20210 Simple Fetch optimizer should lead to MapReduce when filter on non-partition column and conversion is minimal
- HIVE-20247 cleanup issues in LLAP IO after cache OOM
- HIVE-20249 LLAP IO: NPE during refCount decrement
- HIVE-19809 Remove Deprecated Code From Utilities Class
- HIVE-20168 ReduceSinkOperator Logging Hidden
- HIVE-20263 Typo in HiveReduceExpressionsWithStatsRule variable
- HIVE-20209 Metastore connection fails for first attempt in repl dump
- HIVE-20221 Increase column width for partition_params
- HIVE-19770 Support for CBO for queries with multiple same columns in select
- HIVE-19181 Remove BreakableService (unused class)
- HIVE-20035 write booleans as long when serializing to druid
- HIVE-20105 Druid-Hive: tpcds query on timestamp throws java.lang.IllegalArgumentException: Cannot create timestamp, parsing error
- HIVE-18729 Druid Time column type
- HIVE-20213 Upgrade Calcite to 1.17.0
- HIVE-20212 Hiveserver2 in http mode emitting metric default.General.open_connections incorrectly
- HIVE-20153 Count and Sum UDF consume more memory in Hive 2+
- HIVE-20240 Semijoin Reduction : Use local variable to check for external table condition
- HIVE-20156 Avoid printing exception stacktrace to STDERR
- HIVE-20228 configure repl configuration directories based on user running hiveserver2
- HIVE-18929 The method humanReadableInt in HiveStringUtils.java has a race condition.
- HIVE-20158 Do Not Print StackTraces to STDERR in Base64TextOutputFormat
- HIVE-20242 Query results cache: Improve ability of queries to use pending query results
- HIVE-20203 Arrow SerDe leaks a DirectByteBuffer
- HIVE-20015 Populate ArrayList with Constructor
- HIVE-20207 Vectorization: Fix NULL / Wrong Results issues in Filter / Compare
- HIVE-18852 Misleading error message in alter table validation
- HIVE-19935 Hive WM session killed: Failed to update LLAP tasks count
- HIVE-20082 HiveDecimal to string conversion doesn’t format the decimal correctly
- HIVE-20164 Murmur Hash : Make sure CTAS and IAS use correct bucketing version
- HIVE-19891 inserting into external tables with custom partition directories may cause data loss
- HIVE-17683 Add explain locks command
- HIVE-20192 HS2 with embedded metastore is leaking JDOPersistenceManager objects
- HIVE-20204 Type conversion during IN () comparisons is using different rules from other comparison operations
- HIVE-20218 make sure Statement.executeUpdate() returns number of rows affected
- HIVE-20127 fix some issues with LLAP Parquet cache
- HIVE-4367 enhance TRUNCATE syntax to drop data of external table
- HIVE-17896 (addendum) order3.q fails with NullPointerException if hive.cbo.enable=false and hive.optimize.topnkey=true
- HIVE-17896 TopNKey: Create a standalone vectorizable TopNKey operator
- HIVE-20149 TestHiveCli failing/timing out
- HIVE-19360 CBO: Add an “optimizedSQL” to QueryPlan object
- HIVE-20201 Hive HBaseHandler code should not use deprecated Base64 implementation
- HIVE-20120 Incremental repl load DAG generation is causing OOM error
- HIVE-20183 Inserting from bucketed table can cause data loss, if the source table contains empty bucket
- HIVE-20197 Vectorization: Add DECIMAL_64 testing, add Date/Interval/Timestamp arithmetic, and add more GROUP BY Aggregation tests
- HIVE-20172 StatsUpdater failed with GSS Exception while trying to connect to remote metastore
- HIVE-20165 Enable ZLIB for streaming ingest
- HIVE-20116 TezTask is using parent logger
- HIVE-20152 reset db state, when repl dump fails, so rename table can be done
- HIVE-19668 Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken’s and duplicate strings
- HIVE-19940 Push predicates with deterministic UDFs with RBO
- HIVE-20174 Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation Functions
- HIVE-19992 Vectorization: Follow-on to HIVE-19951 –> add call to SchemaEvolution.isOnlyImplicitConversion to disable encoded LLAP I/O for ORC only when data type conversion is not implicit
- HIVE-20185 Backport HIVE-20111 to branch-3
- HIVE-20147 Hive streaming ingest is contended on synchronized logging
- HIVE-20041 ResultsCache: Improve logging for concurrent queries
- HIVE-20091 Tez: Add security credentials for FileSinkOperator output
- HIVE-19375 Bad message: 'transactional'='false' is no longer a valid property and will be ignored
- HIVE-20019 Ban commons-logging and log4j
- HIVE-20088 Beeline config location path is assembled incorrectly
- HIVE-15974 Support real, double precision and numeric data types
- HIVE-20069 Fix reoptimization in case of DPP and Semijoin optimization
- HIVE-19387 Truncate table for Acid tables conflicts with ResultSet cache
- HIVE-20093 LlapOutputFormatService: Use ArrowBuf with Netty for Accounting
- HIVE-20129 Revert to position based schema evolution for orc tables
- HIVE-19765 Add Parquet specific tests to BlobstoreCliDriver
- HIVE-18545 Add UDF to parse complex types from json
- HIVE-20100 OpTraits : Select Optraits should stop when a mismatch is detected
- HIVE-20043 HiveServer2: SessionState has a static sync block around an AtomicBoolean
- HIVE-20099 Fix logger for LlapServlet
- HIVE-20184 Backport HIVE-20085 to branch-3
- HIVE-20098 Statistics: NPE when getting Date column partition statistics
- HIVE-20182 Backport HIVE-20067 to branch-3
- HIVE-19850 Dynamic partition pruning in Tez is leading to ‘No work found for tablescan’ error
- HIVE-20066 hive.load.data.owner is compared to full principal
- HIVE-20039 Bucket pruning: Left Outer Join on bucketed table gives wrong result
- HIVE-19860 HiveServer2 ObjectInspectorFactory memory leak with cachedUnionStructObjectInspector
- HIVE-19326 stats auto gather: incorrect aggregation during UNION queries
- HIVE-20051 Skip authorization for temp tables
- HIVE-17840 HiveMetaStore eats exception if transactionalListeners.notifyEvent fail
- HIVE-20059 Hive streaming should try shade prefix unconditionally on exception
- HIVE-19951 Vectorization: Need to disable encoded LLAP I/O for ORC when there is data type conversion
- HIVE-19812 Disable external table replication by default via a configuration property
- HIVE-20008 Fix second compilation errors in ql
- HIVE-20038 Update queries on non-bucketed + partitioned tables throws NPE
- HIVE-19995 Aggregate row traffic for acid tables
- HIVE-19970 Replication dump has a NPE when table is empty
- HIVE-20028 Metastore client cache config is used incorrectly
- HIVE-20004 Wrong scale used by ConvertDecimal64ToDecimal results in incorrect results
- HIVE-20009 Fix runtime stats for merge statement
- HIVE-19989 Metastore uses wrong application name for HADOOP2 metrics
- HIVE-19967 SMB Join : Need Optraits for PTFOperator ala GBY Op
- HIVE-20011 Move away from append mode in proto logging hook
- HIVE-18786 NPE in Hive windowing functions
- HIVE-19404 Revise DDL Task Result Logging
- HIVE-19829 Incremental replication load should create tasks in execution phase rather than semantic phase
- HIVE-18140 Partitioned tables statistics can go wrong in basic stats mixed case
- HIVE-19981 Managed tables converted to external tables by the HiveStrictManagedMigration utility should be set to delete data when the table is dropped
- HIVE-19888 Misleading “METASTORE_FILTER_HOOK will be ignored” warning from SessionState
- HIVE-19948 HiveCli is not splitting the command by semicolon properly if quotes are inside the string
- HIVE-19564 Vectorization: Fix NULL / Wrong Results issues in Arithmetic
- HIVE-19783 Retrieve only locations in HiveMetaStore.dropPartitionsAndGetLocations
- HIVE-19870 HCatalog dynamic partition query can fail, if the table path is managed by Sentry
- HIVE-19718 Adding partitions in bulk also fetches table for each partition
- HIVE-19866 improve LLAP cache purge
- HIVE-19663 refactor LLAP IO report generation
- HIVE-19203 Thread-Safety Issue in HiveMetaStore
- HIVE-16505 Support “unknown” boolean truth value
- HIVE-19759 Flaky test: TestRpc#testServerPort
- HIVE-19432 GetTablesOperation is too slow if the hive has too many databases and tables
- HIVE-19524 pom.xml typo: “commmons-logging” groupId
- HIVE-6980 Drop table by using direct sql
- HIVE-19579 remove HBase transitive dependency that drags in some snapshot
- HIVE-19609 pointless callstacks in the logs as usual
- HIVE-19424 NPE In MetaDataFormatters
- HIVE-18906 Lower Logging for “Using direct SQL”
- HIVE-19041 Thrift deserialization of Partition objects should intern fields
- HIVE-18881 Lower Logging for FSStatsAggregator
- HIVE-18903 Lower Logging Level for ObjectStore
- HIVE-18880 Change Log to Debug in CombineHiveInputFormat
- HIVE-19285 Add logs to the subclasses of MetaDataOperation
- HIVE-18986 Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns
- HIVE-19204 Detailed errors from some tasks are not displayed to the client because the tasks don’t set exception when they fail
- HIVE-19104 When test MetaStore is started with retry the instances should be independent
- HIVE-16861 MapredParquetOutputFormat - Save Some Array Allocations
- HIVE-18827 useless dynamic value exceptions strike back
- HIVE-19133 HS2 WebUI phase-wise performance metrics not showing correctly
- HIVE-19263 Improve ugly exception handling in HiveMetaStore
- HIVE-19265 Potential NPE and hiding actual exception in Hive#copyFiles
- HIVE-19158 Fix NPE in the HiveMetastore add partition tests
- HIVE-24331 Add Jenkinsfile for branch-3.1 (#1626)
- HIVE-19170 Fix TestMiniDruidKafkaCliDriver – addendum patch
- HIVE-19170 Fix TestMiniDruidKafkaCliDriver
- HIVE-23323 Add qsplits profile
- HIVE-23044 Make sure Cleaner doesn’t delete delta directories for running queries
- HIVE-23088 Using Strings from log4j breaks non-log4j users
- HIVE-22704 Distribution package incorrectly ships the upgrade.order files from the metastore module
- HIVE-22708 Fix for HttpTransport to replace String.equals
- HIVE-22407 Hive metastore upgrade scripts have incorrect (or outdated) comment syntax
- HIVE-22241 Implement UDF to interpret date/timestamp using its internal representation and Gregorian-Julian hybrid calendar
- HIVE-21508 ClassCastException when initializing HiveMetaStoreClient on JDK10 or newer
- HIVE-19667 Remove distribution management tag from pom.xml
- HIVE-21980 Parsing time can be high in case of deeply nested subqueries
- HIVE-22105 Update ORC to 1.5.6 in branch-3
- HIVE-20057 For ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'); TBL_TYPE attribute change not reflecting for non-CAPS
- HIVE-21872 Bucketed tables that load data from data/files/auto_sortmerge_join should be tagged as 'bucketing_version'='1'
- HIVE-18874 JDBC: HiveConnection shades log4j interfaces
- HIVE-21821 Backport HIVE-21739 to branch-3.1
- HIVE-21786 Update repo URLs in poms - branch 3.1 version
- HIVE-21755 Backport HIVE-21462 to branch-3 Upgrading SQL server backed metastore when changing data type of a column with constraints
- HIVE-21758 DBInstall tests broken on master and branch-3.1
- HIVE-21291 Restore historical way of handling timestamps in Avro while keeping the new semantics at the same time
- HIVE-21564 Load data into a bucketed table is ignoring partitions specs and loads data into default partition
- HIVE-20593 Load Data for partitioned ACID tables fails with bucketId out of range: -1
- HIVE-21600 GenTezUtils.removeSemiJoinOperator may throw out of bounds exception for TS with multiple children
- HIVE-21613 Queries with join condition having timestamp or timestamp with local time zone literal throw SemanticException
- HIVE-18624 Parsing time is extremely high (~10 min) for queries with complex select expressions
- HIVE-21540 Query with join condition having date literal throws SemanticException
- HIVE-21342 Analyze compute stats for column leave behind staging dir on hdfs
- HIVE-21290 Restore historical way of handling timestamps in Parquet while keeping the new semantics at the same time
- HIVE-20126 OrcInputFormat does not pass conf to orc reader options
- HIVE-21376 Incompatible change in Hive bucket computation
- HIVE-21236 SharedWorkOptimizer should check table properties
- HIVE-21156 SharedWorkOptimizer may preserve filter in TS incorrectly
- HIVE-21039 CURRENT_TIMESTAMP returns value in UTC time zone
- HIVE-20010 Fix create view over literals
- HIVE-20420 Provide a fallback authorizer when no other authorizer is in use
- HIVE-18767 Some alterPartitions invocations throw ‘NumberFormatException: null’
- HIVE-18778 Needs to capture input/output entities in explain
- HIVE-20555 HiveServer2: Preauthenticated subject for http transport is not retained for entire duration of http communication in some cases
- HIVE-20227 Exclude glassfish javax.el dependency
- HIVE-19027 Make materializations invalidation cache work with multiple active remote metastores
- HIVE-20102 Add a couple of additional tests for query parsing
- HIVE-20123 Fix masking tests after HIVE-19617
- HIVE-20076 ACID: Fix Synthetic ROW__ID generation for vectorized orc readers
- HIVE-20135 Fix incompatible change in TimestampColumnVector to default to UTC