Release 1.12: 2024-11-13

MR3

  • Implement delay scheduling. Enabling delay scheduling with the configuration key mr3.taskattempt.queue.scheme.use.delay is recommended when using LLAP I/O.
  • mr3.am.session.share.dag.client.rpc specifies whether or not to create a new DAGClientRPC object for each DAG (in session mode).
  • Introduce mr3.yarn.priority to specify the priority of the MR3 Yarn application.
  • Support fault tolerance when using Celeborn 0.5.1.

Hive on MR3

  • Support Hive 4.0.1.

Release 1.11: 2024-7-21

MR3

  • Introduce mr3.dag.timeout.kill.threshold.secs and mr3.dag.timeout.kill.check.ms for checking DAG timeout.
  • mr3.daemon.task.message.buffer.size specifies the message queue size for DaemonTasks.

Hive on MR3

  • The cache hit ratio of LLAP I/O is usually higher and more stable because LLAP I/O (with LlapInputFormat) is used only when a Task is placed on nodes matching its location hints.
  • LimitOperator is correctly controlled by MR3 DAGAppMaster (which implements HIVE-24207).
  • Support Hive 4.0.0.

Release 1.10: 2024-3-12

MR3

  • Every ContainerWorker runs a central shuffle server which manages all Fetchers from all TaskAttempts.
    • All Fetchers share a common thread pool.
    • The shuffle server does not distinguish between ordered and unordered fetches.
    • The shuffle server controls the maximum number of concurrent fetches for each input (with tez.runtime.shuffle.parallel.copies).
    • The shuffle server controls the total number of concurrent fetches (with tez.runtime.shuffle.total.parallel.copies).
  • Adjust the default configuration in tez-site.xml for shuffling:
    • tez.runtime.shuffle.parallel.copies to 10
    • tez.runtime.shuffle.total.parallel.copies to 360
    • tez.runtime.shuffle.read.timeout to 60000 (60 seconds)
  • Introduce mr3.dag.create.daemon.vertex.always to control whether or not to create DaemonVertexes in DAGs (with the default value of false).
  • Fix a bug in speculative execution where a Task is killed after OutOfMemoryError while TaskAttempts are still running.
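
The two limits enforced by the shuffle server interact: a fetch can start only if both the quota of its input and the total quota have room. The following Python sketch illustrates that idea only. FetchLimiter and its methods are hypothetical names and do not reflect the actual implementation in MR3; the default limits are the tez-site.xml values listed above.

```python
from collections import Counter


class FetchLimiter:
    """Admit a fetch only if both the per-input and the total limits allow it.

    Illustrative only: not thread-safe and not MR3's actual shuffle server.
    """

    def __init__(self, per_input_limit=10, total_limit=360):
        # Defaults mirror tez.runtime.shuffle.parallel.copies and
        # tez.runtime.shuffle.total.parallel.copies.
        self.per_input_limit = per_input_limit
        self.total_limit = total_limit
        self.running = Counter()  # input id -> number of fetches in flight
        self.total_running = 0

    def try_start(self, input_id):
        """Return True and record the fetch if neither limit is exceeded."""
        if self.total_running >= self.total_limit:
            return False
        if self.running[input_id] >= self.per_input_limit:
            return False
        self.running[input_id] += 1
        self.total_running += 1
        return True

    def finish(self, input_id):
        """Release the slot held by a completed fetch."""
        if self.running[input_id] <= 0:
            raise ValueError(f"no running fetch for input {input_id!r}")
        self.running[input_id] -= 1
        self.total_running -= 1
```

In this sketch a single input can never hold more than per_input_limit slots, so one large input cannot starve the others even when the total limit has not been reached.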

Release 1.9: 2024-1-7

MR3

  • Introduce tez.runtime.use.free.memory.fetched.input to use free memory for storing fetched data.
  • The default value of tez.runtime.transfer.data-via-events.max-size increases from 512 to 2048.
  • Tasks can be canceled if no more output records are needed (as part of incorporating HIVE-24207).

Hive on MR3

  • Execute TRUNCATE using MR3 instead of MapReduce.
  • hive.exec.orc.default.compress is set to SNAPPY in hive-site.xml.
  • Support Ranger 2.4.0.
  • Adjust the default configuration in hive-site.xml and tez-site.xml to use auto parallelism less aggressively.
    • tez.shuffle-vertex-manager.auto-parallel.min.num.tasks to 251
    • tez.shuffle-vertex-manager.auto-parallel.max.reduction.percentage to 50
  • Set metastore.stats.fetch.bitvector to true in hive-site.xml.
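
As a minimal sketch, the two auto parallelism settings above can be written out as Hadoop-style property entries. The helper render_properties is a hypothetical name introduced for illustration; only the keys and values come from this release note.

```python
from xml.sax.saxutils import escape


def render_properties(props):
    """Render a dict as a Hadoop-style <configuration> XML string."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(str(value))}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


# Keys and values taken from the release note above.
auto_parallel_defaults = {
    "tez.shuffle-vertex-manager.auto-parallel.min.num.tasks": 251,
    "tez.shuffle-vertex-manager.auto-parallel.max.reduction.percentage": 50,
}
print(render_properties(auto_parallel_defaults))
```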

Release 1.8: 2023-12-9

MR3

  • Shuffle handlers can send multiple consecutive partitions at once.
  • Fix a bug in TaskScheduler which can get stuck when the number of ContainerWorkers is smaller than the value for mr3.am.task.max.failed.attempts.
  • Avoid unnecessary attempts to delete directories created by DAGs.
  • mr3.taskattempt.queue.scheme can be set to spark to use a Spark-style TaskScheduler which schedules consumer Tasks after all producer Tasks are finished.
  • mr3.dag.vertex.schedule.by.stage can be set to true to process Vertexes by stages similarly to Spark.
  • YarnResourceScheduler does not use AMRMClient.getAvailableResources() which returns incorrect values in some cases.
  • Restore TEZ_USE_MINIMAL in env.sh.
  • Support Celeborn as a remote shuffle service.
  • mr3.dag.include.indeterminate.vertex specifies whether a DAG contains indeterminate Vertexes or not.
  • Fault tolerance in the event of disk failures works much faster.
  • Use Scala 2.12.
  • Support Java 17 (with USE_JAVA_17 in env.sh).

Hive on MR3

  • Fix ConcurrentModificationException generated during the construction of DAGs.
  • hive.mr3.application.name.prefix specifies the prefix of MR3 application names.
  • Fix a bug that ignores CTRL-C in Beeline and stop requests from Hue.
  • hive.mr3.config.remove.keys specifies configuration keys to remove from JobConf to be passed to Tez.
  • hive.mr3.config.remove.prefixes specifies prefixes of configuration keys to remove from JobConf to be passed to Tez.
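
The effect of hive.mr3.config.remove.keys and hive.mr3.config.remove.prefixes can be pictured as a filter over configuration entries. The sketch below is an illustration under assumptions, not the code in Hive on MR3: the function name remove_config_entries and the sample keys starting with "example." are hypothetical, and how the two settings encode multiple values is not specified here, so the function takes already-parsed lists.

```python
def remove_config_entries(conf, remove_keys, remove_prefixes):
    """Return a copy of conf without the listed keys or any key with a listed prefix."""
    keys = set(remove_keys)
    prefixes = tuple(remove_prefixes)  # str.startswith accepts a tuple of prefixes
    return {
        key: value
        for key, value in conf.items()
        if key not in keys and not key.startswith(prefixes)
    }


# Hypothetical entries, for illustration only.
job_conf = {
    "example.secret.token": "x",
    "example.internal.a": "1",
    "example.internal.b": "2",
    "tez.runtime.shuffle.parallel.copies": "10",
}
filtered = remove_config_entries(
    job_conf,
    remove_keys=["example.secret.token"],
    remove_prefixes=["example.internal."],
)
```

Only the entry that matches neither an exact key nor a prefix remains in filtered.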

Release 1.7: 2023-5-15

MR3

  • Support standalone mode which does not require Yarn or Kubernetes as the resource manager.

Hive on MR3

  • Use Hadoop 3.3.1.
  • hive.query.reexecution.stats.persist.scope can be set to hiveserver.
  • HIVE_JVM_OPTION in env.sh specifies the JVM options for Metastore and HiveServer2.
  • Do not use TEZ_USE_MINIMAL in env.sh.

Release 1.6: 2022-12-24

MR3

  • Support capacity scheduling with mr3.dag.queue.capacity.specs and mr3.dag.queue.name.

Release 1.5: 2022-7-24

MR3

  • Use liveness probes on ContainerWorker Pods running separate processes for shuffle handlers.
  • When a ContainerGroup is removed, all its Prometheus metrics are removed.
  • Prometheus metrics are correctly published when two DAGAppMaster Pods for Hive and Spark run concurrently in the same namespace on Kubernetes.
  • DAGAppMaster stops if it fails to contact Timeline Server during initialization.
  • Introduce mr3.k8s.master.pod.cpu.limit.multiplier to specify the multiplier for the CPU resource limit of DAGAppMaster Pods.
  • Using MasterControl, autoscaling parameters can be updated dynamically.
  • HistoryLogger correctly sends Vertex start times to Timeline Server.

Hive on MR3

  • Support Hive 3.1.3.

Spark on MR3

  • Support Spark 3.2.2.
  • Reduce the size of Protobuf objects when submitting DAGs to MR3.
  • Spark executors can run as MR3 ContainerWorkers in local mode.

Release 1.4: 2022-2-14

MR3

  • Use Deployment instead of ReplicationController on Kubernetes.
  • HistoryLogger correctly sends Vertex finish times to Timeline Server.
  • Add more Prometheus metrics.
  • Introduce mr3.application.tags and mr3.application.scheduling.properties.map.
  • The logic for speculative execution uses the average execution time of Tasks (instead of the maximum execution time).

Hive on MR3

  • DistCp jobs are sent to MR3, not to Hadoop. As a result, DistCp runs okay on Kubernetes.
  • org.apache.tez.common.counters.Limits is initialized in HiveServer2.
  • Update Log4j2 to 2.17.1 (for CVE-2021-44228).

Release 1.3: 2021-8-18

MR3

  • Separate mr3.k8s.keytab.secret and mr3.k8s.worker.secret.
  • Introduce mr3.container.max.num.workers to limit the number of ContainerWorkers.
  • Introduce mr3.k8s.pod.worker.node.affinity.specs to specify node affinity for ContainerWorker Pods.
  • No longer use mr3.convert.container.address.host.name.
  • Support ContainerWorker recycling (which is different from ContainerWorker reuse) with mr3.container.scheduler.scheme.
  • Introduce mr3.am.task.no.retry.errors to specify the names of errors that prevent the re-execution of Tasks (e.g., OutOfMemoryError,MapJoinMemoryExhaustionError).
  • For reporting to MR3-UI, MR3 uses System.currentTimeMillis() instead of MonotonicClock.
  • DAGAppMaster correctly reports to MR3Client the time from DAG submission to DAG execution.
  • Introduce mr3.container.localize.python.working.dir.unsafe to localize Python scripts in working directories of ContainerWorkers. Localizing Python scripts is an unsafe operation: 1) Python scripts are shared by all DAGs; 2) once localized, Python scripts are not deleted.
  • The image pull policy specified in mr3.k8s.pod.image.pull.policy applies to init containers as well as ContainerWorker containers.
  • Introduce mr3.auto.scale.out.num.initial.containers which specifies the number of new ContainerWorkers to create in a scale-out operation when no ContainerWorkers are running.
  • Introduce mr3.container.runtime.auto.start.input to automatically start LogicalInputs in RuntimeTasks.
  • Speculative execution works on Vertexes with a single Task.
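
The role of mr3.am.task.no.retry.errors can be sketched as a check made before re-executing a failed Task. This is an assumption-laden illustration, not the logic in DAGAppMaster: parse_error_names and should_retry are hypothetical names, errors are matched by simple name, and the retry budget is modeled after mr3.am.task.max.failed.attempts.

```python
def parse_error_names(value):
    """Split a comma-separated list of error names, ignoring blanks."""
    return {name.strip() for name in value.split(",") if name.strip()}


def should_retry(error_name, no_retry_errors, failed_attempts, max_failed_attempts):
    """Decide whether a failed Task may be re-executed.

    Assumption: a Task is retried until it has failed max_failed_attempts
    times, unless the error name appears in no_retry_errors.
    """
    if error_name in no_retry_errors:
        return False
    return failed_attempts < max_failed_attempts


# The example value given in the release note above.
no_retry = parse_error_names("OutOfMemoryError,MapJoinMemoryExhaustionError")
```

With this model, should_retry("OutOfMemoryError", no_retry, 1, 3) is False even though two attempts remain, whereas an error not in the list is retried until the budget is exhausted.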

Hive on MR3

  • Metastore correctly uses MR3 for compaction on Kubernetes.
  • Auto parallelism is correctly enabled or disabled according to the result of compiling queries by overriding tez.shuffle-vertex-manager.enable.auto-parallel, so tez.shuffle-vertex-manager.enable.auto-parallel can be set to false.
  • Support the TRANSFORM clause with Python scripts (with mr3.container.localize.python.working.dir.unsafe set to true in mr3-site.xml).
  • Introduce hive.mr3.llap.orc.memory.per.thread.mb to specify the memory allocated to each ORC manager in low-level LLAP I/O threads.

Spark on MR3

  • Initial release

Release 1.2: 2020-10-26

MR3

  • Introduce mr3.k8s.pod.worker.init.container.command to execute a shell command in a privileged init container.
  • Introduce mr3.k8s.pod.master.toleration.specs and mr3.k8s.pod.worker.toleration.specs to specify tolerations for DAGAppMaster and ContainerWorker Pods.
  • Setting mr3.dag.queue.scheme to individual properly implements fair scheduling among concurrent DAGs.
  • Introduce mr3.k8s.pod.worker.additional.hostpaths to mount additional hostPath volumes.
  • mr3.k8s.worker.total.max.memory.gb and mr3.k8s.worker.total.max.cpu.cores work okay when autoscaling is enabled.
  • DAGAppMaster and ContainerWorkers can publish Prometheus metrics.
  • The default value of mr3.container.task.failure.num.sleeps is 0.
  • Reduce the log size of DAGAppMaster and ContainerWorker.
  • TaskScheduler can process about twice as many events (TaskSchedulerEventTaskAttemptFinished) per unit time as in MR3 1.1, thus doubling the maximum cluster size that MR3 can manage.
  • Optimize the use of CodecPool shared by concurrent TaskAttempts.
  • The getDags command of MasterControl prints both IDs and names of DAGs.
  • On Kubernetes, the updateResourceLimit command of MasterControl updates the limit on the total resources for all ContainerWorker Pods. The user can further improve resource utilization when autoscaling is enabled.

Hive on MR3

  • Compute the memory size of ContainerWorker correctly when hive.llap.io.allocator.mmap is set to true.
  • Hive expands all system properties in configuration files (such as core-site.xml) before passing them to MR3.
  • hive.server2.transport.mode can be set to all (with HIVE-5312).
  • MR3 creates three ServiceAccounts: 1) for Metastore and HiveServer2 Pods; 2) for DAGAppMaster Pod; 3) for ContainerWorker Pods. The user can use IAM roles for ServiceAccounts.
  • Docker containers start as root. In kubernetes/env.sh, DOCKER_USER should be set to root and the service principal name in HIVE_SERVER2_KERBEROS_PRINCIPAL should be root.
  • Support Ranger 2.0.0 and 2.1.0.

Release 1.1: 2020-7-19

MR3

  • Support DAG scheduling schemes (specified by mr3.dag.queue.scheme).
  • Optimize DAGAppMaster by freeing memory for messages to Tasks when fault tolerance is disabled (with mr3.am.task.max.failed.attempts set to 1).
  • Fix a minor memory leak in DaemonTask (which also prevents MR3 from running more than 2^30 DAGs when using the shuffle handler).
  • Improve the chance of assigning TaskAttempts to ContainerWorkers that match location hints.
  • TaskScheduler can use location hints produced by ONE_TO_ONE edges.
  • TaskScheduler can use location hints from HDFS when assigning TaskAttempts to ContainerWorker Pods on Kubernetes (with mr3.convert.container.address.host.name).
  • Introduce mr3.k8s.pod.cpu.cores.max.multiplier to specify the multiplier for the limit of CPU cores.
  • Introduce mr3.k8s.pod.memory.max.multiplier to specify the multiplier for the limit of memory.
  • Introduce mr3.k8s.pod.worker.security.context.sysctls to configure kernel parameters of ContainerWorker Pods using init containers.
  • Support speculative execution of TaskAttempts (with mr3.am.task.concurrent.run.threshold.percent).
  • A ContainerWorker can run multiple shuffle handlers each with a different port. The configuration key mr3.use.daemon.shufflehandler now specifies the number of shuffle handlers in each ContainerWorker.
  • With speculative execution and the use of multiple shuffle handlers in a single ContainerWorker, fetch delays rarely occur.
  • A ContainerWorker Pod can run shuffle handlers in a separate container (with mr3.k8s.shuffle.process.ports).
  • On Kubernetes, DAGAppMaster uses ReplicationController instead of Pod, thus making recovery much faster.
  • On Kubernetes, ConfigMaps mr3conf-configmap-master and mr3conf-configmap-worker survive MR3, so the user should delete them manually.
  • Java 8u251/8u252 can be used on Kubernetes 1.17 and later.

Hive on MR3

  • CrossProductHandler asks MR3 DAGAppMaster to set TEZ_CARTESIAN_PRODUCT_MAX_PARALLELISM (Cf. HIVE-16690, Hive 3/4).
  • Hive 4 on MR3 is stable (currently using 4.0.0-SNAPSHOT).
  • No longer support Hive 1.
  • Ranger uses a local directory (emptyDir volume) for logging.
  • The open file limit for Solr (in Ranger) is not limited to 1024.
  • HiveServer2 and DAGAppMaster create readiness and liveness probes.

Release 1.0: 2020-2-17

MR3

  • Support DAG priority schemes (specified by mr3.dag.priority.scheme) and Vertex priority schemes (specified by mr3.vertex.priority.scheme).
  • Support secure shuffle (using SSL mode) without requiring separate configuration files.
  • ContainerWorker tries to avoid OutOfMemoryErrors by sleeping after a TaskAttempt fails (specified by mr3.container.task.failure.num.sleeps).
  • Errors from InputInitializers are properly passed to MR3Client.
  • MasterControl supports two new commands for gracefully stopping DAGAppMaster and ContainerWorkers.

Hive on MR3

  • Allow fractions for CPU cores (with hive.mr3.resource.vcores.divisor).
  • Support rolling updates.
  • Hive on MR3 can access S3 using AWS credentials (with or without Helm).
  • On Amazon EKS, the user can use S3 instead of PersistentVolumes on EFS.
  • Hive on MR3 can use environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to access S3 outside Amazon AWS.

Release 0.11: 2019-12-4

MR3

  • Support autoscaling.

Hive on MR3

  • Memory and CPU cores for Tasks can be set to zero.
  • Support autoscaling on Amazon EMR.
  • Support autoscaling on Amazon EKS.

Release 0.10: 2019-10-18

MR3

  • TaskScheduler supports a new scheduling policy (specified by mr3.taskattempt.queue.scheme) which significantly improves the throughput for concurrent queries.
  • DAGAppMaster recovers from OutOfMemoryErrors due to the exhaustion of threads.

Hive on MR3

  • Compaction sends DAGs to MR3, instead of MapReduce, when hive.mr3.compaction.using.mr3 is set to true.
  • LlapDecider asks MR3 DAGAppMaster for the number of Reducers.
  • ConvertJoinMapJoin asks MR3 DAGAppMaster for the current number of Nodes to estimate the cost of Bucket Map Join.
  • Support Hive 3.1.2 and 2.3.6.
  • Support Helm charts.
  • Compaction works okay on Kubernetes.

Release 0.9: 2019-7-25

MR3

  • Each DAG uses its own ClassLoader.

Hive on MR3

  • LLAP I/O works properly on Kubernetes.
  • UDFs work okay on Kubernetes.

Release 0.8: 2019-6-22

MR3

  • A new DAGAppMaster properly recovers DAGs that have not been completed in the previous DAGAppMaster.
  • Fault tolerance after fetch failures works much faster.
  • On Kubernetes, the shutdown handler of DAGAppMaster deletes all running Pods.
  • On both Yarn and Kubernetes, MR3Client automatically connects to a new DAGAppMaster after an initial DAGAppMaster is killed.

Hive on MR3

  • Hive 3 for MR3 supports high availability on Yarn via ZooKeeper.
  • On both Yarn and Kubernetes, multiple HiveServer2 instances can share a common MR3 DAGAppMaster (and thus all its ContainerWorkers as well).
  • Support Apache Ranger on Kubernetes.
  • Support Timeline Server on Kubernetes.

Release 0.7: 2019-4-26

MR3

  • Resolve deadlock when Tasks fail or ContainerWorkers are killed.
  • Support fault tolerance after fetch failures.
  • Support node blacklisting.

Hive on MR3

  • Introduce a new configuration key hive.mr3.am.task.max.failed.attempts.
  • Apply HIVE-20618.

Release 0.6: 2019-3-21

MR3

  • DAGAppMaster can run in its own Pod on Kubernetes.
  • Support elastic execution of RuntimeTasks in ContainerWorkers.
  • MR3-UI requires only Timeline Server.

Hive on MR3

  • Support memory monitoring when loading hash tables for Map-side join.

Release 0.5: 2019-2-18

MR3

  • Support Kubernetes.
  • Support the use of the built-in shuffle handler.

Hive on MR3

  • Support Hive 3.1.1 and 2.3.5.
  • Initial release for Hive on MR3 on Kubernetes

Release 0.4: 2018-10-29

MR3

  • Support auto parallelism for reducers with ONE_TO_ONE edges.
  • Auto parallelism can use input statistics when reassigning partitions to reducers.
  • Support ByteBuffer sharing among RuntimeTasks.

Hive on MR3

  • Support Hive 3.1.0.
  • Hive 1 uses Tez 0.9.1.
  • Metastore checks the inclusion of __HIVE_DEFAULT_PARTITION__ when retrieving column statistics.
  • MR3JobMonitor returns immediately from MR3 DAGAppMaster when the DAG completes.

Release 0.3: 2018-8-15

MR3

  • Extend the runtime to support Hive 3.

Hive on MR3

  • Support Hive 3.0.0.
  • Support query re-execution.
  • Support per-query cache in Hive 2 and 3.

Release 0.2: 2018-5-18

MR3

  • Support asynchronous logging (with mr3.async.logging in mr3-site.xml).
  • Delete DAG-local directories after each DAG is finished.

Hive on MR3

  • Support LLAP I/O for Hive 2.
  • Support Hive 2.2.0.
  • Use Hive 2.3.3 instead of Hive 2.3.2.

Release 0.1: 2018-3-31

MR3

  • Initial release

Hive on MR3

  • Initial release

Patches backported in MR3 1.11

  • HIVE-27600 Reduce filesystem calls in OrcFileMergeOperator
  • HIVE-25561 Killed task should not commit file

Patches backported in MR3 1.9

  • HIVE-27876 Incorrect query results on tables with ClusterBy & SortBy
  • HIVE-27788 Exception when join has 2 Group By operators in the same branch in the same reducer
  • HIVE-27777 CBO fails on multi insert overwrites with common group expression
  • HIVE-24606 Multi-stage materialized CTEs can lose intermediate data
  • HIVE-27494 Deduplicate the task result that generated by more branches in union all
  • HIVE-26968 Wrong results when shared work optimizer merges TS operator with different DPP edges
  • HIVE-25751 Ignore exceptions related to interruption when the limit is reached
  • HIVE-25274 TestLimitOperator fails if default engine is Tez
  • HIVE-24207 LimitOperator can leverage ObjectCache to bail out quickly

Patches backported in MR3 1.8

  • HIVE-21923 Vectorized MapJoin may miss results when only the join key is selected
  • HIVE-21288 Runtime rowcount calculation is incorrect in vectorized executions
  • HIVE-18908 FULL OUTER JOIN to MapJoin
  • HIVE-4605 Hive job fails while closing reducer output - Unable to rename
  • HIVE-27344 ORC RecordReaderImpl throws NPE when close() is called from the constructor
  • HIVE-27649 Support ORDER BY clause in subqueries with set operators
  • HIVE-27303 Set correct output name to ReduceSink when there is a SMB join after Union
  • HIVE-27437 Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch after processing
  • HIVE-24485 Make the slow-start behavior tunable
  • HIVE-25874 Slow filter evaluation of nest struct fields in vectorized executions
  • HIVE-25960 Use RemoteIteratorWithFilter.HIDDEN_FILES_FULL_PATH_FILTER defined in org.apache.hadoop.hive.metastore.utils.FileUtils
  • HIVE-25754 Fix column projection for union all queries with multiple aliases
  • HIVE-25683 Close reader in AcidUtils.isRawFormatFile
  • HIVE-26184 COLLECT_SET with GROUP BY is very slow when some keys are highly skewed
  • HIVE-25794 CombineHiveRecordReader: log statements in a loop leads to memory pressure
  • HIVE-25736 Close ORC readers
  • HIVE-25685 HBaseStorageHandler: ensure that hbase properties are present in final JobConf for Tez
  • HIVE-25577 unix_timestamp() is ignoring the time zone value
  • HIVE-25549 Wrong results for window function with expression in PARTITION BY or ORDER BY clause
  • HIVE-25449 datediff() gives wrong output when run in a tez task with some non-UTC timezone
  • HIVE-25458 Unix_timestamp() with string input give wrong result
  • HIVE-25403 Fix from_unixtime() to consider leap seconds
  • HIVE-25058 PTF: TimestampValueBoundaryScanner can be optimised during range computation pt2 - isDistanceGreater
  • HIVE-25299 Casting timestamp to numeric data types is incorrect for non-UTC timezones
  • HIVE-25085 MetaStore Clients no longer shared across sessions
  • HIVE-25093 date_format() UDF is returning output in UTC time zone only
  • HIVE-25001 Improvement for some debug-logging guards
  • HIVE-24746 PTF: TimestampValueBoundaryScanner can be optimised during range computation
  • HIVE-24882 Compaction task reattempt fails with FileAlreadyExistsException for DeleteEventWriter
  • HIVE-24858 UDFClassLoader leak in Configuration.CACHE_CLASSES
  • HIVE-24808 Cache Parsed Dates
  • HIVE-24693 Convert timestamps to zoned times without string operations
  • HIVE-24353 Performance: do not throw exceptions when parsing Timestamp
  • HIVE-24478 Subquery GroupBy with Distinct SemanticException: Invalid column reference
  • HIVE-24691 Ban commons-logging
  • HIVE-24660 Remove Commons Logger from jdbc-handler Package
  • HIVE-24659 Remove Commons Logger from serde Package
  • HIVE-24613 Support Values clause without Insert
  • HIVE-24435 Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
  • HIVE-24179 Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
  • HIVE-24373 Wrong predicate is pushed down for view with constant value projection.
  • HIVE-24236 Fixed possible Connection leaks in TxnHandler
  • HIVE-24199 Incorrect result when subquey in exists contains limit
  • HIVE-24036 Kryo Exception while serializing plan for getSplits UDF call
  • HIVE-23837 Configure StorageHandlers if FileSinkOperator is child of MergeJoinWork
  • HIVE-23265 Duplicate rowsets are returned with Limit and Offset set
  • HIVE-22684 Run Eclipse Cleanup Against hbase-handler Module
  • HIVE-23058 Compaction task reattempt fails with FileAlreadyExistsException
  • HIVE-22566 Drop table involved in materialized view leaves the table in inconsistent state
  • HIVE-23307 Cache ColumnIndex in HiveBaseResultSet
  • HIVE-22707 MergeJoinWork should be considered while collecting DAG credentials
  • HIVE-22614 Replace Base64 in hive-hbase-handler Package
  • HIVE-22392 Hive JDBC Storage Handler: Support For Writing Data to JDBC Data Source
  • HIVE-22227 Tez bucket pruning produces wrong result with shared work optimization
  • HIVE-22107 Correlated subquery producing wrong schema
  • HIVE-21862 ORC ppd produces wrong result with timestamp
  • HIVE-22008 LIKE Operator should match multi-line input
  • HIVE-15406 Consider vectorizing the new trunc function
  • HIVE-21384 Upgrade to dbcp2 in JDBC storage handler
  • HIVE-21253 Support DB2 in JDBC StorageHandler
  • HIVE-20484 Disable Block Cache By Default With HBase SerDe
  • HIVE-20955 Fix Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException
  • HIVE-21026 Druid Vectorize Reader is not using the correct input size
  • HIVE-20842 Fix logic introduced in HIVE-20660 to estimate statistics for group by
  • HIVE-20660 Group by statistics estimation could be improved by bounding the total number of rows to source table
  • HIVE-20684 Make compute stats work for Druid tables

Patches backported in MR3 1.7

  • HIVE-20615 CachedStore: Background refresh thread bug fixes
  • HIVE-21479 NPE during metastore cache update
  • HIVE-27267 choose correct partition columns from bigTableRS
  • HIVE-27138 Extend RSOp to compute filterTag if it has child MapJoinOp
  • HIVE-27184 Add class name profiling option in ProfileServlet
  • HIVE-23891 UNION ALL and multiple task attempts can cause file duplication
  • HIVE-22173 Query with multiple lateral views hangs during compilation
  • HIVE-26882 Allow transactional check of Table parameter before altering the Table
  • HIVE-26683 Sum windowing function returns wrong value when all nulls
  • HIVE-26779 UNION ALL throws SemanticException when trying to remove partition predicates: fail to find child from parent
  • HIVE-27069 Incorrect results with bucket map join
  • HIVE-26676 Count distinct in subquery returning wrong results
  • HIVE-26235 OR Condition on binary column is returning empty result
  • HIVE-25758 OOM due to recursive application of CBO rules
  • HIVE-25909 Add test for ‘hive.default.nulls.last’ property for windows with ordering
  • HIVE-25917 Use default value for ‘hive.default.nulls.last’ when config is not available
  • HIVE-25864 Hive query optimisation creates wrong plan for predicate pushdown with windowing function
  • HIVE-25822 Unexpected result rows in case of outer join contains conditions only affecting one side
  • HIVE-24073 Execution exception in sort-merge semijoin
  • HIVE-21935 Hive Vectorization : degraded performance with vectorize UDF
  • HIVE-24827 Hive aggregation query returns incorrect results for non text files
  • HIVE-14165 Remove unnecessary file listing from FetchOperator
  • HIVE-24245 Vectorized PTF with count and distinct over partition producing incorrect results.
  • HIVE-24113 NPE in GenericUDFToUnixTimeStamp
  • HIVE-24293 Integer overflow in llap collision mask
  • HIVE-24209 Incorrect search argument conversion for NOT BETWEEN operation when vectorization is enabled
  • HIVE-24023 Hive parquet reader can’t read files with length=0
  • HIVE-23873 Querying Hive JDBCStorageHandler table fails with NPE when CBO is off
  • HIVE-23751 QTest: Override #mkdirs() method in ProxyFileSystem To Align After HADOOP-16582
  • HIVE-23774 Reduce log level at aggrColStatsForPartitions in MetaStoreDirectSql.java
  • HIVE-23738 DBLockManager::lock() : Move lock request to debug level
  • HIVE-23706 Fix nulls first sorting behavior
  • HIVE-23592 Routine makeIntPair is Not Correct
  • HIVE-19653 Incorrect predicate pushdown for groupby with grouping sets
  • HIVE-23435 Full outer join result is missing rows
  • HIVE-22903 Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause
  • HIVE-22840 Race condition in formatters of TimestampColumnVector and DateColumnVector
  • HIVE-22898 CharsetDecoder race condition in OrcRecordUpdater
  • HIVE-22629 Avoid Array resizing/recreation by using the construcor/ref instead of iteration/get_i
  • HIVE-22321 Setting default nulls last does not take effect when order direction is specified
  • HIVE-21338 Remove order by and limit for aggregates
  • HIVE-20796 jdbc URL can contain sensitive information that should not be logged
  • HIVE-20423 Set NULLS LAST as the default null ordering
  • HIVE-20246 Configurable collecting stats by using DO_NOT_UPDATE_STATS table property
  • HIVE-20331 Query with union all, lateral view and Join fails with “cannot find parent in the child operator”
  • HIVE-20202 Add profiler endpoint to HS2 and LLAP

Patches backported in MR3 1.6

  • HIVE-21935 Hive Vectorization : degraded performance with vectorize UDF
  • HIVE-24113 NPE in GenericUDFToUnixTimeStamp
  • HIVE-24293 Integer overflow in llap collision mask
  • HIVE-23501 AOOB in VectorDeserializeRow when complex types are converted to primitive types
  • HIVE-23688 fix Vectorization IndexArrayOutOfBoundsException when read null values in map
  • HIVE-26292 GroupByOperator initialization does not clean state
  • HIVE-26743 backport HIVE-24694: Early connection close to release server resources during creating
  • HIVE-26532 Remove logger from critical path in VectorMapJoinInnerLongOperator::processBatch
  • HIVE-25960 Fix S3a recursive listing logic
  • HIVE-24391 Fix FIX TestOrcFile failures in branch-3.1
  • HIVE-24316 Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
  • HIVE-26447 Vectorization: wrong results when filter on repeating map key orc table
  • HIVE-25446 Wrong exception thrown if capacity<=0
  • HIVE-24849 Create external table socket timeout when location has large number of files
  • HIVE-25505 Incorrect results with header. skip.header.line.count if first line is blank
  • HIVE-18284 Fix NPE when inserting data with ‘distribute by’ clause with dynpart sort optimization
  • HIVE-24381 Compressed text input returns 0 rows if skip header/footer is mentioned
  • HIVE-23140 Optimise file move in CTAS
  • HIVE-22814 ArrayIndexOutOfBound in the vectorization getDataTypePhysicalVariation
  • HIVE-22805 Vectorization with conditional array or map is not implemented and throws an error
  • HIVE-21341 Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high
  • HIVE-22784 Boundary condition to check if there is nothing to truncate in StringExpr functions
  • HIVE-22770 Skip interning of MapWork fields during deserialization
  • HIVE-22733 After disable operation log property in hive, still HS2 saving the operation log
  • HIVE-22713 Constant propagation shouldn’t be done for Join-Fil-RS structure
  • HIVE-22720 Remove Log from HiveConf::getLogIdVar
  • HIVE-22405 Add ColumnVector support for ProlepticCalendar
  • HIVE-22033 HiveServer2: fix delegation token renewal
  • HIVE-22631 Avoid deep copying partition list in listPartitionsByExpr
  • HIVE-22548 Optimise Utilities.removeTempOrDuplicateFiles when moving files to final location
  • HIVE-22632 Improve estimateRowSizeFromSchema
  • HIVE-22599 Query results cache: 733 permissions check is not necessary
  • HIVE-22625 Syntax Error in findPotentialCompactions SQL query for MySql/Postgres
  • HIVE-22551 BytesColumnVector initBuffer should clean vector and length consistently
  • HIVE-22523 The error handler in LlapRecordReader might block if its queue is full
  • HIVE-22505 ClassCastException caused by wrong Vectorized operator selection
  • HIVE-21917 COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs
  • HIVE-22411 Performance degradation on single row inserts
  • HIVE-22357 Schema mismatch between the Hive table definition and the ‘hive.sql.query’ parameter
  • HIVE-21457 Perf optimizations in ORC split-generation
  • HIVE-21390 BI split strategy does not work for blob stores

Patches backported to Hive 3.1.0 in MR3 1.5

  • HIVE-24948 Enhancing performance of OrcInputFormat.getSplits with bucket pruning
  • HIVE-20001 With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.
  • HIVE-19748 Add appropriate null checks to DecimalColumnStatsAggregator
  • HIVE-25665 Checkstyle LGPL files must not be in the release sources/binaries
  • HIVE-25522 NullPointerException in TxnHandler
  • HIVE-25547 Alter view as Select statement should create Authorizable events in HS2
  • HIVE-25726 Upgrade velocity to 2.3 due to CVE-2020-13936
  • HIVE-25839 Upgrade Log4j2 to 2.17.1 due to CVE-2021-44832
  • HIVE-25600 Compaction job creates redundant base/delta folder within base/delta folder
  • HIVE-24324 Remove deprecated API usage from Avro
  • HIVE-24851 Fix reader leak in AvroGenericRecordReader
  • HIVE-24797 Disable validate default values when parsing Avro schemas
  • HIVE-24964 Backport HIVE-22453 to branch-3.1
  • HIVE-24788 Backport HIVE-23338 to branch-3.1
  • HIVE-23338 Bump jackson version to 2.10.0
  • HIVE-24747 Backport HIVE-24569 to branch-3.1
  • HIVE-24653 Race condition between compactor marker generation and get splits.
  • HIVE-22981 DataFileReader is not closed in AvroGenericRecordReader#extractWriterTimezoneFromMetadata
  • HIVE-24436 Fix Avro NULL_DEFAULT_VALUE compatibility issue
  • HIVE-25277 fix slow partition deletion issue by removing duplicated isEmpty checks
  • HIVE-25170 Fix wrong colExprMap generated by SemanticAnalyzer
  • HIVE-24224 Fix skipping header/footer for Hive on Tez on compressed file
  • HIVE-24093 Remove unused hive.debug.localtask
  • HIVE-22412 StatsUtils throw NPE when explain
  • HIVE-23509 Fixing MapJoin Capacity Assertion Error
  • HIVE-23625 use html file extension for HS2 web UI query_page
  • HIVE-22476 Hive datediff function provided inconsistent results when hive.fetch.task.conversion is set to none
  • HIVE-22769 Incorrect query results and query failure during split generation for compressed text files
  • HIVE-5312 Let HiveServer2 run simultaneously in HTTP (over thrift) and Binary (normal thrift transport) mode
  • HIVE-22948 QueryCache: Treat query cache locations as temporary storage
  • HIVE-22762 Leap day is incorrectly parsed during cast in Hive
  • HIVE-22763 0 is accepted in 12-hour format during timestamp cast
  • HIVE-22653 Remove commons-lang leftovers
  • HIVE-22685 Fix TestHiveSqlDateTimeFormatter To Work With New Year 2020
  • HIVE-22511 Fix case of Month token in datetime to string conversion
  • HIVE-22422 Missing documentation from HiveSqlDateTimeFormatter: list of date-based patterns
  • HIVE-21580 Introduce ISO 8601 week numbering SQL:2016 formats
  • HIVE-21579 Introduce more complex SQL:2016 datetime formats
  • HIVE-21578 Introduce SQL:2016 formats FM, FX, and nested strings
  • HIVE-22945 Hive ACID Data Corruption: Update command mess the other column data and produces incorrect result
  • HIVE-21660 Wrong result when union all and later view with explode is used
  • HIVE-22891 Skip PartitionDesc Extraction In CombineHiveRecord For Non-LLAP Execution Mode
  • HIVE-22815 reduce the unnecessary file system object creation in MROutput
  • HIVE-22753 Fix gradual mem leak: Operationlog related appenders should be cleared up on errors
  • HIVE-22400 UDF minute with time returns NULL
  • HIVE-22700 Compactions may leak memory when unauthorized
  • HIVE-22485 Cross product should set the conf in UnorderedPartitionedKVEdgeConfig
  • HIVE-22532 PTFPPD may push limit incorrectly through Rank/DenseRank function
  • HIVE-22507 KeyWrapper comparator create field comparator instances at every comparison
  • HIVE-22435 Exception when using VectorTopNKeyOperator operator
  • HIVE-22513 Constant propagation of casted column in filter ops can cause incorrect results
  • HIVE-22464 Implement support for NULLS FIRST/LAST in TopNKeyOperator
  • HIVE-22433 Hive JDBC Storage Handler: Incorrect results fetched from BOOLEAN and TIMESTAMP DataType From JDBC Data Source
  • HIVE-22421 Improve Logging If Configuration File Not Found
  • HIVE-22425 ReplChangeManager Not Debug Logging Database Name
  • HIVE-22431 Hive JDBC Storage Handler: java.lang.ClassCastException on accessing TINYINT, SMALLINT Data Type From JDBC Data Source
  • HIVE-22406 TRUNCATE TABLE fails due to MySQL limitations on limit value
  • HIVE-22315 Support Decimal64 column division with decimal64 scalar
  • HIVE-18415 Lower ‘Updating Partition Stats’ Logging Level
  • HIVE-22398 Remove legacy code that can cause issue with new Yarn releases
  • HIVE-22330 Maximize smallBuffer usage in BytesColumnVector
  • HIVE-22360 MultiDelimitSerDe returns wrong results in last column when the loaded file has more columns than those in table schema
  • HIVE-22391 NPE while checking Hive query results cache
  • HIVE-22373 File Merge tasks fail when containers are reused
  • HIVE-22336 Updates should be pushed to the Metastore backend DB before creating the notification event
  • HIVE-22332 Hive should ensure valid schema evolution settings since ORC-540
  • HIVE-21407 Parquet predicate pushdown is not working correctly for char column types
  • HIVE-22331 unix_timestamp without argument returns timestamp in millisecond instead of second
  • HIVE-14302 Tez: Optimized Hashtable can support DECIMAL keys of same precision
  • HIVE-21924 Split text files even if header/footer exists
  • HIVE-22270 Upgrade commons-io to 2.6
  • HIVE-22278 Upgrade log4j to 2.12.1
  • HIVE-22248 Fix statistics persisting issues
  • HIVE-22275 OperationManager.queryIdOperation does not properly clean up multiple queryIds
  • HIVE-22207 Tez SplitGenerator throws NumberFormatException when “dfs.blocksize” on cluster is “128m”
  • HIVE-22273 Access check is failed when a temporary directory is removed
  • HIVE-21987 Hive is unable to read Parquet int32 annotated with decimal
  • HIVE-22208 Column name with reserved keyword is unescaped when query including join on table with mask column is re-written
  • HIVE-22232 NPE when hive.order.columnalignment is set to false
  • HIVE-22243 Align Apache Thrift version to 0.9.3-1 in standalone-metastore as well
  • HIVE-22197 Common Merge join throwing class cast exception.
  • HIVE-22079 Post order walker for iterating over expression tree
  • HIVE-22231 Hive query with big size via knox fails with Broken pipe Write failed
  • HIVE-22145 Avoid optimizations for analyze compute statistics
  • HIVE-20113 Shuffle avoidance: Disable 1-1 edges for sorted shuffle
  • HIVE-22219 Bringing a node manager down blocks restart of LLAP service
  • HIVE-22201 ConvertJoinMapJoin#checkShuffleSizeForLargeTable throws ArrayIndexOutOfBoundsException if no big table is selected
  • HIVE-22210 Vectorization may reuse computation output columns involved in filtering
  • HIVE-22170 from_unixtime and unix_timestamp should use user session time zone
  • HIVE-22059 hive-exec jar doesn’t contain (fasterxml) jackson library
  • HIVE-22182 SemanticAnalyzer populates map which is not used at all
  • HIVE-22169 Tez: SplitGenerator tries to look for plan files which won’t exist for Tez
  • HIVE-22200 Hash collision may cause column resolution to fail
  • HIVE-22055 select count gives incorrect result after loading data from text file
  • HIVE-22204 Beeline option to show/not show execution report
  • HIVE-15956 StackOverflowError when drop lots of partitions
  • HIVE-22164 Vectorized Limit operator returns wrong number of results with offset
  • HIVE-21397 BloomFilter for hive Managed [ACID] table does not work as expected
  • HIVE-22178 Parquet FilterPredicate throws CastException after SchemaEvolution
  • HIVE-22106 Remove cross-query synchronization for the partition-eval
  • HIVE-22168 Remove very expensive logging from the llap cache hotpath
  • HIVE-22099 Several date related UDFs can’t handle Julian dates properly since HIVE-20007
  • HIVE-22161 UDF: FunctionRegistry synchronizes on org.apache.hadoop.hive.ql.udf.UDFType class
  • HIVE-22151 Turn off hybrid grace hash join by default
  • HIVE-22102 Reduce HMS call when creating HiveSession
  • HIVE-22148 S3A delegation tokens are not added in the job config of the Compactor.
  • HIVE-22121 Turning on hive.tez.bucket.pruning produce wrong results
  • HIVE-22134 HIVE-22129: Remove glassfish.jersey and mssql-jdbc classes from jdbc-standalone jar
  • HIVE-21698 TezSessionState#ensureLocalResources() causes IndexOutOfBoundsException while localizing resources
  • HIVE-22132 Upgrade commons-lang3 version to 3.9
  • HIVE-22114 insert query for partitioned insert only table failing when all buckets are empty
  • HIVE-22092 Fetch is failing with IllegalArgumentException: No ValidTxnList when refetch is done.
  • HIVE-22094 queries failing with ClassCastException: hive.ql.exec.vector.DecimalColumnVector cannot be cast to hive.ql.exec.vector.Decimal64ColumnVector
  • HIVE-21241 Migrate TimeStamp Parser From Joda Time
  • HIVE-16587 NPE when inserting complex types with nested null values
  • HIVE-22080 Prevent implicit conversion from String/char/varchar to double/decimal
  • HIVE-22040 Drop partition throws exception with ‘Failed to delete parent: File does not exist’ when the partition’s parent path does not exists
  • HIVE-21970 Avoid using RegistryUtils.currentUser()
  • HIVE-21828 Tez: Use a pre-parsed TezConfiguration from DagUtils - Addendum2
  • HIVE-21828 Tez: Use a pre-parsed TezConfiguration from DagUtils - Addendum
  • HIVE-21828 Tez: Use a pre-parsed TezConfiguration from DagUtils
  • HIVE-22115 Prevent the creation of query routing appender if property is set to false
  • HIVE-22120 Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions
  • HIVE-22113 Prevent LLAP shutdown on AMReporter related RuntimeException
  • HIVE-13457 Create HS2 REST API endpoints for monitoring information
  • HIVE-22054 Avoid recursive listing to check if a directory is empty
  • HIVE-22045 HIVE-21711 introduced regression in data load
  • HIVE-21173 Upgrade Apache Thrift to 0.9.3-1
  • HIVE-22009 CTLV with user specified location is not honoured.
  • HIVE-21711 Regression caused by HIVE-21279 for blobstorage fs
  • HIVE-21986 HiveServer Web UI: Setting the Strict-Transport-Security in default response header
  • HIVE-21972 “show transactions” display the header twice
  • HIVE-21973 SHOW LOCKS prints the headers twice
  • HIVE-21224 Upgrade tests JUnit3 to JUnit4
  • HIVE-21976 Offset should be null instead of zero in Calcite HiveSortLimit
  • HIVE-21868 Vectorize CAST…FORMAT
  • HIVE-21928 Fix for statistics annotation in nested AND expressions
  • HIVE-21915 Hive with TEZ UNION ALL and UDTF results in data loss
  • HIVE-19831 Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if database/table already exists
  • HIVE-21927 HiveServer Web UI: Setting the HttpOnly option in the cookies
  • HIVE-15177 Authentication with hive fails when kerberos auth type is set to fromSubject and principal contains _HOST
  • HIVE-21905 Generics improvement around the FetchOperator class
  • HIVE-14737 Problem accessing /logs in a Kerberized Hive Server 2 Web UI
  • HIVE-21902 HiveServer2 UI: jetty response header needs X-Frame-Options
  • HIVE-21746 ArrayIndexOutOfBoundsException during dynamically partitioned hash join, with CBO disabled
  • HIVE-21576 Introduce CAST…FORMAT and limited list of SQL:2016 datetime formats
  • HIVE-19661 switch Hive UDFs to use Re2J regex engine
  • HIVE-21835 Unnecessary null checks in org.apache.hadoop.hive.ql.optimizer.StatsOptimizer
  • HIVE-21815 Stats in ORC file are parsed twice
  • HIVE-21799 NullPointerException in DynamicPartitionPruningOptimization, when join key is on aggregation column
  • HIVE-21796 ArrayWritableObjectInspector.equals can take O(2^nesting_depth) time
  • HIVE-21837 MapJoin is throwing exception when selected column is having completely null values
  • HIVE-21742 Vectorization: CASE result type casting
  • HIVE-21805 HiveServer2: Use the fast ShutdownHookManager APIs
  • HIVE-21834 Avoid unnecessary calls to simplify filter conditions
  • HIVE-21795 Rollup summary row might be missing when a mapjoin is happening on a partitioned table
  • HIVE-21789 HiveFileFormatUtils.getRecordWriter is unnecessary
  • HIVE-21768 JDBC: Strip the default union prefix for un-enclosed UNION queries
  • HIVE-21686 ensure that memory allocator does not evict using brute force path.
  • HIVE-21717 Rename is failing for directory in move task.
  • HIVE-21681 Describe formatted shows incorrect information for multiple primary keys
  • HIVE-21240 JSON SerDe Re-Write and Fixup timestamp parsing issue
  • HIVE-21700 Hive incremental load going OOM while adding load task to the leaf nodes of the DAG.
  • HIVE-21694 Hive driver wait time is fixed for task getting executed in parallel.
  • HIVE-21685 Wrong simplification in query with multiple IN clauses
  • HIVE-19353 Vectorization: ConstantVectorExpression --> RuntimeException: Unexpected column vector type LIST
  • HIVE-21675 CREATE VIEW IF NOT EXISTS broken
  • HIVE-21669 HS2 throws NPE when HiveStatement.getQueryId is invoked and query is closed concurrently
  • HIVE-21061 CTAS query fails with IllegalStateException for empty source
  • HIVE-21400 Vectorization: LazyBinarySerializeWrite allocates Field() within the loop
  • HIVE-21531 Vectorization: all NULL hashcodes are not computed using Murmur3
  • HIVE-21647 Disable TestReplAcidTablesWithJsonMessage and TestReplicationScenariosAcidTables
  • HIVE-18702 INSERT OVERWRITE TABLE doesn’t clean the table directory before overwriting
  • HIVE-21573 Binary transport shall ignore principal if auth is set to delegationToken
  • HIVE-21372 Use Apache Commons IO To Read Stream To String
  • HIVE-21509 LLAP may cache corrupted column vectors and return wrong query result
  • HIVE-21386 Extend the fetch task enhancement done in HIVE-21279 to make it work with query result cache
  • HIVE-21377 Using Oracle as HMS DB with DirectSQL
  • HIVE-21499 should not remove the function from registry if create command failed with AlreadyExistsException
  • HIVE-21518 GenericUDFOPNotEqualNS does not run in LLAP
  • HIVE-21402 Compaction state remains ‘working’ when major compaction fails
  • HIVE-21230 LEFT OUTER JOIN does not generate transitive IS NOT NULL filter on right side
  • HIVE-21316 Comparison of varchar column and string literal should happen in varchar
  • HIVE-21517 Fix AggregateStatsCache
  • HIVE-21544 Constant propagation corrupts coalesce/case/when expressions during folding
  • HIVE-21467 Remove deprecated junit.framework.Assert imports
  • HIVE-21455 Too verbose logging in AvroGenericRecordReader
  • HIVE-21478 Metastore cache update shall capture exception
  • HIVE-21496 Automatic sizing of unordered buffer can overflow
  • HIVE-21048 Remove needless org.mortbay.jetty from hadoop exclusions
  • HIVE-21183 Interrupt wait time for FileCacheCleanupThread
  • HIVE-21460 ACID: Load data followed by a select * query results in incorrect results
  • HIVE-21468 Case sensitivity in identifier names for JDBC storage handler
  • HIVE-16924 Support distinct in presence of Group By
  • HIVE-21368 Vectorization: Unnecessary Decimal64 -> HiveDecimal conversion
  • HIVE-21336 Creation of PCS_STATS_IDX fails Oracle when NLS_LENGTH_SEMANTICS=char
  • HIVE-21421 HiveStatement.getQueryId throws NPE when query is not running
  • HIVE-21371 Make NonSyncByteArrayOutputStream Overflow Conscious
  • HIVE-21264 Improvements Around CharTypeInfo
  • HIVE-21339 LLAP: Cache hit also initializes an FS object
  • HIVE-20656 Sensible defaults: Map aggregation memory configs are too aggressive
  • HIVE-19968 UDF exception is not throw out
  • HIVE-21294 Vectorization: 1-reducer Shuffle can skip the object hash functions
  • HIVE-21182 Skip setting up hive scratch dir during planning
  • HIVE-21279 Avoid moving/rename operation in FileSink op for SELECT queries
  • HIVE-18920 CBO: Initialize the Janino providers ahead of 1st query
  • HIVE-21363 Ldap auth issue: group filter match should be case insensitive
  • HIVE-21270 A UDTF to show schema (column names and types) of given query
  • HIVE-21329 Custom Tez runtime unordered output buffer size depending on operator pipeline
  • HIVE-21297 Replace all occurrences of new Long, Boolean, Double etc with the corresponding .valueOf
  • HIVE-21306 Upgrade HttpComponents to the latest versions similar to what Hadoop has done
  • HIVE-21308 Negative forms of variables are not supported in HPL/SQL
  • HIVE-21295 StorageHandler shall convert date to string using Hive convention
  • HIVE-21296 Dropping varchar partition throw exception
  • HIVE-18890 Lower Logging for “Table not found” Error
  • HIVE-685 add UDFquote
  • HIVE-21252 [Trivial] Use String.equals in LazyTimestamp
  • HIVE-21228 Replace all occurrences of new Integer with Integer.valueOf
  • HIVE-21223 CachedStore returns null partition when partition does not exist
  • HIVE-21206 Bootstrap replication is slow as it opens lot of metastore connections
  • HIVE-21009 Adding ability for user to set bind user
  • HIVE-21199 Replace all occurrences of new Byte with Byte.valueOf
  • HIVE-20295 Remove !isNumber check after failed constant interpretation
  • HIVE-20894 Clean Up JDBC HiveQueryResultSet
  • HIVE-21188 SemanticException for query on view with masked table
  • HIVE-21171 Skip creating scratch dirs for tez if RPC is on
  • HIVE-17020 Aggressive RS dedup can incorrectly remove OP tree branch
  • HIVE-11708 Logical operators raises ClassCastExceptions with NULL
  • HIVE-21134 Hive Build Version as UDF
  • HIVE-21148 Remove Use StandardCharsets Where Possible
  • HIVE-21138 Fix some of the alerts raised by lgtm.com
  • HIVE-16907 “INSERT INTO” overwrite old data when destination table encapsulated by backquote
  • HIVE-20419 Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
  • HIVE-20879 Using null in a projection expression leads to CastException
  • HIVE-21099 Do Not Print StackTraces to STDERR in ConditionalResolverMergeFiles
  • HIVE-21107 “Cannot find field” error during dynamically partitioned hash join
  • HIVE-20170 Improve JoinOperator “rows for join key” Logging
  • HIVE-21124 HPL/SQL does not support the CREATE TABLE LIKE statement
  • HIVE-21095 Show create table should not display a time zone for timestamp with local time zone
  • HIVE-21104 PTF with nested structure throws ClassCastException
  • HIVE-21113 For HPL/SQL that contains boolean expression with NOT, incorrect SQL may be generated
  • HIVE-21082 In HPL/SQL, declare statement does not support variable of type character
  • HIVE-20159 Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin
  • HIVE-21085 Materialized views registry starts non-external tez session
  • HIVE-21073 Remove Extra String Object
  • HIVE-20160 Do Not Print StackTraces to STDERR in OperatorFactory
  • HIVE-21033 Forgetting to close operation cuts off any more HiveServer2 output
  • HIVE-20989 JDBC - The GetOperationStatus + log can block query progress via sleep()
  • HIVE-20748 Disable materialized view rewriting when plan pattern is not allowed
  • HIVE-21040 msck does unnecessary file listing at last level of directory tree
  • HIVE-21041 NPE, ParseException in getting schema from logical plan
  • HIVE-20785 Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method
  • HIVE-21021 Scalar subquery with only aggregate in subquery (no group by) has unnecessary sq_count_check branch
  • HIVE-21028 Adding a JDO fetch plan for getTableMeta get_table_meta to avoid race condition
  • HIVE-20961 Retire NVL implementation
  • HIVE-21005 LLAP: Reading more stripes per-split leaks ZlibCodecs
  • HIVE-21018 Grouping/distinct on more than 64 columns should be possible
  • HIVE-20979 Fix memory leak in hive streaming
  • HIVE-21013 JdbcStorageHandler fail to find partition column in Oracle
  • HIVE-20985 If select operator inputs are temporary columns vectorization may reuse some of them as output
  • HIVE-20827 Inconsistent results for empty arrays
  • HIVE-20953 Remove a function from function registry when it can not be added to the metastore when creating it.
  • HIVE-20981 streaming/AbstractRecordWriter leaks HeapMemoryMonitor
  • HIVE-18902 Lower Logging Level for Cleaning Up “local RawStore”
  • HIVE-19403 Demote ‘Pattern’ Logging
  • HIVE-19846 Removed Deprecated Calls From FileUtils-getJarFilesByPath
  • HIVE-20161 Do Not Print StackTraces to STDERR in ParseDriver
  • HIVE-20239 Do Not Print StackTraces to STDERR in MapJoinProcessor
  • HIVE-20831 Add Session ID to Operation Logging
  • HIVE-20978 “hive.jdbc.*” should add to sqlStdAuthSafeVarNameRegexes
  • HIVE-20976 JDBC queries containing joins gives wrong results
  • HIVE-20873 Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
  • HIVE-20930 VectorCoalesce in FILTER mode doesn’t take effect
  • HIVE-20940 Bridge cases in which Calcite’s type resolution is more stricter than Hive.
  • HIVE-20949 Improve PKFK cardinality estimation in physical planning
  • HIVE-20952 Cleaning VectorizationContext.java
  • HIVE-20818 Views created with a WHERE subquery will regard views referenced in the subquery as direct input
  • HIVE-20937 Postgres jdbc query fail with “LIMIT must not be negative”
  • HIVE-14557 Nullpointer When both SkewJoin and Mapjoin Enabled
  • HIVE-20918 Flag to enable/disable pushdown of computation from Calcite into JDBC connection
  • HIVE-20888 TxnHandler: sort() called on immutable lists
  • HIVE-20910 Insert in bucketed table fails due to dynamic partition sort optimization
  • HIVE-20905 querying streaming table fails with out of memory exception
  • HIVE-20676 HiveServer2: PrivilegeSynchronizer is not set to daemon status
  • HIVE-19701 getDelegationTokenFromMetaStore doesn’t need to be synchronized
  • HIVE-20682 Async query execution can potentially fail if shared sessionHive is closed by master thread
  • HIVE-20893 Fix thread safety issue for bloomK probing filter
  • HIVE-20881 Constant propagation oversimplifies projections
  • HIVE-20886 Fix NPE: GenericUDFLower
  • HIVE-20813 udf to_epoch_milli need to support timestamp without time zone as well.
  • HIVE-16839 Unbalanced calls to openTransaction/commitTransaction when alter the same partition concurrently
  • HIVE-20868 SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor
  • HIVE-20858 Serializer is not correctly initialized with configuration in Utilities.createEmptyBuckets
  • HIVE-20486 Vectorization support for Kafka Storage Handler
  • HIVE-20839 “Cannot find field” error during dynamically partitioned hash join
  • HIVE-20796 jdbc URL can contain sensitive information that should not be logged
  • HIVE-20805 Hive does not copy source data when importing as non-hive user
  • HIVE-20817 Reading Timestamp datatype via HiveServer2 gives errors
  • HIVE-20834 Hive QueryResultCache entries keeping reference to SemanticAnalyzer from cached query
  • HIVE-20815 JdbcRecordReader.next shall not eat exception
  • HIVE-20821 Rewrite SUM0 into SUM + COALESCE combination
  • HIVE-20617 Fix type of constants in IN expressions to have correct type
  • HIVE-20830 JdbcStorageHandler range query assertion failure in some cases
  • HIVE-20829 JdbcStorageHandler range split throws NPE
  • HIVE-20820 MV partition on clause position
  • HIVE-20792 Inserting timestamp with zones truncates the data
  • HIVE-20638 Upgrade version of Jetty to 9.3.25.v20180904
  • HIVE-14516 OrcInputFormat.SplitGenerator.callInternal() can be optimized
  • HIVE-18876 Remove Superfluous Logging in Driver
  • HIVE-20490 UDAF: Add an ‘approx_distinct’ to Hive
  • HIVE-20768 Adding Tumbling Window UDF
  • HIVE-20763 Add google cloud storage (gs) to the exim uri schema whitelist
  • HIVE-20762 NOTIFICATION_LOG cleanup interval is hardcoded as 60s and is too small
  • HIVE-20477 OptimizedSql is not shown if the expression contains INs
  • HIVE-20720 Add partition column option to JDBC handler
  • HIVE-20735 Adding Support for Kerberos Auth, Removed start/end offset columns, remove the best effort mode and made 2pc default for EOS
  • HIVE-20761 Select for update on notification_sequence table has retry interval and retries count too small
  • HIVE-20731 keystore file in JdbcStorageHandler should be authorized (Add missing file)
  • HIVE-20731 keystore file in JdbcStorageHandler should be authorized
  • HIVE-20509 Plan: fix wasted memory in plans with large partition counts
  • HIVE-20714 SHOW tblproperties for a single property returns the value in the name column
  • HIVE-20649 LLAP aware memory manager for Orc writers
  • HIVE-20696 msck_*.q tests are broken
  • HIVE-20702 Account for overhead from datastructure aware estimations during mapjoin selection
  • HIVE-20704 Extend HivePreFilteringRule to support other functions
  • HIVE-20385 Date: date + int fails to add days
  • HIVE-20644 Avoid exposing sensitive information through a Hive Runtime exception
  • HIVE-20712 HivePointLookupOptimizer should extract deep cases
  • HIVE-20705 Vectorization: Native Vector MapJoin doesn’t support Complex Big Table values
  • HIVE-20678 HiveHBaseTableOutputFormat should implement HiveOutputFormat to ensure compatibility
  • HIVE-20710 Constant folding may not create null constants without types
  • HIVE-20639 Add ability to Write Data from Hive Table/Query to Kafka Topic
  • HIVE-20648 LLAP: Vector group by operator should use memory per executor
  • HIVE-20711 Race Condition when Multi-Threading in SessionState.createRootHDFSDir
  • HIVE-20692 Enable folding of NOT x IS (NOT) [TRUE|FALSE] expressions
  • HIVE-20623 Shared work: Extend sharing of map-join cache entries in LLAP
  • HIVE-20651 JdbcStorageHandler password should be encrypted
  • HIVE-14431 Recognize COALESCE as CASE
  • HIVE-20646 Partition filter condition is not pushed down to metastore query if it has IS NOT NULL
  • HIVE-20652 JdbcStorageHandler push join of two different datasource to jdbc driver
  • HIVE-20563 Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type are different
  • HIVE-20544 TOpenSessionReq logs password and username
  • HIVE-20691 Fix org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cttl]
  • HIVE-20338 LLAP: Force synthetic file-id for filesystems which have HDFS protocol impls with POSIX mutation semantics
  • HIVE-20657 pre-allocate LLAP cache at init time
  • HIVE-20609 Create SSD cache dir if it doesn’t exist already
  • HIVE-20618 During join selection BucketMapJoin might be chosen for non bucketed tables
  • HIVE-10296 Cast exception observed when hive runs a multi join query on metastore (postgres), since postgres pushes the filter into the join, and ignores the condition before applying cast
  • HIVE-20619 Include MultiDelimitSerDe in HiveServer2 By Default
  • HIVE-20637 Allow any udfs with 0 arguments or with constant arguments as part of default clause
  • HIVE-20552 Get Schema from LogicalPlan faster
  • HIVE-20627 Concurrent async queries intermittently fails with LockException and cause memory leak
  • HIVE-19302 Logging Too Verbose For TableNotFound
  • HIVE-20540 Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II
  • HIVE-18871 hive on tez execution error due to set hive.aux.jars.path to hdfs://
  • HIVE-20603 “Wrong FS” error when inserting to partition after changing table location filesystem
  • HIVE-20620 manifest collisions when inserting into bucketed sorted MM tables with dynamic partitioning
  • HIVE-20625 Regex patterns not working in SHOW MATERIALIZED VIEWS
  • HIVE-20095 Fix feature to push computation to jdbc external tables
  • HIVE-20621 GetOperationStatus called in resultset.next causing incremental slowness
  • HIVE-20568 There is no need to convert the dbname to pattern while pulling tablemeta
  • HIVE-20507 Beeline: Add a utility command to retrieve all uris from beeline-site.xml
  • HIVE-20498 Support date type for column stats autogather
  • HIVE-20536 Add Surrogate Keys function to Hive
  • HIVE-20570 Fix plan for query with hive.optimize.union.remove set to true
  • HIVE-20583 Use canonical hostname only for kerberos auth in HiveConnection
  • HIVE-20561 Use the position of the Kafka Consumer to track progress instead of Consumer Records offsets
  • HIVE-20494 GenericUDFRestrictInformationSchema is broken after HIVE-19440
  • HIVE-20163 Simplify StringSubstrColStart Initialization
  • HIVE-20558 Change default of hive.hashtable.key.count.adjustment to 0.99
  • HIVE-20462 “CREATE VIEW IF NOT EXISTS” fails if view already exists
  • HIVE-20524 Schema Evolution checking is broken in going from Hive version 2 to version 3 for ALTER TABLE VARCHAR to DECIMAL
  • HIVE-20537 Multi-column joins estimates with uncorrelated columns different in CBO and Hive
  • HIVE-20541 REPL DUMP on external table with add partition event throws NoSuchElementException
  • HIVE-20503 Use datastructure aware estimations during mapjoin selection
  • HIVE-20412 NPE in HiveMetaHook
  • HIVE-20471 issues getting the default database path
  • HIVE-18038 org.apache.hadoop.hive.ql.session.OperationLog - Review
  • HIVE-20296 Improve HivePointLookupOptimizerRule to be able to extract from more sophisticated contexts
  • HIVE-17921 Aggregation with struct in LLAP produces wrong result
  • HIVE-20020 Hive contrib jar should not be in lib
  • HIVE-20481 Add the Kafka Key record as part of the row
  • HIVE-20514 Query with outer join filter is failing with dynamic partition join
  • HIVE-20526 Add test case for HIVE-20489
  • HIVE-20489 Recursive calls to intern path strings causes parse to hang
  • HIVE-20502 Fix NPE while running skewjoin_mapjoin10.q when column stats is used.
  • HIVE-20513 Vectorization: Improve Fast Vector MapJoin Bytes Hash Tables
  • HIVE-20508 Hive does not support user names of type “user@realm”
  • HIVE-20522 HiveFilterSetOpTransposeRule may throw assertion error due to nullability of fields
  • HIVE-20510 Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer
  • HIVE-20515 Empty query results when using results cache and query temp dir, results cache dir in different filesystems
  • HIVE-20499 GetTablesOperation pull all the tables meta irrespective of auth.
  • HIVE-18725 Improve error handling for subqueries if there is wrong column reference
  • HIVE-20236 Clean up printStackTrace() in DDLTask
  • HIVE-15932 Add support for: “explain ast”
  • HIVE-20377 Hive Kafka Storage Handler
  • HIVE-19662 Upgrade Avro to 1.8.2
  • HIVE-19993 Using a table alias which also appears as a column name is not possible
  • HIVE-20432 Rewrite BETWEEN to IN for integer types for stats estimation
  • HIVE-20437 Handle schema evolution from Float, Double and Decimal.
  • HIVE-20491 Fix mapjoin size estimations for Fast implementation
  • HIVE-20395 Parallelize files move in ‘replaceFiles’ method.
  • HIVE-20476 CopyUtils used by REPL LOAD and EXPORT/IMPORT operations ignore distcp error
  • HIVE-20496 Vectorization: Vectorized PTF IllegalStateException
  • HIVE-20466 Improve org.apache.hadoop.hive.ql.exec.FunctionTask Experience
  • HIVE-20225 SerDe to support Teradata Binary Format
  • HIVE-20433 Implicit String to Timestamp conversion is slow
  • HIVE-20465 ProxyFileSystem.listStatusIterator function override required once migrated to Hadoop 3.2.0+
  • HIVE-20467 Allow IF NOT EXISTS/IF EXISTS in Resource plan creation/drop
  • HIVE-20439 addendum
  • HIVE-20439 Use the inflated memory limit during join selection for llap
  • HIVE-20013 Add an Implicit cast to date type for to_date function
  • HIVE-20187 Incorrect query results in hive when hive.convert.join.bucket.mapjoin.tez is set to true
  • HIVE-20455 Log spew from security.authorization.PrivilegeSynchronizer.run
  • HIVE-20315 Vectorization: Fix more NULL / Wrong Results issues and avoid unnecessary casts/conversions
  • HIVE-20339 Vectorization: Lift unneeded restriction causing some PTF with RANK not to be vectorized
  • HIVE-20258 Should Synchronize getInstance in ReplChangeManager
  • HIVE-20352 Vectorization: Support grouping function
  • HIVE-20367 Vectorization: Support streaming for PTF AVG, MAX, MIN, SUM
  • HIVE-20399 CTAS w/a custom table location that is not fully qualified fails for MM tables
  • HIVE-20443 txn stats cleanup in compaction txn handler is unneeded
  • HIVE-20418 LLAP IO may not handle ORC files that have row index disabled correctly for queries with no columns selected
  • HIVE-20409 Hive ACID: Update/delete/merge does not clean hdfs staging directory
  • HIVE-20246 Configurable collecting stats by using DO_NOT_UPDATE_STATS table property
  • HIVE-20237 Do Not Print StackTraces to STDERR in HiveMetaStore
  • HIVE-20366 TPC-DS query78 stats estimates are off for is null filter
  • HIVE-17979 Tez: Improve ReduceRecordSource passDownKey copying
  • HIVE-20406 Addendum patch
  • HIVE-20406 Nested Coalesce giving incorrect results
  • HIVE-20368 Remove VectorTopNKeyOperator lock
  • HIVE-20410 aborted Insert Overwrite on transactional table causes “Not enough history available for…” error
  • HIVE-20400 create table should always use a fully qualified path to avoid potential FS ambiguity
  • HIVE-20321 Vectorization: Cut down memory size of 1 col VectorHashKeyWrapper to <1 CacheLine
  • HIVE-19254 NumberFormatException in MetaStoreUtils.isFastStatsSame
  • HIVE-20391 HiveAggregateReduceFunctionsRule may infer wrong return type when decomposing aggregate function
  • HIVE-14898 HS2 shouldn’t log callstack for an empty auth header error
  • HIVE-20389 NPE in SessionStateUserAuthenticator when authenticator=SessionStateUserAuthenticator
  • HIVE-18620 Improve error message while dropping a table that is part of a materialized view
  • HIVE-20345 Drop database may hang if the tables get deleted from a different call
  • HIVE-20379 Rewriting with partitioned materialized views may reference wrong column
  • HIVE-19316 StatsTask fails due to ClassCastException
  • HIVE-20329 Long running repl load (incr/bootstrap) causing OOM error
  • HIVE-19924 Tag distcp jobs run by Repl Load
  • HIVE-20354 Semijoin hints don’t work with merge statements
  • HIVE-20350 Unnecessary value assignment
  • HIVE-20340 Druid Needs Explicit CASTs from Timestamp to STRING when the output of timestamp function is used as String
  • HIVE-20344 PrivilegeSynchronizer for SBA might hit AccessControlException
  • HIVE-20316 Skip external table file listing for create table event
  • HIVE-20336 Masking and filtering policies for materialized views
  • HIVE-20337 CachedStore: getPartitionsByExpr is not populating the partition list correctly
  • HIVE-20279 HiveContextAwareRecordReader slows down Druid Scan queries.
  • HIVE-20136 Code Review of ArchiveUtils Class
  • HIVE-20335 Add tests for materialized view rewriting with composite aggregation functions
  • HIVE-20326 Create constraints with RELY as default instead of NO RELY
  • HIVE-19408 Improve show materialized views statement to show more information about invalidation
  • HIVE-20118 SessionStateUserAuthenticator.getGroupNames() is always empty
  • HIVE-20290 Lazy initialize ArrowColumnarBatchSerDe so it doesn’t allocate buffers during GetSplits
  • HIVE-20278 Druid Scan Query avoid copying from List -> Map -> List
  • HIVE-20277 Vectorization: Case expressions that return BOOLEAN are not supported for FILTER
  • HIVE-19937 Intern fields in MapWork on deserialization
  • HIVE-19097 related equals and in operators may cause inaccurate stats estimations
  • HIVE-20162 Logging cleanup, avoid printing stacktraces to stderr
  • HIVE-20314 Include partition pruning in materialized view rewriting
  • HIVE-20301 Enable vectorization for materialized view rewriting tests
  • HIVE-20302 LLAP: non-vectorized execution in IO ignores virtual columns, including ROW__ID
  • HIVE-20294 Vectorization: Fix NULL / Wrong Results issues in COALESCE / ELT
  • HIVE-20274 HiveServer2 ObjectInspectorFactory leaks for Struct and List object inspectors
  • HIVE-20166 LazyBinaryStruct Warn Level Logging
  • HIVE-20281 SharedWorkOptimizer fails with ‘operator cache contents and actual plan differ’
  • HIVE-20169 Print Final Rows Processed in MapOperator
  • HIVE-20239 Do Not Print StackTraces to STDERR in MapJoinProcessor
  • HIVE-20260 NDV of a column shouldn’t be scaled when row count is changed by filter on another column
  • HIVE-14493 Partitioning support for materialized views
  • HIVE-18201 Disable XPROD_EDGE for sq_count_check() created for scalar subqueries
  • HIVE-20130 Better logging for information schema synchronizer
  • HIVE-20244 forward port HIVE-19704 to master
  • HIVE-20101 BloomKFilter: Avoid using the local byte[] arrays entirely
  • HIVE-19199 ACID: DbTxnManager heartbeat-service needs static sync init
  • HIVE-20040 JDBC: HTTP listen queue is 50 and SYNs are lost
  • HIVE-20177 Vectorization: Reduce KeyWrapper allocation in GroupBy Streaming mode
  • HIVE-19694 Create Materialized View statement should check for MV name conflicts before running MV’s SQL statement.
  • HIVE-20245 Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
  • HIVE-20210 Simple Fetch optimizer should lead to MapReduce when filter on non-partition column and conversion is minimal
  • HIVE-20247 cleanup issues in LLAP IO after cache OOM
  • HIVE-20249 LLAP IO: NPE during refCount decrement
  • HIVE-19809 Remove Deprecated Code From Utilities Class
  • HIVE-20168 ReduceSinkOperator Logging Hidden
  • HIVE-20263 Typo in HiveReduceExpressionsWithStatsRule variable
  • HIVE-20209 Metastore connection fails for first attempt in repl dump
  • HIVE-20221 Increase column width for partition_params
  • HIVE-19770 Support for CBO for queries with multiple same columns in select
  • HIVE-19181 Remove BreakableService (unused class)
  • HIVE-20035 write booleans as long when serializing to druid
  • HIVE-20105 Druid-Hive: tpcds query on timestamp throws java.lang.IllegalArgumentException: Cannot create timestamp, parsing error
  • HIVE-18729 Druid Time column type
  • HIVE-20213 Upgrade Calcite to 1.17.0
  • HIVE-20212 Hiveserver2 in http mode emitting metric default.General.open_connections incorrectly
  • HIVE-20153 Count and Sum UDF consume more memory in Hive 2+
  • HIVE-20240 Semijoin Reduction : Use local variable to check for external table condition
  • HIVE-20156 Avoid printing exception stacktrace to STDERR
  • HIVE-20228 configure repl configuration directories based on user running hiveserver2
  • HIVE-18929 The method humanReadableInt in HiveStringUtils.java has a race condition.
  • HIVE-20158 Do Not Print StackTraces to STDERR in Base64TextOutputFormat
  • HIVE-20242 Query results cache: Improve ability of queries to use pending query results
  • HIVE-20203 Arrow SerDe leaks a DirectByteBuffer
  • HIVE-20015 Populate ArrayList with Constructor
  • HIVE-20207 Vectorization: Fix NULL / Wrong Results issues in Filter / Compare
  • HIVE-18852 Misleading error message in alter table validation
  • HIVE-19935 Hive WM session killed: Failed to update LLAP tasks count
  • HIVE-20082 HiveDecimal to string conversion doesn’t format the decimal correctly
  • HIVE-20164 Murmur Hash : Make sure CTAS and IAS use correct bucketing version
  • HIVE-19891 inserting into external tables with custom partition directories may cause data loss
  • HIVE-17683 Add explain locks command
  • HIVE-20192 HS2 with embedded metastore is leaking JDOPersistenceManager objects
  • HIVE-20204 Type conversion during IN () comparisons is using different rules from other comparison operations
  • HIVE-20218 make sure Statement.executeUpdate() returns number of rows affected
  • HIVE-20127 fix some issues with LLAP Parquet cache
  • HIVE-4367 enhance TRUNCATE syntax to drop data of external table
  • HIVE-17896 (addendum) order3.q fails with NullPointerException if hive.cbo.enable=false and hive.optimize.topnkey=true
  • HIVE-17896 TopNKey: Create a standalone vectorizable TopNKey operator
  • HIVE-20149 TestHiveCli failing/timing out
  • HIVE-19360 CBO: Add an “optimizedSQL” to QueryPlan object
  • HIVE-20201 Hive HBaseHandler code should not use deprecated Base64 implementation
  • HIVE-20120 Incremental repl load DAG generation is causing OOM error
  • HIVE-20183 Inserting from bucketed table can cause data loss, if the source table contains empty bucket
  • HIVE-20197 Vectorization: Add DECIMAL_64 testing, add Date/Interval/Timestamp arithmetic, and add more GROUP BY Aggregation tests
  • HIVE-20172 StatsUpdater failed with GSS Exception while trying to connect to remote metastore
  • HIVE-20165 Enable ZLIB for streaming ingest
  • HIVE-20116 TezTask is using parent logger
  • HIVE-20152 reset db state, when repl dump fails, so rename table can be done
  • HIVE-19668 Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken’s and duplicate strings
  • HIVE-19940 Push predicates with deterministic UDFs with RBO
  • HIVE-20174 Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation Functions
  • HIVE-19992 Vectorization: Follow-on to HIVE-19951 --> add call to SchemaEvolution.isOnlyImplicitConversion to disable encoded LLAP I/O for ORC only when data type conversion is not implicit
  • HIVE-20185 Backport HIVE-20111 to branch-3
  • HIVE-20147 Hive streaming ingest is contended on synchronized logging
  • HIVE-20041 ResultsCache: Improve logging for concurrent queries
  • HIVE-20091 Tez: Add security credentials for FileSinkOperator output
  • HIVE-19375 Bad message: 'transactional'='false' is no longer a valid property and will be ignored
  • HIVE-20019 Ban commons-logging and log4j
  • HIVE-20088 Beeline config location path is assembled incorrectly
  • HIVE-15974 Support real, double precision and numeric data types
  • HIVE-20069 Fix reoptimization in case of DPP and Semijoin optimization
  • HIVE-19387 Truncate table for Acid tables conflicts with ResultSet cache
  • HIVE-20093 LlapOutputFormatService: Use ArrowBuf with Netty for Accounting
  • HIVE-20129 Revert to position based schema evolution for orc tables
  • HIVE-19765 Add Parquet specific tests to BlobstoreCliDriver
  • HIVE-18545 Add UDF to parse complex types from json
  • HIVE-20100 OpTraits : Select Optraits should stop when a mismatch is detected
  • HIVE-20043 HiveServer2: SessionState has a static sync block around an AtomicBoolean
  • HIVE-20099 Fix logger for LlapServlet
  • HIVE-20184 Backport HIVE-20085 to branch-3
  • HIVE-20098 Statistics: NPE when getting Date column partition statistics
  • HIVE-20182 Backport HIVE-20067 to branch-3
  • HIVE-19850 Dynamic partition pruning in Tez is leading to ‘No work found for tablescan’ error
  • HIVE-20066 hive.load.data.owner is compared to full principal
  • HIVE-20039 Bucket pruning: Left Outer Join on bucketed table gives wrong result
  • HIVE-19860 HiveServer2 ObjectInspectorFactory memory leak with cachedUnionStructObjectInspector
  • HIVE-19326 stats auto gather: incorrect aggregation during UNION queries
  • HIVE-20051 Skip authorization for temp tables
  • HIVE-17840 HiveMetaStore eats exception if transactionalListeners.notifyEvent fail
  • HIVE-20059 Hive streaming should try shade prefix unconditionally on exception
  • HIVE-19951 Vectorization: Need to disable encoded LLAP I/O for ORC when there is data type conversion
  • HIVE-19812 Disable external table replication by default via a configuration property
  • HIVE-20008 Fix second compilation errors in ql
  • HIVE-20038 Update queries on non-bucketed + partitioned tables throws NPE
  • HIVE-19995 Aggregate row traffic for acid tables
  • HIVE-19970 Replication dump has a NPE when table is empty
  • HIVE-20028 Metastore client cache config is used incorrectly
  • HIVE-20004 Wrong scale used by ConvertDecimal64ToDecimal results in incorrect results
  • HIVE-20009 Fix runtime stats for merge statement
  • HIVE-19989 Metastore uses wrong application name for HADOOP2 metrics
  • HIVE-19967 SMB Join : Need Optraits for PTFOperator ala GBY Op
  • HIVE-20011 Move away from append mode in proto logging hook
  • HIVE-18786 NPE in Hive windowing functions
  • HIVE-19404 Revise DDL Task Result Logging
  • HIVE-19829 Incremental replication load should create tasks in execution phase rather than semantic phase
  • HIVE-18140 Partitioned tables statistics can go wrong in basic stats mixed case
  • HIVE-19981 Managed tables converted to external tables by the HiveStrictManagedMigration utility should be set to delete data when the table is dropped
  • HIVE-19888 Misleading “METASTORE_FILTER_HOOK will be ignored” warning from SessionState
  • HIVE-19948 HiveCli is not splitting the command by semicolon properly if quotes are inside the string
  • HIVE-19564 Vectorization: Fix NULL / Wrong Results issues in Arithmetic
  • HIVE-19783 Retrieve only locations in HiveMetaStore.dropPartitionsAndGetLocations
  • HIVE-19870 HCatalog dynamic partition query can fail, if the table path is managed by Sentry
  • HIVE-19718 Adding partitions in bulk also fetches table for each partition
  • HIVE-19866 improve LLAP cache purge
  • HIVE-19663 refactor LLAP IO report generation
  • HIVE-19203 Thread-Safety Issue in HiveMetaStore
  • HIVE-16505 Support “unknown” boolean truth value
  • HIVE-19759 Flaky test: TestRpc#testServerPort
  • HIVE-19432 GetTablesOperation is too slow if the hive has too many databases and tables
  • HIVE-19524 pom.xml typo: “commmons-logging” groupId
  • HIVE-6980 Drop table by using direct sql
  • HIVE-19579 remove HBase transitive dependency that drags in some snapshot
  • HIVE-19609 pointless callstacks in the logs as usual
  • HIVE-19424 NPE In MetaDataFormatters
  • HIVE-18906 Lower Logging for “Using direct SQL”
  • HIVE-19041 Thrift deserialization of Partition objects should intern fields
  • HIVE-18881 Lower Logging for FSStatsAggregator
  • HIVE-18903 Lower Logging Level for ObjectStore
  • HIVE-18880 Change Log to Debug in CombineHiveInputFormat
  • HIVE-19285 Add logs to the subclasses of MetaDataOperation
  • HIVE-18986 Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns
  • HIVE-19204 Detailed errors from some tasks are not displayed to the client because the tasks don’t set exception when they fail
  • HIVE-19104 When test MetaStore is started with retry the instances should be independent
  • HIVE-16861 MapredParquetOutputFormat - Save Some Array Allocations
  • HIVE-18827 useless dynamic value exceptions strike back
  • HIVE-19133 HS2 WebUI phase-wise performance metrics not showing correctly
  • HIVE-19263 Improve ugly exception handling in HiveMetaStore
  • HIVE-19265 Potential NPE and hiding actual exception in Hive#copyFiles
  • HIVE-19158 Fix NPE in the HiveMetastore add partition tests
  • HIVE-24331 Add Jenkinsfile for branch-3.1 (#1626)
  • HIVE-19170 Fix TestMiniDruidKafkaCliDriver – addendum patch
  • HIVE-19170 Fix TestMiniDruidKafkaCliDriver
  • HIVE-23323 Add qsplits profile
  • HIVE-23044 Make sure Cleaner doesn’t delete delta directories for running queries
  • HIVE-23088 Using Strings from log4j breaks non-log4j users
  • HIVE-22704 Distribution package incorrectly ships the upgrade.order files from the metastore module
  • HIVE-22708 Fix for HttpTransport to replace String.equals
  • HIVE-22407 Hive metastore upgrade scripts have incorrect (or outdated) comment syntax
  • HIVE-22241 Implement UDF to interpret date/timestamp using its internal representation and Gregorian-Julian hybrid calendar
  • HIVE-21508 ClassCastException when initializing HiveMetaStoreClient on JDK10 or newer
  • HIVE-19667 Remove distribution management tag from pom.xml
  • HIVE-21980 Parsing time can be high in case of deeply nested subqueries
  • HIVE-22105 Update ORC to 1.5.6 in branch-3
  • HIVE-20057 For ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'); TBL_TYPE attribute change not reflecting for non-CAPS
  • HIVE-21872 Bucketed tables that load data from data/files/auto_sortmerge_join should be tagged as 'bucketing_version'='1'
  • HIVE-18874 JDBC: HiveConnection shades log4j interfaces
  • HIVE-21821 Backport HIVE-21739 to branch-3.1
  • HIVE-21786 Update repo URLs in poms - branch 3.1 version
  • HIVE-21755 Backport HIVE-21462 to branch-3 Upgrading SQL server backed metastore when changing data type of a column with constraints
  • HIVE-21758 DBInstall tests broken on master and branch-3.1
  • HIVE-21291 Restore historical way of handling timestamps in Avro while keeping the new semantics at the same time
  • HIVE-21564 Load data into a bucketed table is ignoring partitions specs and loads data into default partition
  • HIVE-20593 Load Data for partitioned ACID tables fails with bucketId out of range: -1
  • HIVE-21600 GenTezUtils.removeSemiJoinOperator may throw out of bounds exception for TS with multiple children
  • HIVE-21613 Queries with join condition having timestamp or timestamp with local time zone literal throw SemanticException
  • HIVE-18624 Parsing time is extremely high (~10 min) for queries with complex select expressions
  • HIVE-21540 Query with join condition having date literal throws SemanticException
  • HIVE-21342 Analyze compute stats for column leave behind staging dir on hdfs
  • HIVE-21290 Restore historical way of handling timestamps in Parquet while keeping the new semantics at the same time
  • HIVE-20126 OrcInputFormat does not pass conf to orc reader options
  • HIVE-21376 Incompatible change in Hive bucket computation
  • HIVE-21236 SharedWorkOptimizer should check table properties
  • HIVE-21156 SharedWorkOptimizer may preserve filter in TS incorrectly
  • HIVE-21039 CURRENT_TIMESTAMP returns value in UTC time zone
  • HIVE-20010 Fix create view over literals
  • HIVE-20420 Provide a fallback authorizer when no other authorizer is in use
  • HIVE-18767 Some alterPartitions invocations throw ‘NumberFormatException: null’
  • HIVE-18778 Needs to capture input/output entities in explain
  • HIVE-20555 HiveServer2: Preauthenticated subject for http transport is not retained for entire duration of http communication in some cases
  • HIVE-20227 Exclude glassfish javax.el dependency
  • HIVE-19027 Make materializations invalidation cache work with multiple active remote metastores
  • HIVE-20102 Add a couple of additional tests for query parsing
  • HIVE-20123 Fix masking tests after HIVE-19617
  • HIVE-20076 ACID: Fix Synthetic ROW__ID generation for vectorized orc readers
  • HIVE-20135 Fix incompatible change in TimestampColumnVector to default to UTC