Release 1.3: 2021-??-??

MR3

  • DAGAppMaster correctly reports to MR3Client the time from DAG submission to DAG execution.
  • Introduce mr3.container.localize.python.working.dir.unsafe to localize Python scripts in working directories of ContainerWorkers. Localizing Python scripts is an unsafe operation: 1) Python scripts are shared by all DAGs; 2) once localized, Python scripts are not deleted.
  • The image pull policy specified in mr3.k8s.pod.image.pull.policy applies to init containers as well as ContainerWorker containers.
  • Introduce mr3.auto.scale.out.num.initial.containers which specifies the number of new ContainerWorkers to create in a scale-out operation when no ContainerWorkers are running.
  • Introduce mr3.container.runtime.auto.start.input to automatically start LogicalInputs in RuntimeTasks.
  • On both Hadoop and Kubernetes, there is no limit on the aggregate memory of ContainerWorkers, so MR3 can run in a cluster of any size.
  • Speculative execution works on Vertexes with a single Task.

Hive on MR3

  • Auto parallelism is correctly enabled or disabled according to the result of compiling queries. This is done by overriding tez.shuffle-vertex-manager.enable.auto-parallel, so the key itself can be set to false.
  • Support the TRANSFORM clause with Python scripts (with mr3.container.localize.python.working.dir.unsafe set to true in mr3-site.xml).
  • Introduce hive.mr3.llap.orc.memory.per.thread.mb to specify the memory allocated to each ORC manager in low-level LLAP I/O threads.
  • Support Hive 3.1.3.
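
As an illustration of the TRANSFORM support listed above, here is a minimal sketch of a Python script that such a query might invoke. It assumes Hive's default streaming format, where each input row arrives on standard input as one line of tab-separated columns and each output row is written to standard output in the same form. The two-column layout and the names transform_line and main are hypothetical and are not taken from MR3 or Hive.

```python
import io
import sys


def transform_line(line):
    """Upper-case the second column of one tab-separated row (hypothetical logic)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) >= 2:
        cols[1] = cols[1].upper()
    return "\t".join(cols)


def main(stdin, stdout):
    """Read rows from stdin and write one transformed row per line to stdout."""
    for line in stdin:
        stdout.write(transform_line(line) + "\n")


# In-memory demonstration only. A script used with TRANSFORM would
# instead end with: main(sys.stdin, sys.stdout)
demo_out = io.StringIO()
main(io.StringIO("1\talice\n2\tbob\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Keeping the per-row logic in its own function makes the script easy to check locally before it is localized to ContainerWorkers, which matters because localized Python scripts are shared by all DAGs and are not deleted.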

Release 1.2: 2020-10-26

MR3

  • Introduce mr3.k8s.pod.worker.init.container.command to execute a shell command in a privileged init container.
  • Introduce mr3.k8s.pod.master.toleration.specs and mr3.k8s.pod.worker.toleration.specs to specify tolerations for DAGAppMaster and ContainerWorker Pods.
  • Setting mr3.dag.queue.scheme to individual properly implements fair scheduling among concurrent DAGs.
  • Introduce mr3.k8s.pod.worker.additional.hostpaths to mount additional hostPath volumes.
  • mr3.k8s.worker.total.max.memory.gb and mr3.k8s.worker.total.max.cpu.cores work correctly when autoscaling is enabled.
  • DAGAppMaster and ContainerWorkers can publish Prometheus metrics.
  • The default value of mr3.container.task.failure.num.sleeps is 0.
  • Reduce the log size of DAGAppMaster and ContainerWorker.
  • TaskScheduler can process about twice as many events (TaskSchedulerEventTaskAttemptFinished) per unit time as in MR3 1.1, thus doubling the maximum cluster size that MR3 can manage.
  • Optimize the use of CodecPool shared by concurrent TaskAttempts.
  • The getDags command of MasterControl prints both IDs and names of DAGs.
  • On Kubernetes, the updateResourceLimit command of MasterControl updates the limit on the total resources for all ContainerWorker Pods. The user can further improve resource utilization when autoscaling is enabled.

Hive on MR3

  • Compute the memory size of ContainerWorker correctly when hive.llap.io.allocator.mmap is set to true.
  • Hive expands all system properties in configuration files (such as core-site.xml) before passing to MR3.
  • hive.server2.transport.mode can be set to all (with HIVE-5312).
  • MR3 creates three ServiceAccounts: 1) for Metastore and HiveServer2 Pods; 2) for DAGAppMaster Pod; 3) for ContainerWorker Pods. The user can use IAM roles for ServiceAccounts.
  • Docker containers start as root. In kubernetes/env.sh, DOCKER_USER should be set to root and the service principal name in HIVE_SERVER2_KERBEROS_PRINCIPAL should be root.
  • Support Ranger 2.0.0 and 2.1.0.

Release 1.1: 2020-7-19

MR3

  • Support DAG scheduling schemes (specified by mr3.dag.queue.scheme).
  • Optimize DAGAppMaster by freeing memory for messages to Tasks when fault tolerance is disabled (with mr3.am.task.max.failed.attempts set to 1).
  • Fix a minor memory leak in DaemonTask (which also prevents MR3 from running more than 2^30 DAGs when using the shuffle handler).
  • Improve the chance of assigning TaskAttempts to ContainerWorkers that match location hints.
  • TaskScheduler can use location hints produced by ONE_TO_ONE edges.
  • TaskScheduler can use location hints from HDFS when assigning TaskAttempts to ContainerWorker Pods on Kubernetes (with mr3.convert.container.address.host.name).
  • Introduce mr3.k8s.pod.cpu.cores.max.multiplier to specify the multiplier for the limit of CPU cores.
  • Introduce mr3.k8s.pod.memory.max.multiplier to specify the multiplier for the limit of memory.
  • Introduce mr3.k8s.pod.worker.security.context.sysctls to configure kernel parameters of ContainerWorker Pods using init containers.
  • Support speculative execution of TaskAttempts (with mr3.am.task.concurrent.run.threshold.percent).
  • A ContainerWorker can run multiple shuffle handlers each with a different port. The configuration key mr3.use.daemon.shufflehandler now specifies the number of shuffle handlers in each ContainerWorker.
  • With speculative execution and the use of multiple shuffle handlers in a single ContainerWorker, fetch delays rarely occur.
  • A ContainerWorker Pod can run shuffle handlers in a separate container (with mr3.k8s.shuffle.process.ports).
  • On Kubernetes, DAGAppMaster uses ReplicationController instead of Pod, thus making recovery much faster.
  • On Kubernetes, ConfigMaps mr3conf-configmap-master and mr3conf-configmap-worker survive MR3, so the user should delete them manually.
  • Java 8u251/8u252 can be used on Kubernetes 1.17 and later.

Hive on MR3

  • CrossProductHandler asks MR3 DAGAppMaster to set TEZ_CARTESIAN_PRODUCT_MAX_PARALLELISM (Cf. HIVE-16690, Hive 3/4).
  • Hive 4 on MR3 is stable (currently using 4.0.0-SNAPSHOT).
  • No longer support Hive 1.
  • Ranger uses a local directory (emptyDir volume) for logging.
  • The open file limit for Solr (in Ranger) is no longer capped at 1024.
  • HiveServer2 and DAGAppMaster create readiness and liveness probes.

Release 1.0: 2020-2-17

MR3

  • Support DAG priority schemes (specified by mr3.dag.priority.scheme) and Vertex priority schemes (specified by mr3.vertex.priority.scheme).
  • Support secure shuffle (using SSL mode) without requiring separate configuration files.
  • ContainerWorker tries to avoid OutOfMemoryErrors by sleeping after a TaskAttempt fails (specified by mr3.container.task.failure.num.sleeps).
  • Errors from InputInitializers are properly passed to MR3Client.
  • MasterControl supports two new commands for gracefully stopping DAGAppMaster and ContainerWorkers.

Hive on MR3

  • Allow fractions for CPU cores (with hive.mr3.resource.vcores.divisor).
  • Support rolling updates.
  • Hive on MR3 can access S3 using AWS credentials (with or without Helm).
  • On Amazon EKS, the user can use S3 instead of PersistentVolumes on EFS.
  • Hive on MR3 can use environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to access S3 outside Amazon AWS.
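
The fractional CPU cores mentioned above (hive.mr3.resource.vcores.divisor) can be pictured with a small arithmetic sketch. It assumes that the effective number of cores is the configured vcores value divided by the divisor. That reading is an assumption and is not stated in this list. The function name effective_cores is hypothetical.

```python
from fractions import Fraction


def effective_cores(vcores, divisor):
    """Return vcores / divisor as an exact fraction (assumed semantics)."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return Fraction(vcores, divisor)


# Under this assumption, a divisor of 100 lets a value of 50 stand for half a core.
print(float(effective_cores(50, 100)))
```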

Release 0.11: 2019-12-4

MR3

  • Support autoscaling.

Hive on MR3

  • Memory and CPU cores for Tasks can be set to zero.
  • Support autoscaling on Amazon EMR.
  • Support autoscaling on Amazon EKS.

Release 0.10: 2019-10-18

MR3

  • TaskScheduler supports a new scheduling policy (specified by mr3.taskattempt.queue.scheme) which significantly improves the throughput for concurrent queries.
  • DAGAppMaster recovers from OutOfMemoryErrors due to the exhaustion of threads.

Hive on MR3

  • Compaction sends DAGs to MR3, instead of MapReduce, when hive.mr3.compaction.using.mr3 is set to true.
  • LlapDecider asks MR3 DAGAppMaster for the number of Reducers.
  • ConvertJoinMapJoin asks MR3 DAGAppMaster for the current number of Nodes to estimate the cost of Bucket Map Join.
  • Support Hive 3.1.2 and 2.3.6.
  • Support Helm charts.
  • Compaction works correctly on Kubernetes.

Release 0.9: 2019-7-25

MR3

  • Each DAG uses its own ClassLoader.

Hive on MR3

  • LLAP I/O works properly on Kubernetes.
  • UDFs work correctly on Kubernetes.

Release 0.8: 2019-6-22

MR3

  • A new DAGAppMaster properly recovers DAGs that have not been completed in the previous DAGAppMaster.
  • Fault tolerance after fetch failures works much faster.
  • On Kubernetes, the shutdown handler of DAGAppMaster deletes all running Pods.
  • On both Yarn and Kubernetes, MR3Client automatically connects to a new DAGAppMaster after an initial DAGAppMaster is killed.

Hive on MR3

  • Hive 3 for MR3 supports high availability on Yarn via ZooKeeper.
  • On both Yarn and Kubernetes, multiple HiveServer2 instances can share a common MR3 DAGAppMaster (and thus all its ContainerWorkers as well).
  • Support Apache Ranger on Kubernetes.
  • Support Timeline Server on Kubernetes.

Release 0.7: 2019-4-26

MR3

  • Resolve deadlock when Tasks fail or ContainerWorkers are killed.
  • Support fault tolerance after fetch failures.
  • Support node blacklisting.

Hive on MR3

  • Introduce a new configuration key hive.mr3.am.task.max.failed.attempts.
  • Apply HIVE-20618.

Release 0.6: 2019-3-21

MR3

  • DAGAppMaster can run in its own Pod on Kubernetes.
  • Support elastic execution of RuntimeTasks in ContainerWorkers.
  • MR3-UI requires only Timeline Server.

Hive on MR3

  • Support memory monitoring when loading hash tables for Map-side join.

Release 0.5: 2019-2-18

MR3

  • Support Kubernetes.
  • Support the use of the built-in shuffle handler.

Hive on MR3

  • Support Hive 3.1.1 and 2.3.5.
  • Initial release for Hive on MR3 on Kubernetes

Release 0.4: 2018-10-29

MR3

  • Support auto parallelism for reducers with ONE_TO_ONE edges.
  • Auto parallelism can use input statistics when reassigning partitions to reducers.
  • Support ByteBuffer sharing among RuntimeTasks.

Hive on MR3

  • Support Hive 3.1.0.
  • Hive 1 uses Tez 0.9.1.
  • Metastore checks the inclusion of __HIVE_DEFAULT_PARTITION__ when retrieving column statistics.
  • MR3JobMonitor returns immediately from MR3 DAGAppMaster when the DAG completes.

Release 0.3: 2018-8-15

MR3

  • Extend the runtime to support Hive 3.

Hive on MR3

  • Support Hive 3.0.0.
  • Support query re-execution.
  • Support per-query cache in Hive 2 and 3.

Release 0.2: 2018-5-18

MR3

  • Support asynchronous logging (with mr3.async.logging in mr3-site.xml).
  • Delete DAG-local directories after each DAG is finished.

Hive on MR3

  • Support LLAP I/O for Hive 2.
  • Support Hive 2.2.0.
  • Use Hive 2.3.3 instead of Hive 2.3.2.

Release 0.1: 2018-3-31

MR3

  • Initial release

Hive on MR3

  • Initial release

Patches backported to Hive 3 on MR3 since Hive 3.1.0

  • HIVE-5312 Let HiveServer2 run simultaneously in HTTP (over thrift) and Binary (normal thrift transport) mode
  • HIVE-22815 reduce the unnecessary file system object creation in MROutput
  • HIVE-22891 Skip PartitionDesc Extraction In CombineHiveRecord For Non-LLAP Execution Mode
  • HIVE-22485 Cross product should set the conf in UnorderedPartitionedKVEdgeConfig
  • HIVE-21329 Custom Tez runtime unordered output buffer size depending on operator pipeline
  • HIVE-21171 Skip creating scratch dirs for tez if RPC is on
  • HIVE-21041 NPE, ParseException in getting schema from logical plan
  • HIVE-20979 Fix memory leak in hive streaming
  • HIVE-20827 Inconsistent results for empty arrays
  • HIVE-20953 Remove a function from function registry when it can not be added to the metastore when creating it.
  • HIVE-20873 Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
  • HIVE-20937 Postgres jdbc query fail with “LIMIT must not be negative”
  • HIVE-20676 HiveServer2: PrivilegeSynchronizer is not set to daemon status
  • HIVE-19701 getDelegationTokenFromMetaStore doesn’t need to be synchronized
  • HIVE-20881 Constant propagation oversimplifies projections
  • HIVE-20813 udf to_epoch_milli need to support timestamp without time zone as well.
  • HIVE-16839 Unbalanced calls to openTransaction/commitTransaction when alter the same partition concurrently
  • HIVE-20868 SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor
  • HIVE-20817 Reading Timestamp datatype via HiveServer2 gives errors
  • HIVE-20834 Hive QueryResultCache entries keeping reference to SemanticAnalyzer from cached query
  • HIVE-20815 JdbcRecordReader.next shall not eat exception
  • HIVE-20821 Rewrite SUM0 into SUM + COALESCE combination
  • HIVE-20830 JdbcStorageHandler range query assertion failure in some cases
  • HIVE-20829 JdbcStorageHandler range split throws NPE
  • HIVE-20820 MV partition on clause position
  • HIVE-20792 Inserting timestamp with zones truncates the data
  • HIVE-20638 Upgrade version of Jetty to 9.3.25.v20180904
  • HIVE-20763 Add google cloud storage (gs) to the exim uri schema whitelist
  • HIVE-20762 NOTIFICATION_LOG cleanup interval is hardcoded as 60s and is too small
  • HIVE-20720 Add partition column option to JDBC handler
  • HIVE-20761 Select for update on notification_sequence table has retry interval and retries count too small
  • HIVE-20696 msck_*.q tests are broken
  • HIVE-20644 Avoid exposing sensitive infomation through a Hive Runtime exception
  • HIVE-20705 Vectorization: Native Vector MapJoin doesn’t support Complex Big Table values
  • HIVE-20710 Constant folding may not create null constants without types
  • HIVE-20649 LLAP aware memory manager for Orc writers
  • HIVE-20648 LLAP: Vector group by operator should use memory per executor
  • HIVE-20692 Enable folding of NOT x IS (NOT) [TRUE|FALSE] expressions
  • HIVE-20651 JdbcStorageHandler password should be encrypted
  • HIVE-20652 JdbcStorageHandler push join of two different datasource to jdbc driver
  • HIVE-20618 During join selection BucketMapJoin might be choosen for non bucketed tables
  • HIVE-20552 Get Schema from LogicalPlan faster
  • HIVE-18871 hive on tez execution error due to set hive.aux.jars.path to hdfs://
  • HIVE-20620 manifest collisions when inserting into bucketed sorted MM tables with dynamic partitioning
  • HIVE-20625 Regex patterns not working in SHOW MATERIALIZED VIEWS ‘
  • HIVE-20095 Fix feature to push computation to jdbc external tables
  • HIVE-20568 There is no need to convert the dbname to pattern while pulling tablemeta
  • HIVE-20507 Beeline: Add a utility command to retrieve all uris from beeline-site.xml
  • HIVE-20498 Support date type for column stats autogather
  • HIVE-20583 Use canonical hostname only for kerberos auth in HiveConnection
  • HIVE-20558 Change default of hive.hashtable.key.count.adjustment to 0.99
  • Missed files HIVE-20524: Schema Evolution checking is broken in going from Hive version 2 to version 3 for ALTER TABLE VARCHAR to DECIMAL
  • HIVE-20524 Schema Evolution checking is broken in going from Hive version 2 to version 3 for ALTER TABLE VARCHAR to DECIMAL
  • HIVE-20541 REPL DUMP on external table with add partition event throws NoSuchElementException
  • HIVE-20503 Use datastructure aware estimations during mapjoin selection
  • HIVE-20412 NPE in HiveMetaHook
  • HIVE-20296 Improve HivePointLookupOptimizerRule to be able to extract from more sophisticated contexts
  • HIVE-17921 Aggregation with struct in LLAP produces wrong result
  • HIVE-20513 Vectorization: Improve Fast Vector MapJoin Bytes Hash Tables
  • HIVE-20508 Hive does not support user names of type “user@realm”
  • HIVE-20522 HiveFilterSetOpTransposeRule may throw assertion error due to nullability of fields
  • HIVE-20510 Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer
  • HIVE-20515 Empty query results when using results cache and query temp dir, results cache dir in different filesystems
  • HIVE-20499 GetTablesOperation pull all the tables meta irrespective of auth.
  • HIVE-15932 Add support for: “explain ast”
  • HIVE-19993 Using a table alias which also appears as a column name is not possible
  • HIVE-20491 Fix mapjoin size estimations for Fast implementation
  • HIVE-20476 CopyUtils used by REPL LOAD and EXPORT/IMPORT operations ignore distcp error
  • HIVE-20496 Vectorization: Vectorized PTF IllegalStateException
  • HIVE-20433 Implicit String to Timestamp conversion is slow
  • HIVE-20439 addendum
  • HIVE-20439 Use the inflated memory limit during join selection for llap
  • HIVE-20187 Incorrect query results in hive when hive.convert.join.bucket.mapjoin.tez is set to true
  • HIVE-20455 Log spew from security.authorization.PrivilegeSynchonizer.run
  • HIVE-20315 Vectorization: Fix more NULL / Wrong Results issues and avoid unnecessary casts/conversions
  • HIVE-20339 Vectorization: Lift unneeded restriction causing some PTF with RANK not to be vectorized
  • HIVE-20352 Vectorization: Support grouping function
  • HIVE-20367 Vectorization: Support streaming for PTF AVG, MAX, MIN, SUM
  • HIVE-20399 CTAS w/a custom table location that is not fully qualified fails for MM tables
  • HIVE-20418 LLAP IO may not handle ORC files that have row index disabled correctly for queries with no columns selected
  • HIVE-20409 Hive ACID: Update/delete/merge does not clean hdfs staging directory
  • HIVE-20406 Addendum patch
  • HIVE-20406 Nested Coalesce giving incorrect results
  • HIVE-20410 aborted Insert Overwrite on transactional table causes “Not enough history available for…” error
  • HIVE-20321 Vectorization: Cut down memory size of 1 col VectorHashKeyWrapper to <1 CacheLine
  • HIVE-20391 HiveAggregateReduceFunctionsRule may infer wrong return type when decomposing aggregate function
  • HIVE-14898 HS2 shouldn’t log callstack for an empty auth header error
  • HIVE-20389 NPE in SessionStateUserAuthenticator when authenticator=SessionStateUserAuthenticator
  • HIVE-18620 Improve error message while dropping a table that is part of a materialized view
  • HIVE-20345 Drop database may hang if the tables get deleted from a different call
  • HIVE-20379 Rewriting with partitioned materialized views may reference wrong column
  • HIVE-19316 StatsTask fails due to ClassCastException
  • HIVE-20329 Long running repl load (incr/bootstrap) causing OOM error
  • HIVE-19924 Tag distcp jobs run by Repl Load
  • HIVE-20354 Semijoin hints dont work with merge statements
  • HIVE-20340 Druid Needs Explicit CASTs from Timestamp to STRING when the output of timestamp function is used as String
  • HIVE-20344 PrivilegeSynchronizer for SBA might hit AccessControlException
  • HIVE-20316 Skip external table file listing for create table event
  • HIVE-20336 Masking and filtering policies for materialized views
  • HIVE-20337 CachedStore: getPartitionsByExpr is not populating the partition list correctly
  • HIVE-20335 Add tests for materialized view rewriting with composite aggregation functions
  • HIVE-20326 Create constraints with RELY as default instead of NO RELY
  • HIVE-19408 Improve show materialized views statement to show more information about invalidation
  • HIVE-20118 SessionStateUserAuthenticator.getGroupNames() is always empty
  • HIVE-20290 Lazy initialize ArrowColumnarBatchSerDe so it doesn’t allocate buffers during GetSplits
  • HIVE-20277 Vectorization: Case expressions that return BOOLEAN are not supported for FILTER
  • HIVE-20314 Include partition pruning in materialized view rewriting
  • HIVE-20301 Enable vectorization for materialized view rewriting tests
  • HIVE-20302 LLAP: non-vectorized execution in IO ignores virtual columns, including ROW__ID
  • HIVE-20294 Vectorization: Fix NULL / Wrong Results issues in COALESCE / ELT
  • HIVE-20281 SharedWorkOptimizer fails with ‘operator cache contents and actual plan differ’
  • HIVE-14493 Partitioning support for materialized views
  • HIVE-18201 Disable XPROD_EDGE for sq_count_check() created for scalar subqueries
  • HIVE-20130 Better logging for information schema synchronizer
  • HIVE-20244 forward port HIVE-19704 to master
  • HIVE-20101 BloomKFilter: Avoid using the local byte[] arrays entirely
  • HIVE-20177 Vectorization: Reduce KeyWrapper allocation in GroupBy Streaming mode
  • HIVE-20245 Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
  • HIVE-20247 cleanup issues in LLAP IO after cache OOM ADDENDUM
  • HIVE-20247 cleanup issues in LLAP IO after cache OOM
  • HIVE-20249 LLAP IO: NPE during refCount decrement
  • HIVE-20263 Typo in HiveReduceExpressionsWithStatsRule variable
  • HIVE-20035 write booleans as long when serializing to druid
  • HIVE-20105 Druid-Hive: tpcds query on timestamp throws java.lang.IllegalArgumentException: Cannot create timestamp, parsing error
  • HIVE-18729 Druid Time column type
  • HIVE-20213 Upgrade Calcite to 1.17.0
  • HIVE-20212 Hiveserver2 in http mode emitting metric default.General.open_connections incorrectly
  • HIVE-20240 Semijoin Reduction : Use local variable to check for external table condition
  • HIVE-20228 configure repl configuration directories based on user running hiveserver2
  • HIVE-20203 Arrow SerDe leaks a DirectByteBuffer
  • HIVE-20207 Vectorization: Fix NULL / Wrong Results issues in Filter / Compare
  • HIVE-19935 Hive WM session killed: Failed to update LLAP tasks count
  • HIVE-20082 HiveDecimal to string conversion doesn’t format the decimal correctly
  • HIVE-20164 Murmur Hash : Make sure CTAS and IAS use correct bucketing version
  • HIVE-19891 inserting into external tables with custom partition directories may cause data loss
  • HIVE-17683 Add explain locks command
  • HIVE-20192 HS2 with embedded metastore is leaking JDOPersistenceManager objects
  • HIVE-20204 Type conversion during IN () comparisons is using different rules from other comparison operations
  • HIVE-20127 fix some issues with LLAP Parquet cache
  • HIVE-4367 enhance TRUNCATE syntax to drop data of external table
  • HIVE-20149 TestHiveCli failing/timing out
  • HIVE-19360 CBO: Add an “optimizedSQL” to QueryPlan object
  • HIVE-20120 Incremental repl load DAG generation is causing OOM error
  • HIVE-20183 Inserting from bucketed table can cause data loss, if the source table contains empty bucket
  • HIVE-20197 Vectorization: Add DECIMAL_64 testing, add Date/Interval/Timestamp arithmetic, and add more GROUP BY Aggregation tests
  • HIVE-20172 StatsUpdater failed with GSS Exception while trying to connect to remote metastore
  • HIVE-20165 Enable ZLIB for streaming ingest
  • HIVE-20116 TezTask is using parent logger
  • HIVE-20152 reset db state, when repl dump fails, so rename table can be done
  • HIVE-20174 Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation Functions
  • HIVE-19992 Vectorization: Follow-on to HIVE-19951 --> add call to SchemaEvolution.isOnlyImplicitConversion to disable encoded LLAP I/O for ORC only when data type conversion is not implicit
  • HIVE-19951 Vectorization: Need to disable encoded LLAP I/O for ORC when there is data type conversion (Schema Evolution)
  • HIVE-20185 Backport HIVE-20111 to branch-3
  • HIVE-20147 Hive streaming ingest is contented on synchronized logging
  • HIVE-20019 Ban commons-logging and log4j
  • HIVE-20088 Beeline config location path is assembled incorrectly
  • HIVE-19387 Truncate table for Acid tables conflicts with ResultSet cache
  • HIVE-20093 LlapOutputFomatService: Use ArrowBuf with Netty for Accounting
  • HIVE-20129 Revert to position based schema evolution for orc tables
  • HIVE-20100 OpTraits : Select Optraits should stop when a mismatch is detected
  • HIVE-20099 Fix logger for LlapServlet
  • HIVE-20184 Backport HIVE-20085 to branch-3
  • HIVE-20098 Statistics: NPE when getting Date column partition statistics
  • HIVE-20182 Backport HIVE-20067 to branch-3
  • HIVE-19850 Dynamic partition pruning in Tez is leading to ‘No work found for tablescan’ error
  • HIVE-20039 Bucket pruning: Left Outer Join on bucketed table gives wrong result
  • HIVE-19860 HiveServer2 ObjectInspectorFactory memory leak with cachedUnionStructObjectInspector
  • HIVE-19326 stats auto gather: incorrect aggregation during UNION queries (may lead to incorrect results)
  • HIVE-20051 Skip authorization for temp tables
  • HIVE-17840 HiveMetaStore eats exception if transactionalListeners.notifyEvent fail
  • HIVE-20059 Hive streaming should try shade prefix unconditionally on exception
  • HIVE-19812 Disable external table replication by default via a configuration property
  • HIVE-20008 Fix second compilation errors in ql
  • HIVE-20038 Update queries on non-bucketed + partitioned tables throws NPE
  • HIVE-19995 Aggregate row traffic for acid tables
  • HIVE-19970 Replication dump has a NPE when table is empty
  • HIVE-20028 Metastore client cache config is used incorrectly
  • HIVE-20004 Wrong scale used by ConvertDecimal64ToDecimal results in incorrect results
  • HIVE-20009 Fix runtime stats for merge statement
  • HIVE-19989 Metastore uses wrong application name for HADOOP2 metrics
  • HIVE-19967 SMB Join : Need Optraits for PTFOperator ala GBY Op
  • HIVE-20011 Move away from append mode in proto logging hook
  • HIVE-18786 NPE in Hive windowing functions
  • HIVE-19829 Incremental replication load should create tasks in execution phase rather than semantic phase
  • HIVE-18140 Partitioned tables statistics can go wrong in basic stats mixed case
  • HIVE-20231 Backport HIVE-19981 to branch-3
  • HIVE-19564 Vectorization: Fix NULL / Wrong Results issues in Arithmetic
  • HIVE-19663 refactor LLAP IO report generation
  • HIVE-19759 Flaky test: TestRpc#testServerPort
  • HIVE-19432 GetTablesOperation is too slow if the hive has too many databases and tables
  • HIVE-6980 Drop table by using direct sql
  • HIVE-18906 Lower Logging for “Using direct SQL”
  • HIVE-19285 Add logs to the subclasses of MetaDataOperation
  • HIVE-18986 Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns
  • HIVE-19104 When test MetaStore is started with retry the instances should be independent
  • HIVE-18827 useless dynamic value exceptions strike back
  • HIVE-24331 Add Jenkinsfile for branch-3.1 (#1626)
  • HIVE-19170 Fix TestMiniDruidKafkaCliDriver – addendum patch
  • HIVE-19170 Fix TestMiniDruidKafkaCliDriver
  • HIVE-23323 Add qsplits profile
  • HIVE-23044 Make sure Cleaner doesn’t delete delta directories for running queries
  • HIVE-23088 Using Strings from log4j breaks non-log4j users
  • HIVE-22704 Distribution package incorrectly ships the upgrade.order files from the metastore module
  • HIVE-22708 Fix for HttpTransport to replace String.equals
  • HIVE-22407 Hive metastore upgrade scripts have incorrect (or outdated) comment syntax
  • HIVE-22241 Implement UDF to interpret date/timestamp using its internal representation and Gregorian-Julian hybrid calendar
  • HIVE-21508 ClassCastException when initializing HiveMetaStoreClient on JDK10 or newer
  • HIVE-19667 Remove distribution management tag from pom.xml
  • HIVE-21980 Parsing time can be high in case of deeply nested subqueries
  • HIVE-22105 Update ORC to 1.5.6 in branch-3
  • HIVE-20057 For ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'); TBL_TYPE attribute change not reflecting for non-CAPS
  • HIVE-18874 JDBC: HiveConnection shades log4j interfaces
  • HIVE-21821 Backport HIVE-21739 to branch-3.1
  • HIVE-21786 Update repo URLs in poms - branh 3.1 version
  • HIVE-21755 Backport HIVE-21462 to branch-3 Upgrading SQL server backed metastore when changing data type of a column with constraints
  • HIVE-21758 DBInstall tests broken on master and branch-3.1
  • HIVE-21291 Restore historical way of handling timestamps in Avro while keeping the new semantics at the same time
  • HIVE-21564 Load data into a bucketed table is ignoring partitions specs and loads data into default partition
  • HIVE-20593 Load Data for partitioned ACID tables fails with bucketId out of range: -1
  • HIVE-21600 GenTezUtils.removeSemiJoinOperator may throw out of bounds exception for TS with multiple children
  • HIVE-21613 Queries with join condition having timestamp or timestamp with local time zone literal throw SemanticException
  • HIVE-18624 Parsing time is extremely high (~10 min) for queries with complex select expressions
  • HIVE-21540 Query with join condition having date literal throws SemanticException
  • HIVE-21342 Analyze compute stats for column leave behind staging dir on hdfs
  • HIVE-21290 Restore historical way of handling timestamps in Parquet while keeping the new semantics at the same time
  • HIVE-20126 OrcInputFormat does not pass conf to orc reader options
  • HIVE-21376 Incompatible change in Hive bucket computation
  • HIVE-21236 SharedWorkOptimizer should check table properties
  • HIVE-21156 SharedWorkOptimizer may preserve filter in TS incorrectly
  • HIVE-21039 CURRENT_TIMESTAMP returns value in UTC time zone
  • HIVE-20010 Fix create view over literals
  • HIVE-20420 Provide a fallback authorizer when no other authorizer is in use
  • HIVE-18767 Some alterPartitions invocations throw ‘NumberFormatException: null’
  • HIVE-18778 Needs to capture input/output entities in explain
  • HIVE-20555 HiveServer2: Preauthenticated subject for http transport is not retained for entire duration of http communication in some cases
  • HIVE-20227 Exclude glassfish javax.el dependency
  • HIVE-19027 Make materializations invalidation cache work with multiple active remote metastores (addendum)
  • HIVE-20102 Add a couple of additional tests for query parsing
  • HIVE-20123 Fix masking tests after HIVE-19617
  • HIVE-20076 ACID: Fix Synthetic ROW__ID generation for vectorized orc readers (adduendum)
  • HIVE-20076 ACID: Fix Synthetic ROW__ID generation for vectorized orc readers
  • HIVE-20135 Fix incompatible change in TimestampColumnVector to default to UTC