Known Issues
In all versions of Hive on MR3:
- For HiveServer2 running in shared session mode, the user should terminate it manually in any of the following cases:
  - The Yarn Application is killed.
  - The DAGAppMaster is killed.
  - The DAGAppMaster fails in the last attempt (after creating mr3.am.max.app.attempts Yarn ApplicationAttempts).

  Ideally HiveServer2 should terminate itself in such a case, but this is not implemented yet.
With --hivesrc1:
- When multiple TaskAttempts run inside a DAGAppMaster or in a ContainerWorker in Yarn mode, GroupByOperator miscalculates both the size of memory assigned to each TaskAttempt and the size of memory used by a TaskAttempt. As a result, it is hard to predict when GroupByOperator flushes hash tables.
With --hivesrc2 and --hivesrc5:
- When multiple TaskAttempts run inside a DAGAppMaster,
GroupByOperator conservatively estimates the size of memory used by a TaskAttempt.
As a result, GroupByOperator flushes hash tables more often than necessary.
The user can mitigate this issue by increasing the value for the configuration key hive.map.aggr.hash.force.flush.memory.threshold in hive-site.xml.
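
As a minimal sketch of the mitigation above, the key can be set in hive-site.xml as shown here. The value 0.95 is only an illustrative assumption (stock Hive uses a default of 0.9). A suitable value depends on the workload and on how many TaskAttempts share the DAGAppMaster.

```xml
<!-- hive-site.xml -->
<!-- Illustrative value only: 0.95 is an assumption, not a tested recommendation. -->
<!-- The value is a fraction of memory between 0 and 1; a higher value makes -->
<!-- GroupByOperator flush its hash tables less often. -->
<property>
  <name>hive.map.aggr.hash.force.flush.memory.threshold</name>
  <value>0.95</value>
</property>
```

Raising the threshold trades fewer flushes for higher memory pressure, so it is probably safest to increase it in small steps and watch for out-of-memory errors.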