Known Issues

In all versions of Hive-MR3:

  • For HiveServer2 running in shared session mode, the user should terminate it manually in any of the following cases:
    • The Yarn Application is killed.
    • The DAGAppMaster is killed.
    • The DAGAppMaster fails on its last attempt (after the number of Yarn ApplicationAttempts created reaches mr3.am.max.app.attempts).

    Ideally, HiveServer2 should terminate itself in these cases, but this is not implemented yet.

With --hivesrc1 based on Hive 1.2.2 and --hivesrc3 based on Hive 2.1.1:

  • When multiple TaskAttempts run inside a DAGAppMaster or inside a ContainerWorker in Yarn mode, GroupByOperator miscalculates both the amount of memory assigned to each TaskAttempt and the amount of memory used by a TaskAttempt. As a result, it is hard to predict when GroupByOperator flushes its hash tables.

With --hivesrc2 based on Hive 2.3.2:

  • When multiple TaskAttempts run inside a DAGAppMaster, GroupByOperator estimates the amount of memory used by a TaskAttempt conservatively, that is, it tends to overestimate. As a result, GroupByOperator flushes its hash tables more often than necessary. The user can mitigate this issue by increasing the value of the configuration key hive.map.aggr.hash.force.flush.memory.threshold in hive-site.xml.
  • Due to a bug in Hive 2.3.2 (HIVE-18786), Hive-MR3 occasionally fails with a NullPointerException on queries that use windowing functions, especially when multiple queries run concurrently.
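
For the hash table flushing issue with --hivesrc2, the threshold can be raised with an entry in hive-site.xml along the following lines. This is only a sketch: the value 0.95 is an illustrative assumption, not a recommendation, and a suitable value depends on the workload and on how many TaskAttempts share the DAGAppMaster. The key takes a fraction of memory between 0 and 1, which defaults to 0.9 in stock Hive.

```xml
<!-- hive-site.xml: 0.95 is an illustrative value (assumption); tune it for your workload -->
<property>
  <name>hive.map.aggr.hash.force.flush.memory.threshold</name>
  <value>0.95</value>
</property>
```

A higher value lets the hash table grow larger before a forced flush, at the cost of more memory pressure on the process.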