Configuring Hive on MR3

The behavior of Hive-MR3 is specified by the configuration file hive-site.xml on the classpath. The table below describes the configuration keys relevant to Hive-MR3.

| Name | Default value | Description |
|---|---|---|
| hive.execution.engine | mr | Should be set to mr3 to use MR3 as the execution engine. |
| hive.execution.mode | container | Hive-MR3 based on Hive 2 and 3 supports both container and llap. Use container for stable execution and llap for fast execution. |
| hive.mr3.client.connect.timeout | 60000ms | Timeout for Hive-MR3 to establish a connection to the MR3 DAGAppMaster. |
| hive.mr3.map.task.memory.mb | -1 | Memory in MB allocated to each mapper. If set to -1, Hive-MR3 reads MRJobConfig.MAP_MEMORY_MB. |
| hive.mr3.reduce.task.memory.mb | -1 | Memory in MB allocated to each reducer. If set to -1, Hive-MR3 reads MRJobConfig.REDUCE_MEMORY_MB. |
| hive.mr3.map.task.vcores | -1 | Number of cores allocated to each mapper. If set to -1, Hive-MR3 reads MRJobConfig.MAP_CPU_VCORES. |
| hive.mr3.reduce.task.vcores | -1 | Number of cores allocated to each reducer. If set to -1, Hive-MR3 reads MRJobConfig.REDUCE_CPU_VCORES. |
| hive.mr3.container.max.java.heap.fraction | 0.8f | Fraction of task memory to be used as Java heap. Fixed at the time of creating each MR3Session. Used to set the configuration key mr3.container.max.java.heap.fraction of MR3. |
| hive.mr3.containergroup.scheme | all-in-one | ContainerGroup scheme: all-in-one, per-map-reduce, or per-vertex. |
| hive.mr3.container.env | | Environment string for ContainerGroups. |
| hive.mr3.container.java.opts | | Java options for ContainerGroups. This key takes precedence over MR3Conf.MR3_CONTAINER_LAUNCH_CMD_OPTS (mr3.container.launch.cmd-opts) in mr3-site.xml. |
| hive.mr3.container.combine.taskattempts | true | true: allow multiple concurrent tasks in the same container. false: do not allow multiple concurrent tasks in the same container. |
| hive.mr3.container.reuse | true | true: allow container reuse for running different tasks. false: do not allow container reuse. |
| hive.mr3.container.mix.taskattempts | true | true: allow concurrent tasks from different DAGs in the same container. false: do not allow concurrent tasks from different DAGs in the same container. |
| hive.mr3.container.stop.cross.dag.reuse | false | true: stop cross-DAG container reuse for ContainerGroups. false: continue cross-DAG container reuse for ContainerGroups. |
| hive.mr3.container.use.per.query.cache | true | Use a per-query cache shared by all tasks in the same container (only for Hive 2 and 3). |
| hive.mr3.all-in-one.containergroup.memory.mb | -1 | Memory in MB allocated to each ContainerGroup under the all-in-one scheme. |
| hive.mr3.all-in-one.containergroup.vcores | -1 | Number of cores allocated to each ContainerGroup under the all-in-one scheme. |
| hive.mr3.map.containergroup.memory.mb | -1 | Memory in MB allocated to each mapper ContainerGroup under the per-map-reduce or per-vertex scheme. If set to -1, Hive-MR3 reads MRJobConfig.MAP_MEMORY_MB. |
| hive.mr3.reduce.containergroup.memory.mb | -1 | Memory in MB allocated to each reducer ContainerGroup under the per-map-reduce or per-vertex scheme. If set to -1, Hive-MR3 reads MRJobConfig.REDUCE_MEMORY_MB. |
| hive.mr3.map.containergroup.vcores | -1 | Number of cores allocated to each mapper ContainerGroup under the per-map-reduce or per-vertex scheme. If set to -1, Hive-MR3 reads MRJobConfig.MAP_CPU_VCORES. |
| hive.mr3.reduce.containergroup.vcores | -1 | Number of cores allocated to each reducer ContainerGroup under the per-map-reduce or per-vertex scheme. If set to -1, Hive-MR3 reads MRJobConfig.REDUCE_CPU_VCORES. |
| hive.mr3.exec.print.summary | false | true: display a breakdown of execution steps for every query. false: do not display. |
| hive.llap.io.enabled | false | true: use LLAP I/O. false: do not use LLAP I/O. |
| hive.mr3.llap.headroom.mb | 1024 | Memory in MB reserved as headroom for Java VM overhead when LLAP I/O is enabled. |
| hive.mr3.llap.daemon.task.memory.mb | 0 | Memory in MB allocated to a DaemonTaskAttempt for LLAP I/O. |
| hive.mr3.llap.daemon.task.vcores | 0 | Number of cores allocated to a DaemonTaskAttempt for LLAP I/O. |
| hive.mr3.exec.inplace.progress | true | true: update execution progress in-place in the terminal. false: do not update. |
| hive.mr3.use.daemon.shufflehandler | false | true: use the MR3 shuffle handler for non-local ContainerWorkers. false: do not use the MR3 shuffle handler. |
| hive.server2.mr3.share.session | false | true: run HiveServer2 in shared session mode. false: run HiveServer2 in individual session mode. |
| hive.mr3.mapjoin.interrupt.check.interval | 100000L | Interval (in terms of the number of entries) at which HashTableLoader checks the interrupt state. |
| hive.mr3.bucket.mapjoin.estimate.num.containers | 10 | Estimated number of nodes for converting to bucket mapjoin. Should be set to the number of nodes in the cluster. |
| hive.mr3.dag.additional.credentials.source | | Comma-separated list of additional paths for obtaining credentials. If a query has no input (e.g., when creating a fresh table), HDFS tokens may be empty. In such a case, the user can provide additional paths for obtaining credentials so that the query can be executed with proper HDFS tokens. This configuration key is especially useful when running Hive on Kubernetes. |
| hive.mr3.localize.session.jars | true | true: localize hive-exec.jar as a local resource. false: do not localize hive-exec.jar (for Hive on Kubernetes). |
| hive.mr3.am.task.max.failed.attempts | 3 | Maximum number of attempts for each Task. |
| hive.mr3.zookeeper.appid.namespace | mr3AppId | ZooKeeper namespace for sharing the Application ID. |
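
As an illustration of how several of the keys above might be combined, the following is a minimal sketch of a hive-site.xml fragment. It assumes the standard Hadoop configuration file layout (property elements with name and value children). The memory and core values are hypothetical and chosen only for the example; they are not recommendations, and the heap fraction is written as a plain decimal on the assumption that the f suffix in the table is only Java literal notation.

```xml
<configuration>
  <!-- Select MR3 as the execution engine (the default is mr). -->
  <property>
    <name>hive.execution.engine</name>
    <value>mr3</value>
  </property>

  <!-- container mode for stable execution; llap is the alternative. -->
  <property>
    <name>hive.execution.mode</name>
    <value>container</value>
  </property>

  <!-- Use a single ContainerGroup for all tasks. -->
  <property>
    <name>hive.mr3.containergroup.scheme</name>
    <value>all-in-one</value>
  </property>

  <!-- Illustrative sizes only: 16 GB and 4 cores per ContainerGroup. -->
  <property>
    <name>hive.mr3.all-in-one.containergroup.memory.mb</name>
    <value>16384</value>
  </property>
  <property>
    <name>hive.mr3.all-in-one.containergroup.vcores</name>
    <value>4</value>
  </property>

  <!-- Fraction of task memory used as Java heap (same as the default). -->
  <property>
    <name>hive.mr3.container.max.java.heap.fraction</name>
    <value>0.8</value>
  </property>

  <!-- Run HiveServer2 in shared session mode (the default is false). -->
  <property>
    <name>hive.server2.mr3.share.session</name>
    <value>true</value>
  </property>
</configuration>
```

Keys that are left out of the file keep the defaults listed in the table, so a fragment like this only needs to name the settings that differ from them.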