# Configuring MR3

The behavior of MR3 is specified by the configuration file `mr3-site.xml` in the classpath. Below we describe all configuration keys for MR3, which are divided into eight sections:

* MR3Runtime: configuration keys relevant to all components (MR3Client, DAGAppMaster, ContainerWorker)
* MR3Client: configuration keys that are consumed or set by MR3Client
* DAGAppMaster: configuration keys that are consumed or set by DAGAppMaster
* ContainerGroup: configuration keys that specify properties of ContainerGroups
* ContainerWorker: configuration keys for ContainerWorkers
* TokenRenewer: configuration keys related to Kerberos and token renewal
* HistoryLogger: configuration keys for history logging
* tez.common.counters.Limits: configuration keys for Tez counters (which MR3 borrows from Tez)
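As a starting point, here is a minimal sketch of `mr3-site.xml`, assuming the standard Hadoop `*-site.xml` XML format; the keys are from the tables below and the values are illustrative, not recommendations:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- each MR3 configuration key is set as one property -->
  <property>
    <name>mr3.master.mode</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mr3.am.resource.memory.mb</name>
    <value>5120</value>
  </property>
</configuration>
```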
#### MR3Runtime

|**Name**|**Default value**|Description|
|--------|:----------------|:----------|
|mr3.runtime|tez|**tez**: use Tez 0.7.0 runtime.<br/>**tez2**: use Tez 0.8.4 runtime.<br/>**tez3**: use Tez 0.9.1 runtime.<br/>**asm**: use ASM runtime.|
|mr3.master.mode|yarn|**local-thread**: DAGAppMaster starts as a new thread inside MR3Client.<br/>**local-process**: DAGAppMaster starts as a new process on the same machine where MR3Client is running.<br/>**yarn**: DAGAppMaster starts as a new container in the Hadoop cluster.|
|mr3.am.acls.enabled|true|**true**: enable ACLs for DAGAppMaster and DAGs.<br/>**false**: disable ACLs for DAGAppMaster and DAGs.|
|mr3.cluster.additional.classpath||Additional classpath for DAGAppMaster and ContainerWorkers|
|mr3.cluster.use.hadoop-libs|false|**true**: include the classpath defined in YarnConfiguration.YARN\_APPLICATION\_CLASSPATH.<br/>**false**: do not include the classpath defined in YarnConfiguration.YARN\_APPLICATION\_CLASSPATH.|
|mr3.container.max.java.heap.fraction|0.8|Fraction of memory to be allocated for the Java heap in DAGAppMaster and ContainerWorkers|
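For example, to run DAGAppMaster as a separate process on the client machine with the Tez 0.9.1 runtime, one might set (a sketch using only keys from the table above):

```xml
<property>
  <name>mr3.runtime</name>
  <value>tez3</value>            <!-- Tez 0.9.1 runtime -->
</property>
<property>
  <name>mr3.master.mode</name>
  <value>local-process</value>   <!-- DAGAppMaster runs on the client machine -->
</property>
```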
#### MR3Client

|**Name**|**Default value**|Description|
|--------|:----------------|:----------|
|mr3.lib.uris|\${liburis}|URIs for the MR3 library jar files|
|mr3.aux.uris|\${auxuris}|URIs for the MR3 auxiliary jar files|
|mr3.queue.name||Name of the Yarn queue to which the MR3 job is submitted|
|mr3.credentials.path||Path to the credentials for MR3|
|mr3.am.staging-dir|/tmp/\${user.name}/staging|Staging directory for DAGAppMaster|
|mr3.am.resource.memory.mb|5120|Size of memory in MB for DAGAppMaster|
|mr3.am.resource.cpu.cores|1|Number of cores for DAGAppMaster|
|mr3.am.launch.cmd-opts|-server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC|Command-line options for launching DAGAppMaster|
|mr3.am.launch.env|LD\_LIBRARY\_PATH=\$LD\_LIBRARY\_PATH:\$HADOOP\_COMMON\_HOME/lib/native/|Environment variables for launching DAGAppMaster|
|mr3.am.max.app.attempts|2|Max number of Yarn ApplicationAttempts for the MR3 job|
|mr3.am.log.level|INFO|Logging level for DAGAppMaster|
|mr3.am.local.working-dir|/tmp/\${user.name}/mr3/working-dir|Local working directory for DAGAppMaster|
|mr3.am.local.log-dir|/tmp/\${user.name}/mr3/log-dir|Logging directory for DAGAppMaster|
|mr3.cancel.delegation.tokens.on.completion|true|**true**: cancel delegation tokens when the MR3 job completes.<br/>**false**: do not cancel delegation tokens.|
|mr3.dag.status.pollinterval.ms|1000|Time interval in milliseconds for retrieving the status of running DAGs|
|mr3.am.session.mode|false|**true**: create MR3 SessionClient.<br/>**false**: create MR3 JobClient.|
|mr3.session.client.timeout.secs|120|Timeout in seconds for terminating MR3 SessionClient|
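As an illustration, a client that creates an MR3 SessionClient (rather than a JobClient) and gives DAGAppMaster more resources might set the following; the values are illustrative only:

```xml
<property>
  <name>mr3.am.session.mode</name>
  <value>true</value>    <!-- create MR3 SessionClient instead of JobClient -->
</property>
<property>
  <name>mr3.am.resource.memory.mb</name>
  <value>8192</value>    <!-- 8 GB of memory for DAGAppMaster -->
</property>
<property>
  <name>mr3.am.resource.cpu.cores</name>
  <value>2</value>
</property>
```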
#### DAGAppMaster

|**Name**|**Default value**|Description|
|--------|:----------------|:----------|
|mr3.am.local.worker.mode|false|**true**: ContainerWorkers start as threads inside DAGAppMaster.<br/>**false**: ContainerWorkers start as containers in the Hadoop cluster.|
|mr3.am.max.num.concurrent.dags|128|Max number of DAGs that can run concurrently in DAGAppMaster|
|mr3.am.shutdown.rightaway|true|**true**: DAGAppMaster does not wait until MR3Client retrieves the final states of all DAGs.<br/>**false**: DAGAppMaster waits until MR3Client retrieves the final states of all DAGs.|
|mr3.am.shutdown.sleep.max.ms|5000|Time in milliseconds to wait until MR3Client retrieves the final states of all DAGs|
|mr3.am.local.resourcescheduler.min.memory.mb|256|Min size of memory in MB to reserve for all local ContainerWorkers running in DAGAppMaster|
|mr3.am.local.resourcescheduler.max.memory.mb|4096|Max size of memory in MB to reserve for all local ContainerWorkers running in DAGAppMaster|
|mr3.am.local.resourcescheduler.max.cpu.cores|1|Max number of cores for all local ContainerWorkers running in DAGAppMaster|
|mr3.am.local.resourcescheduler.native.fraction|0.0d|Fraction of memory to be allocated as native memory for all local ContainerWorkers running in DAGAppMaster|
|mr3.am.delete.local.working-dir|true|**true**: DAGAppMaster running in LocalThread or LocalProcess mode deletes its local working directory upon termination.<br/>**false**: DAGAppMaster running in LocalThread or LocalProcess mode does not delete its local working directory upon termination.|
|mr3.am.taskcommunicator.type|protobuf|**protobuf**: use Protobuf for communication between TaskCommunicator and ContainerWorkers.<br/>**protowritable**: use Protobuf + Writable for communication between TaskCommunicator and ContainerWorkers.<br/>**writable**: use Writable for communication between TaskCommunicator and ContainerWorkers.<br/>**direct**: use direct communication between TaskCommunicator and local ContainerWorkers.|
|mr3.am.taskcommunicator.thread.count|30|Number of threads in TaskCommunicator for serving requests from ContainerWorkers|
|mr3.am.resourcescheduler.max.requests.per.taskscheduler|10|Max number of containers that TaskScheduler can request from the Yarn ResourceScheduler at once|
|mr3.am.rm.heartbeat.interval.ms|1000|Time interval in milliseconds of heartbeats in AMRMClientAsync|
|mr3.dag.priority.scheme|fifo|**fifo**: assign DAG priorities on a FIFO basis.<br/>**concurrent**: assign the same priority to all DAGs.|
|mr3.am.task.max.failed.attempts|3|Max number of TaskAttempts to create for a Task|
|mr3.am.commit-all-outputs-on-dag-success|true|**true**: commit the output of all Vertexes when the DAG completes successfully.<br/>**false**: commit the output of each Vertex when it completes successfully.|
|mr3.am.client.thread-count|1|Number of threads in DAGClientServer for serving requests from MR3Clients|
|mr3.heartbeat.timeout.ms|300000|Time in milliseconds for triggering heartbeat timeout for TaskAttempts and ContainerWorkers|
|mr3.task.heartbeat.timeout.check.ms|30000|Time interval in milliseconds for checking heartbeat timeout for TaskAttempts|
|mr3.container.heartbeat.timeout.check.ms|15000|Time interval in milliseconds for checking heartbeat timeout for ContainerWorkers|
|mr3.container.idle.timeout.ms|300000|Time in milliseconds for triggering timeout for idle ContainerWorkers|
|mr3.am.node-blacklisting.enabled|true|**true**: enable node blacklisting.<br/>**false**: disable node blacklisting.|
|mr3.am.maxtaskfailure.percent|5|Percentage of TaskAttempt failures that puts a node in an unusable state|
|mr3.dag.recovery.enabled|true|**true**: enable DAG recovery when DAGAppMaster restarts.<br/>**false**: disable DAG recovery when DAGAppMaster restarts.|
|mr3.am.max.finished.reported.dags|100|Max number of DAGs whose final states are kept in DAGAppMaster after being reported to MR3Client|
|mr3.am.generate.dag.graph.viz|false|**true**: create DOT graph files showing the structure of DAGs in the working directory of DAGAppMaster.<br/>**false**: do not create DOT graph files.|
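For instance, to let many DAGs run concurrently with equal priority instead of the default FIFO ordering, a sketch with illustrative values:

```xml
<property>
  <name>mr3.am.max.num.concurrent.dags</name>
  <value>32</value>
</property>
<property>
  <name>mr3.dag.priority.scheme</name>
  <value>concurrent</value>  <!-- all DAGs are assigned the same priority -->
</property>
```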
#### ContainerGroup

|**Name**|**Default value**|Description|
|--------|:----------------|:----------|
|mr3.container.stop.cross.dag.reuse|true|**true**: stop cross-DAG container reuse for the current ContainerGroup.<br/>**false**: do not update the current ContainerGroup with regard to cross-DAG container reuse.|
|mr3.container.reuse|false|**true**: reuse ContainerWorkers in the current ContainerGroup.<br/>**false**: use each ContainerWorker only for a single TaskAttempt.|
|mr3.container.resourcescheduler.type|local|**local**: create local ContainerWorkers in DAGAppMaster for the current ContainerGroup.<br/>**yarn**: create Yarn ContainerWorkers for the current ContainerGroup.|
|mr3.container.combine.taskattempts|false|**true**: allow multiple TaskAttempts to run concurrently in a ContainerWorker.<br/>**false**: allow only one TaskAttempt to run at a time in a ContainerWorker.|
|mr3.container.mix.taskattempts|true|**true**: allow TaskAttempts from different DAGs to run concurrently in a ContainerWorker.<br/>**false**: use each ContainerWorker for a single DAG exclusively.|
|mr3.container.launch.cmd-opts|-server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC|Command-line options for launching ContainerWorkers|
|mr3.container.launch.env|LD\_LIBRARY\_PATH=\$LD\_LIBRARY\_PATH:\$HADOOP\_COMMON\_HOME/lib/native/|Environment variables for launching ContainerWorkers|
|mr3.container.log.level|INFO|Logging level for ContainerWorkers|
|mr3.container.kill.policy|container.kill.wait.workervertex|**container.kill.wait.workervertex**: stop a ContainerWorker only if no more TaskAttempts are to arrive.<br/>**container.kill.nowait**: stop a ContainerWorker right away if it is not serving any TaskAttempt.|
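A typical combination for sharing ContainerWorkers aggressively, e.g., enabling reuse and letting concurrent TaskAttempts from different DAGs share a ContainerWorker, might look like this (illustrative values only):

```xml
<property>
  <name>mr3.container.reuse</name>
  <value>true</value>   <!-- reuse ContainerWorkers across TaskAttempts -->
</property>
<property>
  <name>mr3.container.combine.taskattempts</name>
  <value>true</value>   <!-- multiple TaskAttempts may run concurrently -->
</property>
<property>
  <name>mr3.container.mix.taskattempts</name>
  <value>true</value>   <!-- TaskAttempts from different DAGs may share a worker -->
</property>
```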
#### ContainerWorker

|**Name**|**Default value**|Description|
|--------|:----------------|:----------|
|mr3.container.get.command.interval.ms|2000|Time interval in milliseconds for retrieving commands in ContainerWorkers that are currently serving TaskAttempts|
|mr3.container.busy.wait.interval.ms|100|Time interval in milliseconds for retrieving commands in idle ContainerWorkers|
|mr3.task.am.heartbeat.interval.ms|250|Time interval in milliseconds for sending heartbeats from TaskAttempts|
|mr3.task.am.heartbeat.counter.interval.ms|10000|Time interval in milliseconds for sending counters in heartbeats from TaskAttempts|
|mr3.task.max.events.per.heartbeat|500|Max number of Task events to include in a heartbeat reply|
|mr3.container.thread.keep.alive.time.ms|4000|Time in milliseconds for keeping alive threads serving TaskAttempts in ContainerWorkers|
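For example, to reduce heartbeat traffic between TaskAttempts and DAGAppMaster, one could raise the heartbeat interval and batch more events per reply (illustrative values, not recommendations):

```xml
<property>
  <name>mr3.task.am.heartbeat.interval.ms</name>
  <value>500</value>    <!-- heartbeat every 500 ms instead of 250 ms -->
</property>
<property>
  <name>mr3.task.max.events.per.heartbeat</name>
  <value>1000</value>   <!-- batch more Task events per heartbeat reply -->
</property>
```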
#### TokenRenewer

|**Name**|**Default value**|Description|
|--------|:----------------|:----------|
|mr3.principal||Kerberos principal|
|mr3.keytab||Location of the Kerberos keytab file|
|mr3.token.renewal.fraction|0.75|Fraction of the token renewal interval for renewing tokens conservatively|
|mr3.token.renewal.retry.interval.ms|10000|Time interval in milliseconds for retrying token renewal|
|mr3.token.renewal.num.credentials.files|5|Max number of credential files to keep for token renewal|
|mr3.token.renewal.hdfs.enabled|false|**true**: automatically renew HDFS tokens.<br/>**false**: do not renew HDFS tokens.|
|mr3.token.renewal.hive.enabled|false|**true**: automatically renew Hive tokens.<br/>**false**: do not renew Hive tokens.|
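On a Kerberos-secured cluster, a sketch might look like the following; the principal and keytab path are hypothetical placeholders:

```xml
<property>
  <name>mr3.principal</name>
  <value>mr3@EXAMPLE.COM</value>   <!-- hypothetical Kerberos principal -->
</property>
<property>
  <name>mr3.keytab</name>
  <value>/etc/security/keytabs/mr3.keytab</value>   <!-- hypothetical path -->
</property>
<property>
  <name>mr3.token.renewal.hdfs.enabled</name>
  <value>true</value>   <!-- automatically renew HDFS tokens -->
</property>
```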
#### HistoryLogger

|**Name**|**Default value**|Description|
|--------|:----------------|:----------|
|mr3.app.history.logging.enabled|false|**true**: enable history logging for Yarn applications and ContainerWorkers.<br/>**false**: disable history logging for Yarn applications and ContainerWorkers.|
|mr3.dag.history.logging.enabled|false|**true**: enable history logging for DAGs.<br/>**false**: disable history logging for DAGs.|
|mr3.task.history.logging.enabled|false|**true**: enable history logging for Tasks.<br/>**false**: disable history logging for Tasks.|
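To turn on history logging at the application and DAG level while leaving Task-level logging at its default (off), one might set:

```xml
<property>
  <name>mr3.app.history.logging.enabled</name>
  <value>true</value>
</property>
<property>
  <name>mr3.dag.history.logging.enabled</name>
  <value>true</value>
</property>
```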
#### tez.common.counters.Limits

|**Name**|**Default value**|Description|
|--------|:----------------|:----------|
|tez.counters.max|1200|Max number of Tez counters|
|tez.counters.max.groups|500|Max number of Tez counter groups|
|tez.counters.counter-name.max-length|64|Max length of Tez counter names|
|tez.counters.group-name.max-length|256|Max length of Tez counter group names|
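For example, a workload with many Vertexes may need larger counter limits than the defaults (illustrative values):

```xml
<property>
  <name>tez.counters.max</name>
  <value>2400</value>   <!-- raise the cap on the number of Tez counters -->
</property>
<property>
  <name>tez.counters.max.groups</name>
  <value>1000</value>
</property>
```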