Frequently Asked Questions

Q. What is MR3?
A. It is a new execution engine for Hadoop and Kubernetes, similar in spirit to MapReduce and Tez.

Q. What applications of MR3 are available?
A. Currently Hive on MR3 is its main application.

Q. Who is the intended audience of Hive on MR3?
A. Anyone running Hive, Impala, Presto, or SparkSQL on Hadoop, and anyone planning to run SQL-based analytics on Kubernetes.

Q. I am currently using Hive-LLAP. Why should I use Hive on MR3?
A. Because it performs better in concurrent environments and is easier to operate. For the benefits of switching to Hive on MR3, please see the page on Comparison with Hive-LLAP.

Q. I am currently using Impala on Hadoop. Why should I use Hive on MR3?
A. Because it is faster on average and more mature. Besides, MR3 uses ephemeral Yarn containers and thus allows the user to make better use of cluster resources.

Q. I am currently using Presto/SparkSQL on Hadoop. Why should I use Hive on MR3?
A. Because it is much faster. Since Presto and SparkSQL both run with Hive Metastore, migrating to Hive on MR3 is straightforward.

Q. Do you have experimental results comparing Hive on MR3 against Hive-LLAP, Impala, Presto, and SparkSQL?
A. Yes, we have evaluated these systems using the TPC-DS benchmark on four separate production-grade clusters. For the latest performance evaluation results, please see our Blog.

Q. How can I test Hive on MR3 with minimum effort?
A. The user can test Hive on MR3 in local mode (in which everything runs on a single machine) with a Derby database for Metastore. On a Hadoop cluster where Metastore is already running, the user can quickly test Hive on MR3 by using the preset configuration files included in the MR3 release. On Kubernetes, the user can quickly test Hive on MR3 using Helm and a pre-built Docker image. In any case, no change to the underlying system is necessary. For more details, please see the Quick Start Guide.

Q. I would like to try Hive on MR3. Where should I go for help?
A. You can ask questions in the MR3 Google Group.

Q. What is the benefit of running Hive on Kubernetes with MR3 as the execution engine?
A. The user can separate compute from storage.

Q. Is MR3 fault-tolerant?
A. Yes. In fact, MR3 provides better support for fault tolerance than Tez and Spark. For more details, please see the page on Fault Tolerance.

Q. Does MR3 support Kerberos-based security?
A. Yes.

Q. Does Hive on MR3 support high availability?
A. Yes. Moreover, multiple instances of HiveServer2 can share common Yarn containers or Kubernetes Pods. For more details, please see the page on High Availability.

Q. Does MR3 support auto-scaling on Kubernetes?
A. Not yet.

Q. Does Hive on MR3 integrate with Ranger?
A. Yes. Please see the page on Integrating Apache Ranger.

Q. Is MR3 an open source project?
A. No, it is currently a private project.

Q. Is MR3 free to use?
A. Yes. For more details, please see the page on License.