Running Metastore

Hive-MR3 requires a running Metastore. Any version of Metastore works with Hive-MR3, not only the versions included in a Hive-MR3 release. For example, if Metastore is already running, the user may reuse it instead of starting another instance. In a multi-user environment, the administrator user (e.g., hive) typically starts Metastore.

In order to run Metastore included in the Hive-MR3 release, set the following environment variables in as necessary:








Note that a Metastore address (host and port) is specified separately for each version of Hive because different versions of Metastore are incompatible with each other.

  • HIVE1_METASTORE_LOCAL_PORT specifies the port for Metastore running in local mode (in which everything runs on a single machine) with --hivesrc1. If the user does not need Hive-MR3 in local mode, this environment variable may be ignored.
  • HIVE1_DATABASE_NAME specifies the database name for Metastore running with --hivesrc1.
  • HIVE1_HDFS_WAREHOUSE specifies the directory for the Hive warehouse on HDFS for Metastore running in non-local mode with --hivesrc1. For local mode, Hive-MR3 creates a Hive warehouse under the installation directory. Note that different versions of Metastore can share the same Hive warehouse, while their databases cannot be shared.
  • The corresponding variables for --hivesrc2, --hivesrc3, and --hivesrc4 are used in the same way.

  • HIVE_METASTORE_HEAPSIZE specifies the heap size (in megabytes) for Metastore.
  • HIVE_METASTORE_KERBEROS_PRINCIPAL and HIVE_METASTORE_KERBEROS_KEYTAB specify the principal and keytab file for Metastore, and correspond to configuration keys hive.metastore.kerberos.principal and hive.metastore.kerberos.keytab.file in hive-site.xml.
  • HIVE_MYSQL_DRIVER specifies the path to a MySQL connector jar file, which is necessary when using a MySQL database. The official JDBC driver for MySQL (MySQL Connector/J) can be downloaded from the MySQL website.
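
As a rough illustration of the variables described above, the settings might look like the following. Every value here (port, database name, paths, Kerberos principal) is a placeholder chosen for this sketch and not a default shipped with Hive-MR3.

```shell
# Illustrative values only: all ports, names, and paths below are assumptions.
HIVE1_METASTORE_LOCAL_PORT=9830            # port for local mode with --hivesrc1
HIVE1_DATABASE_NAME=hive1mr3               # database name with --hivesrc1
HIVE1_HDFS_WAREHOUSE=/user/hive/warehouse  # Hive warehouse directory on HDFS

HIVE_METASTORE_HEAPSIZE=4096               # heap size in megabytes
HIVE_METASTORE_KERBEROS_PRINCIPAL=hive/_HOST@EXAMPLE.COM
HIVE_METASTORE_KERBEROS_KEYTAB=/etc/security/keytabs/hive.service.keytab
HIVE_MYSQL_DRIVER=/usr/share/java/mysql-connector-java.jar
```

The Kerberos and MySQL driver variables matter only when Kerberos or a MySQL database is actually in use.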

In order to start Metastore, execute hive/ with the following options:

start                     # Start Metastore on port defined in HIVE?_METASTORE_PORT.
stop                      # Stop Metastore on port defined in HIVE?_METASTORE_PORT.
restart                   # Restart Metastore on port defined in HIVE?_METASTORE_PORT.
--local                   # Run jobs with configurations in conf/local/.
--cluster                 # Run jobs with configurations in conf/cluster/ (default).
--mysql                   # Run jobs with configurations in conf/mysql/.
--tpcds                   # Run jobs with configurations in conf/tpcds/.
--hivesrc1                # Choose hive1-mr3 (based on Hive 1.2.2) (default).
--hivesrc2                # Choose hive2-mr3 (based on Hive 2.3.3).
--hivesrc3                # Choose hive3-mr3 (based on Hive 2.1.1).
--hivesrc4                # Choose hive4-mr3 (based on Hive 2.2.0).
--init-schema             # Initialize the database schema. 
--hiveconf <key>=<value>  # Add a configuration key/value.
<Metastore option>        # Add a Metastore option.
  • The user should use --init-schema to initialize the database schema when running Metastore for the first time. Otherwise the script may fail with the following error message in the log:
    MetaException(message:Version information not found in metastore. )

    Initializing the database schema is also necessary for enabling ACID transactions in Hive-MR3.

  • If the database becomes corrupt, the user should delete it manually before restarting Metastore. For a Derby database, the user can just delete the corresponding database directory as follows:
    rm -rf hive/hive-local-data/metastore/hive2mr3
    rm -rf hive/hive-local-data/metastore-cluster/hive3mr3

    For a MySQL database, the user should connect to the MySQL server and drop the database (e.g., with a DROP DATABASE statement).

  • The user can append as many Metastore options (for the command hive --service metastore from Hive) as necessary to the command.

To see the type of the database used by Metastore, find the configuration key javax.jdo.option.ConnectionDriverName in hive-site.xml. For example, with --tpcds, Metastore uses a MySQL database:


If the configuration key javax.jdo.option.ConnectionDriverName is missing in hive-site.xml, Metastore uses a Derby database by default, as is the case when starting Metastore with either --local or --cluster. With --mysql and --tpcds, it uses a MySQL database.
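
The lookup described above can be sketched as a small shell helper. The function names metastore_driver and metastore_db_type are hypothetical, and the parsing is a simple line-based scan that assumes the value element follows the name element; it is not a full XML parser. The fallback to Derby mirrors the default behavior stated above.

```shell
# Print the <value> that follows the ConnectionDriverName <name> element, if any.
metastore_driver() {
  awk '/<name>javax\.jdo\.option\.ConnectionDriverName<\/name>/ { found = 1 }
       found && /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); print; exit }' "$1"
}

# Classify the database type from the driver class name.
metastore_db_type() {
  driver=$(metastore_driver "$1")
  case "$driver" in
    "")      echo derby ;;      # key missing: Derby is the default
    *mysql*) echo mysql ;;      # e.g. a MySQL Connector/J driver class
    *)       echo "$driver" ;;  # unknown driver: report the class name as-is
  esac
}
```

Calling metastore_db_type with the path of the hive-site.xml in use prints derby, mysql, or the raw driver class name.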

In order to use a MySQL database, the user (who starts Metastore) should have access to the database with a user name and a password, which should be explicitly set in hive-site.xml:
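
A quick way to confirm that both credentials are present is sketched below. It assumes the standard Hive keys javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword, which this page does not name explicitly; has_mysql_credentials is a hypothetical helper, and it checks only that the keys exist, not that the values are valid.

```shell
# Succeed only if hive-site.xml (given as $1) declares both credential keys.
has_mysql_credentials() {
  for key in javax.jdo.option.ConnectionUserName javax.jdo.option.ConnectionPassword; do
    if ! grep -qF "<name>$key</name>" "$1"; then
      echo "missing key: $key" >&2
      return 1
    fi
  done
}
```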


Here are examples of starting Metastore for the first time:

hive/ --local --hivesrc1
hive/ --mysql --hivesrc2 --init-schema
hive/ --tpcds --hivesrc3 --init-schema

By default, Metastore writes its log to /tmp/<user name>/hive.log. Below is an example of the messages printed to the log file when Metastore starts successfully:

2018-03-12T14:52:24,611  INFO [main] metastore.HiveMetaStore: Started the new metaserver on port [9830]...
2018-03-12T14:52:24,611  INFO [main] metastore.HiveMetaStore: Options.minWorkerThreads = 200
2018-03-12T14:52:24,611  INFO [main] metastore.HiveMetaStore: Options.maxWorkerThreads = 1000
2018-03-12T14:52:24,611  INFO [main] metastore.HiveMetaStore: TCP keepalive = true

Note that all instances of Metastore started by the same user share the same log file.
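
Because the log is shared, it can help to check it mechanically after each start. The following is a minimal sketch that relies only on the two message formats quoted on this page; the function names are hypothetical.

```shell
# Print the port from the most recent "Started the new metaserver" message.
metastore_started_port() {
  sed -n 's/.*Started the new metaserver on port \[\([0-9][0-9]*\)\].*/\1/p' "$1" | tail -n 1
}

# Succeed if the log contains the error reported when the schema is not initialized.
metastore_schema_missing() {
  grep -qF 'Version information not found in metastore' "$1"
}
```

For example, metastore_started_port "/tmp/$(id -un)/hive.log" prints the port of the latest successful start, assuming the default log location. If metastore_schema_missing succeeds on the same file, rerunning with --init-schema is the likely remedy.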