Running Metastore

Hive can run only if Metastore is running. Any version of Metastore works with Hive on MR3, not only the versions included in the MR3 release. For example, if an instance of Metastore is already running, the user may reuse it instead of starting another instance. In a multi-user environment, the administrator user (e.g., hive) typically starts Metastore.

In order to run Metastore included in the MR3 release, set the following environment variables in as necessary:







Note that specifies a Metastore address (host and port) for each version of Hive because of the incompatibility between different versions of Metastore.

  • HIVE1_METASTORE_LOCAL_PORT specifies the port for Metastore running in local mode (in which everything runs on a single machine) with --hivesrc1. If the user does not need Hive on MR3 in local mode, this environment variable may be ignored.
  • HIVE1_DATABASE_NAME specifies the database name for Metastore running with --hivesrc1.
  • HIVE1_HDFS_WAREHOUSE specifies the directory for the Hive warehouse on HDFS for Metastore running in non-local mode with --hivesrc1. For local mode, Hive on MR3 creates a Hive warehouse under the installation directory. Note that different versions of Metastore can share the same Hive warehouse, while their databases cannot be shared.
  • The corresponding environment variables for --hivesrc2 and --hivesrc5 are used in the same way.

  • HIVE_METASTORE_HEAPSIZE specifies the heap size (in megabytes) for Metastore.
  • HIVE_METASTORE_KERBEROS_PRINCIPAL and HIVE_METASTORE_KERBEROS_KEYTAB specify the principal and the keytab file for Metastore. They correspond to the configuration keys hive.metastore.kerberos.principal and hive.metastore.kerberos.keytab.file in hive-site.xml.
  • HIVE_MYSQL_DRIVER specifies the path to a MySQL connector jar file which is necessary when using a MySQL database. One can download the official JDBC driver for MySQL at
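As an illustration of the variables described above, the following sketch shows how they might be set, assuming the configuration file uses shell variable syntax. Every value (port, database name, paths, principal) is a hypothetical placeholder and should be replaced with values from the local installation.

```shell
# Hypothetical values throughout; adjust to the local installation.

# Settings for Metastore running with --hivesrc1
HIVE1_METASTORE_LOCAL_PORT=9831            # placeholder port, only needed for local mode
HIVE1_DATABASE_NAME=hive1mr3               # placeholder database name
HIVE1_HDFS_WAREHOUSE=/user/hive/warehouse  # placeholder warehouse directory on HDFS

# Settings shared by all versions
HIVE_METASTORE_HEAPSIZE=4096               # heap size in megabytes
HIVE_METASTORE_KERBEROS_PRINCIPAL=hive/_HOST@EXAMPLE.COM                  # placeholder principal
HIVE_METASTORE_KERBEROS_KEYTAB=/etc/security/keytabs/hive.service.keytab  # placeholder keytab path
HIVE_MYSQL_DRIVER=/usr/share/java/mysql-connector-java.jar                # placeholder jar path
```

The Kerberos variables matter only on a secure cluster, and HIVE_MYSQL_DRIVER only when Metastore uses a MySQL database.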

In order to start Metastore, execute hive/ with the following options:

start                     # Start Metastore on port defined in HIVE?_METASTORE_PORT.
stop                      # Stop Metastore on port defined in HIVE?_METASTORE_PORT.
restart                   # Restart Metastore on port defined in HIVE?_METASTORE_PORT.
--local                   # Run jobs with configurations in conf/local/.
--cluster                 # Run jobs with configurations in conf/cluster/ (default).
--mysql                   # Run jobs with configurations in conf/mysql/.
--tpcds                   # Run jobs with configurations in conf/tpcds/.
--hivesrc1                # Choose hive1-mr3 (based on Hive 1.2.2) (default).
--hivesrc2                # Choose hive2-mr3 (based on Hive 2.3.3).
--hivesrc5                # Choose hive5-mr3 (based on Hive 3.0.0).
--init-schema             # Initialize the database schema.
--hiveconf <key>=<value>  # Add a configuration key/value.
<Metastore option>        # Add a Metastore option.

  • The user should use --init-schema to initialize the database schema when running Metastore for the first time. Otherwise, the script may fail with the following error message in the log:
    MetaException(message:Version information not found in metastore. )

    Initializing the database schema is also necessary for enabling ACID transactions in Hive.

  • If the database becomes corrupted, the user should delete it manually before restarting Metastore. For a Derby database, the user can simply delete the corresponding database directory, for example:
    rm -rf hive/hive-local-data/metastore/hive2mr3
    rm -rf hive/hive-local-data/metastore-cluster/hive2mr3

    For a MySQL database, the user should connect to the MySQL server and drop the database (e.g., with a DROP DATABASE statement).

  • The user can append any number of Metastore options (those accepted by the Hive command hive --service metastore) to the command.
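The schema error mentioned in the first item above can be detected by searching the log for the quoted message. Below is a minimal sketch; the function name needs_init_schema is hypothetical, and the log path is passed in by the caller.

```shell
# Return 0 if the log file ($1) contains the schema-version error,
# which suggests that Metastore was started without --init-schema.
needs_init_schema() {
  grep -q "Version information not found in metastore" "$1"
}

# Example use (the log path is supplied by the caller):
#   if needs_init_schema /path/to/hive.log; then
#     echo "Start Metastore again with --init-schema"
#   fi
```

The check matches only the message text, so it still works if the surrounding log format differs.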

To see the type of the database used by Metastore, find the configuration key javax.jdo.option.ConnectionDriverName in hive-site.xml. For example, with --tpcds, Metastore uses a MySQL database:


If the configuration key javax.jdo.option.ConnectionDriverName is missing in hive-site.xml, Metastore uses a Derby database by default, as is the case when starting Metastore with either --local or --cluster. With --mysql and --tpcds, it uses a MySQL database.
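The lookup described above can be sketched as a small shell helper. It assumes the usual hive-site.xml layout, in which each property has a name element followed by a value element, and it ignores XML comments, so a commented-out property would be misread. The function names are hypothetical.

```shell
# Print the value of javax.jdo.option.ConnectionDriverName in the
# hive-site.xml file given as $1, or nothing if the key is absent.
metastore_driver() {
  awk '
    /<name>javax\.jdo\.option\.ConnectionDriverName<\/name>/ { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")      # drop everything up to the opening tag
      sub(/<\/value>.*/, "")    # drop the closing tag and what follows
      print
      exit
    }
  ' "$1"
}

# Classify the database type from the driver class name.
# A missing key means the Derby default applies.
metastore_db_type() {
  driver=$(metastore_driver "$1")
  case "$driver" in
    "")      echo "derby (default)" ;;
    *mysql*) echo "mysql" ;;
    *derby*) echo "derby" ;;
    *)       echo "other: $driver" ;;
  esac
}
```

For example, metastore_db_type applied to the hive-site.xml of a configuration directory reports whether that configuration selects MySQL or falls back to Derby.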

In order to use a MySQL database, the user (who starts Metastore) should have access to the database with a user name and a password, which should be explicitly set in hive-site.xml:
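As a quick sanity check before starting Metastore with a MySQL database, one can verify that both credentials are present in hive-site.xml. The sketch below assumes the standard Hive property names javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword, which are not named in the text above; the function name is hypothetical.

```shell
# Print the name of each credential key that is missing from the
# hive-site.xml file given as $1. Prints nothing if both are present.
missing_mysql_keys() {
  for key in javax.jdo.option.ConnectionUserName \
             javax.jdo.option.ConnectionPassword; do
    # -F treats the pattern as a fixed string, so dots match literally.
    grep -qF "<name>${key}</name>" "$1" || echo "$key"
  done
}
```

The check only confirms that the keys exist. It does not verify that the user name and password are accepted by the MySQL server.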


Here are examples of starting Metastore for the first time:

hive/ start --local --hivesrc1
hive/ start --mysql --hivesrc2 --init-schema

By default, the log file for starting Metastore is written to /tmp/<user name>/hive.log. Below is an example of messages printed to the log file when Metastore starts successfully:

2018-03-12T14:52:24,611  INFO [main] metastore.HiveMetaStore: Started the new metaserver on port [9830]...
2018-03-12T14:52:24,611  INFO [main] metastore.HiveMetaStore: Options.minWorkerThreads = 200
2018-03-12T14:52:24,611  INFO [main] metastore.HiveMetaStore: Options.maxWorkerThreads = 1000
2018-03-12T14:52:24,611  INFO [main] metastore.HiveMetaStore: TCP keepalive = true

Note that all instances of Metastore started by the same user share the same log file.
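Because several instances may write to the same log file, it can help to extract the port from the most recent startup message. The following is a minimal sketch that relies on the message format shown in the example log above; the function name is hypothetical.

```shell
# Print the port from the last "Started the new metaserver" message in
# the log file ($1), or nothing if no such message has been logged.
metastore_started_port() {
  sed -n 's/.*Started the new metaserver on port \[\([0-9][0-9]*\)].*/\1/p' "$1" |
    tail -n 1
}

# Example use (the log path is supplied by the caller):
#   port=$(metastore_started_port /path/to/hive.log)
#   [ -n "$port" ] && echo "Metastore last started on port $port"
```

An empty result means only that no startup message was found. It does not by itself show that Metastore failed to start.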