Running Beeline

While the user may use any client program to connect to HiveServer2, the MR3 release provides a script for creating Beeline connections based on the configuration files in the installation.

Running Beeline on the same node where HiveServer2 runs does not require any change to env.sh. In order to run Beeline on a different node, however, the user should install Hive on MR3 on the new node and set the following environment variables in env.sh:

# set JAVA_HOME if not set yet 
export JAVA_HOME=/usr/apps/java/default
export PATH=$JAVA_HOME/bin:$PATH

HIVE1_SERVER2_HOST=red0
HIVE1_SERVER2_PORT=9812

HIVE2_SERVER2_HOST=red0
HIVE2_SERVER2_PORT=9822

HIVE3_SERVER2_HOST=red0
HIVE3_SERVER2_PORT=9832

HIVE4_SERVER2_HOST=red0
HIVE4_SERVER2_PORT=9842

HIVE_SERVER2_AUTHENTICATION=KERBEROS
HIVE_SERVER2_KERBEROS_PRINCIPAL=hive/red0@RED

HIVE_CLIENT_HEAPSIZE=2048

LOG_LEVEL=INFO

  • HIVE1_SERVER2_HOST and HIVE1_SERVER2_PORT specify the address of HiveServer2 based on Hive 1.2.2. Note that $HOSTNAME should not be used for HIVE1_SERVER2_HOST if Beeline is running on a different node. The same applies to HIVE2_SERVER2_HOST, HIVE3_SERVER2_HOST, and HIVE4_SERVER2_HOST.
  • HIVE_SERVER2_AUTHENTICATION specifies the authentication option for HiveServer2: one of NONE, NOSASL, KERBEROS, LDAP, PAM, or CUSTOM.
  • HIVE_SERVER2_KERBEROS_PRINCIPAL specifies the principal for HiveServer2 when HIVE_SERVER2_AUTHENTICATION is set to KERBEROS. Note that HIVE_SERVER2_KERBEROS_KEYTAB, which specifies the keytab file for HiveServer2, is not used when running Beeline.
  • HIVE_CLIENT_HEAPSIZE specifies the heap size (in megabytes) for Beeline.
  • LOG_LEVEL specifies the logging level.
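
To see how these variables relate to a Beeline connection, the following sketch composes a HiveServer2 JDBC URL in the standard jdbc:hive2:// form from the values above. The helper build_jdbc_url is hypothetical and not part of the MR3 release; whether run-beeline.sh builds its URL in exactly this way is an assumption.

```shell
# Values copied from the env.sh example above.
HIVE3_SERVER2_HOST=red0
HIVE3_SERVER2_PORT=9832
HIVE_SERVER2_AUTHENTICATION=KERBEROS
HIVE_SERVER2_KERBEROS_PRINCIPAL=hive/red0@RED

# Hypothetical helper (not part of MR3): print a JDBC URL for a host and port.
# With Kerberos, the HiveServer2 principal is appended as a session variable.
build_jdbc_url() {
  url="jdbc:hive2://$1:$2/"
  if [ "$HIVE_SERVER2_AUTHENTICATION" = "KERBEROS" ]; then
    url="${url};principal=${HIVE_SERVER2_KERBEROS_PRINCIPAL}"
  fi
  printf '%s\n' "$url"
}

# prints jdbc:hive2://red0:9832/;principal=hive/red0@RED
build_jdbc_url "$HIVE3_SERVER2_HOST" "$HIVE3_SERVER2_PORT"
```

The host in the URL is the node where HiveServer2 runs (red0 here), which is why $HOSTNAME is not appropriate on a remote Beeline node.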

In order to start a Beeline connection, execute hive/run-beeline.sh with the following options:

--local                   # Run jobs with configurations in conf/local/ (default).
--cluster                 # Run jobs with configurations in conf/cluster/.
--tpcds                   # Run jobs with configurations in conf/tpcds/.
--hivesrc1                # Choose hive1-mr3 (based on Hive 1.2.2).
--hivesrc2                # Choose hive2-mr3 (based on Hive 2.3.5).
--hivesrc3                # Choose hive3-mr3 (based on Hive 3.1.1) (default).
--hivesrc4                # Choose hive4-mr3 (based on Hive 4.0.0-SNAPSHOT).
--hiveconf <key>=<value>  # Add a configuration key/value; may be repeated at the end.
<Beeline option>          # Add a Beeline option; may be repeated at the end.

The user can append as many Beeline options (as accepted by the beeline command from Hive) as necessary to the end of the command.
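
As an illustration of combining these options, the sketch below connects with the configuration in conf/cluster/ to HiveServer2 from hive3-mr3, overrides one configuration key, and passes the Beeline option -e at the end. It assumes that the current directory is the root of the MR3 installation; the key hive.exec.parallel and the query are chosen only as examples.

```shell
# Script options first, then --hiveconf pairs, then Beeline options at the end.
beeline_args="--cluster --hivesrc3 --hiveconf hive.exec.parallel=true"

# Assumption: run from the root of the MR3 installation.
if [ -x hive/run-beeline.sh ]; then
  # $beeline_args is intentionally unquoted so that it splits into separate options.
  hive/run-beeline.sh $beeline_args -e "show databases;"
else
  echo "hive/run-beeline.sh not found; run from the MR3 installation directory" >&2
fi
```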

In a secure cluster with Kerberos, Beeline uses the Kerberos ticket provided by the user in order to authenticate itself to HiveServer2. Hence the Kerberos ticket should be valid at the time of executing the script. In a non-secure cluster without Kerberos, the script reads the environment variable USER for both the user name and the password. In order to override them, the user can supply Beeline options, as in hive/run-beeline.sh -n username_foo -p password_bar.
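
In a secure cluster, the user may want to confirm that a valid ticket exists before executing the script. A minimal sketch, assuming the MIT Kerberos klist command is installed; have_kerberos_ticket is a hypothetical helper and not part of the MR3 release.

```shell
# Hypothetical helper (not part of MR3).
# 'klist -s' prints nothing and exits with status 0 only if the credential
# cache contains a valid (unexpired) ticket.
have_kerberos_ticket() {
  if command -v klist >/dev/null 2>&1 && klist -s; then
    return 0
  else
    return 1
  fi
}

if have_kerberos_ticket; then
  echo "valid Kerberos ticket found"
else
  echo "no valid Kerberos ticket; run kinit before hive/run-beeline.sh" >&2
fi
```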

In a secure cluster with Kerberos, Beeline may fail with org.ietf.jgss.GSSException even if a valid Kerberos ticket is available:

javax.security.sasl.SaslException: GSS initiate failed
...
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
  at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) ~[?:1.8.0_112]
  at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) ~[?:1.8.0_112]
  at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187) ~[?:1.8.0_112]
  at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) ~[?:1.8.0_112]
  at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) ~[?:1.8.0_112]
  at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) ~[?:1.8.0_112]
  at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192) ~[?:1.8.0_112]

In such a case, setting the Java property javax.security.auth.useSubjectCredsOnly to false may resolve the problem. For example, the user can execute the following line before running Beeline:

export HADOOP_OPTS="$HADOOP_OPTS -Djavax.security.auth.useSubjectCredsOnly=false"
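
If the line above is placed in a file that is sourced repeatedly, the same flag is appended to HADOOP_OPTS each time. The following sketch adds the flag only when it is not yet present; add_hadoop_opt is a hypothetical helper and not part of the MR3 release.

```shell
# Hypothetical helper (not part of MR3): append a JVM option to HADOOP_OPTS
# unless the exact same option is already present.
add_hadoop_opt() {
  case " ${HADOOP_OPTS:-} " in
    *" $1 "*) ;;                                        # already present
    *) HADOOP_OPTS="${HADOOP_OPTS:+$HADOOP_OPTS }$1" ;; # append with a space
  esac
  export HADOOP_OPTS
}

add_hadoop_opt "-Djavax.security.auth.useSubjectCredsOnly=false"
```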

In order to show progress bars in Beeline output, update hive-site.xml as follows:

  • hive.server2.logging.operation.enabled should be set to true.
  • hive.server2.logging.operation.log.location should be set to a path for which the user has write permission.
  • In Hive 2.3.5 on MR3, hive.async.log.enabled should be set to false.
  • In Hive 3.1.1 on MR3, hive.async.log.enabled should be set to true.

Hive 1.2.2 on MR3 does not support progress bars in Beeline.