Integrating Apache Ranger

Hive on MR3 integrates with Ranger in exactly the same way as Hive on Tez. Below we illustrate how to integrate Ranger into an installation of Hive on MR3 in a Kerberos-enabled secure cluster. In a non-secure cluster without Kerberos, the user can skip the steps related to Kerberos tickets. As a running example, we assume that a Ranger policy RED_hive is already active and that user hive starts HiveServer2. We have tested with Ranger 0.7, but the procedure should apply equally to other versions.

1. Extend HIVE_MYSQL_DRIVER in env.sh

Extend the variable HIVE_MYSQL_DRIVER in env.sh to include the path to the Ranger jar files, e.g.:

HIVE_MYSQL_DRIVER=/usr/share/java/mysql-connector-java.jar:/usr/hdp/2.6.4.0-91/ranger-hive-plugin/lib/ranger-hive-plugin-impl/*
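
A mistyped entry in HIVE_MYSQL_DRIVER may only surface later as a class-loading error, so it can help to check the value before starting HiveServer2. The following is a minimal sketch and not part of MR3: check_classpath is a hypothetical helper that splits a colon-separated value and reports entries that do not exist, treating a trailing /* as a Java classpath wildcard that must match at least one jar file.

```shell
# Hypothetical helper (not part of MR3): check every entry of a
# colon-separated classpath. Returns non-zero if any entry is missing.
check_classpath() {
  rest=$1
  status=0
  while [ -n "$rest" ]; do
    entry=${rest%%:*}                 # text before the first colon
    if [ "$entry" = "$rest" ]; then
      rest=''                         # last entry, nothing left to split
    else
      rest=${rest#*:}                 # drop the entry and its colon
    fi
    [ -n "$entry" ] || continue       # skip empty entries
    case $entry in
      */\*)
        # Trailing /* is a classpath wildcard: the directory must hold jars.
        dir=${entry%/\*}
        if ls "$dir"/*.jar >/dev/null 2>&1; then
          echo "ok       $entry"
        else
          echo "MISSING  $entry"
          status=1
        fi
        ;;
      *)
        if [ -e "$entry" ]; then
          echo "ok       $entry"
        else
          echo "MISSING  $entry"
          status=1
        fi
        ;;
    esac
  done
  return $status
}
```

For the value shown above, the call would be check_classpath "/usr/share/java/mysql-connector-java.jar:/usr/hdp/2.6.4.0-91/ranger-hive-plugin/lib/ranger-hive-plugin-impl/*", which prints one line per entry.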

2. Copy the configuration files for the Hive plugin of Ranger

Locate the following configuration files for the Hive plugin of Ranger (which are typically found under /etc/hive) and make sure that they are readable by user hive:

  • ranger-hive-audit.xml
  • ranger-hive-security.xml
  • ranger-policymgr-ssl.xml

Then either copy these files to the configuration directory or create symbolic links to them there. For example, in order to run HiveServer2 with --mysql --hivesrc5, we could create links in the directory conf/mysql/hive5:

$ ln -s /etc/hive/2.6.4.0-91/0/conf.server/ranger-hive-audit.xml ranger-hive-audit.xml
$ ln -s /etc/hive/2.6.4.0-91/0/conf.server/ranger-hive-security.xml ranger-hive-security.xml
$ ln -s /etc/hive/2.6.4.0-91/0/conf.server/ranger-policymgr-ssl.xml ranger-policymgr-ssl.xml
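
The readability check and the three links can also be combined in one loop. This is only a sketch: link_ranger_conf is a hypothetical helper, and it tests readability for whichever user runs it, so it should be run as user hive.

```shell
# Hypothetical helper: link the three Ranger plugin files from src_dir
# into dest_dir, refusing to link a file the current user cannot read.
link_ranger_conf() {
  src_dir=$1
  dest_dir=$2
  for f in ranger-hive-audit.xml ranger-hive-security.xml ranger-policymgr-ssl.xml; do
    if [ ! -r "$src_dir/$f" ]; then
      echo "not readable: $src_dir/$f" >&2
      return 1
    fi
    # -f replaces a stale link left over from an earlier run
    ln -sf "$src_dir/$f" "$dest_dir/$f"
  done
}
```

With the paths from the example above, the call would be link_ranger_conf /etc/hive/2.6.4.0-91/0/conf.server conf/mysql/hive5. The source directory should be an absolute path, because a relative link target is resolved relative to the directory holding the link.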

3. Set the Kerberos principal and the keytab file

ranger-hive-audit.xml sets the configuration key xasecure.audit.jaas.Client.option.keyTab to the path of a keytab file:

<property>
  <name>xasecure.audit.jaas.Client.option.keyTab</name>
  <value>/etc/security/keytabs/hive.service.keytab</value>
</property>

Retrieve the Kerberos principal (e.g., hive/red0@RED) from the keytab file, and update env.sh as follows:

HIVE_SERVER2_KERBEROS_PRINCIPAL=hive/red0@RED
HIVE_SERVER2_KERBEROS_KEYTAB=/etc/security/keytabs/hive.service.keytab
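
One way to retrieve the principal is sketched below. The helpers xml_property_value and list_keytab_principals are hypothetical names introduced for illustration. The sketch assumes that each <name> and <value> element sits on its own line, as in the listing above, and that the MIT Kerberos klist command is installed.

```shell
# Hypothetical helper: print the <value> that follows <name>KEY</name>
# in a Hadoop-style XML file. Usage: xml_property_value FILE KEY
xml_property_value() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")      # strip everything up to the opening tag
      sub(/<\/value>.*/, "")    # strip the closing tag and what follows
      print
      exit
    }
  ' "$1"
}

# Hypothetical helper: list the principals stored in the keytab that
# ranger-hive-audit.xml points to. klist -kt prints keytab entries
# together with their timestamps.
list_keytab_principals() {
  keytab=$(xml_property_value "$1" xasecure.audit.jaas.Client.option.keyTab)
  klist -kt "$keytab"
}
```

For example, list_keytab_principals conf/mysql/hive5/ranger-hive-audit.xml lists the entries of the keytab file, from which a principal such as hive/red0@RED can be read off.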

4. Check the directory containing the policy cache

ranger-hive-security.xml sets a configuration key ranger.plugin.hive.policy.cache.dir to a directory containing the policy cache:

<property>
  <name>ranger.plugin.hive.policy.cache.dir</name>
  <value>/etc/ranger/RED_hive/policycache</value>
</property>

Make sure that the directory is accessible to user hive. Depending on the Ranger configuration, HiveServer2 may read a few more files (e.g., /etc/ranger/RED_hive/cred.jceks). Make sure that they are also accessible to user hive.
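
A quick way to check access is to run a small test as user hive. The sketch below uses a hypothetical helper, check_accessible, which tests only the current user's permissions: directories must be readable and searchable, and regular files must be readable. It does not test write access, so add a -w test if your Ranger setup needs the plugin to write to the directory.

```shell
# Hypothetical helper: report whether the current user can access each
# given path. Returns non-zero if any path is not accessible.
check_accessible() {
  status=0
  for p in "$@"; do
    if [ -d "$p" ] && [ -r "$p" ] && [ -x "$p" ]; then
      echo "ok: $p"                  # directory: readable and searchable
    elif [ -f "$p" ] && [ -r "$p" ]; then
      echo "ok: $p"                  # regular file: readable
    else
      echo "not accessible: $p"
      status=1
    fi
  done
  return $status
}
```

Run as user hive, the call for the running example would be check_accessible /etc/ranger/RED_hive/policycache /etc/ranger/RED_hive/cred.jceks.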

5. Update hive-site.xml to use Ranger

Set configuration keys specific to Ranger in hive-site.xml. For example, in order to run HiveServer2 with --mysql --hivesrc5, we could add the following entries in conf/mysql/hive5/hive-site.xml:

<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>

<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>

<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizerFactory</value>
</property>

<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>

<property>
  <name>hive.conf.restricted.list</name>
  <value>hive.security.authorization.enabled,hive.security.authorization.manager,hive.security.authenticator.manager</value>
</property>

Note that it is okay to set hive.server2.enable.doAs to true because enabling impersonation is orthogonal to using Ranger.
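
Before starting HiveServer2, it can be worth confirming that none of the authorization keys was left out. The sketch below uses a hypothetical helper, missing_ranger_keys, which checks only that a <name> entry is present and does not validate the values. It skips hive.server2.enable.doAs because that setting is independent of Ranger.

```shell
# Hypothetical helper: print each authorization key from this step that
# has no <name> entry in the given hive-site.xml.
missing_ranger_keys() {
  for key in \
    hive.security.authorization.enabled \
    hive.security.authorization.manager \
    hive.security.authenticator.manager \
    hive.conf.restricted.list
  do
    # -F matches the string literally, so dots are not regex wildcards
    grep -qF "<name>$key</name>" "$1" || echo "$key"
  done
}
```

For the running example, missing_ranger_keys conf/mysql/hive5/hive-site.xml prints nothing when all four keys are present.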

6. Run HiveServer2 and Beeline

Run HiveServer2 as user hive, e.g.:

[hive@red0 mr3-run]$ hive/hiveserver2-service.sh start --mysql --hivesrc5 --tezsrc3

Run Beeline using the Kerberos principal retrieved from the keytab file:

[hive@red0 mr3-run]$ hive/hivejar/apache-hive-3.0.0-bin/bin/beeline -u "jdbc:hive2://red0:9852/;principal=hive/red0@RED;"
Beeline version 3.0.0 by Apache Hive
0: jdbc:hive2://red0:9852/> show databases;
+---------------------------------+
|          database_name          |
+---------------------------------+
| default                         |
| tpcds_bin_partitioned_orc_1000  |
+---------------------------------+

A user without the required privilege, such as gitlab-runner in this example, is denied access:

[gitlab-runner@red0 mr3-run]$ hive/hivejar/apache-hive-3.0.0-bin/bin/beeline -u "jdbc:hive2://red0:9852/;principal=hive/red0@RED;"
Beeline version 3.0.0 by Apache Hive
0: jdbc:hive2://red0:9852/> show databases;
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [gitlab-runner] does not have [USE] privilege on [Unknown resource!!] (state=42000,code=40000)