The user may use any client program to connect to HiveServer2,
including the script hive/run-beeline.sh
in the MR3 release (see Running Beeline).
The MR3 release provides another script kubernetes/hive/hive/run-beeline.sh
which slightly simplifies the process of configuring Beeline.
Using run-beeline.sh
In order to use kubernetes/hive/hive/run-beeline.sh, first
install Hive on MR3 on the node where Beeline is to be run,
and populate the directory kubernetes/hive/
by executing build-k8s.sh
as explained in Installing on Kubernetes.
(For the purpose of running Beeline only, there is no need to build a Docker image.)
Then copy the directory kubernetes/conf to kubernetes/hive/:
$ cp -r kubernetes/conf kubernetes/hive/
Finally create a new file kubernetes/hive/env.sh
(which is read by kubernetes/hive/hive/run-beeline.sh)
and set the following environment variables:
$ kubernetes/hive/env.sh
# set JAVA_HOME if not set yet
export JAVA_HOME=/usr/apps/java/default
export PATH=$JAVA_HOME/bin:$PATH
HIVE_SERVER2_HOST=10.1.91.41
HIVE_SERVER2_PORT=9852
# HIVE_SERVER2_JDBC_OPTS="ssl=true;sslTrustStore=/home/hive/mr3-run/kubernetes/beeline-ssl.jks;trustStorePassword=beelinepasswd1;transportMode=http;httpPath=/cliservice"
HIVE_SERVER2_AUTHENTICATION=KERBEROS
HIVE_SERVER2_KERBEROS_PRINCIPAL=hive/indigo20@RED
HIVE_CLIENT_HEAPSIZE=2048
LOG_LEVEL=INFO
- HIVE_SERVER2_HOST and HIVE_SERVER2_PORT specify the address of HiveServer2.
- An optional environment variable HIVE_SERVER2_JDBC_OPTS specifies a string to be appended to the JDBC connection string.
  - If SSL is enabled, it should contain, e.g.,
    ssl=true;sslTrustStore=/home/hive/mr3-run/kubernetes/beeline-ssl.jks;trustStorePassword=beelinepasswd1
  - If HTTP transport is used, it should contain
    transportMode=http;httpPath=/cliservice
- HIVE_SERVER2_AUTHENTICATION specifies the authentication option for HiveServer2: NONE, NOSASL, KERBEROS, LDAP, PAM, or CUSTOM.
- HIVE_SERVER2_KERBEROS_PRINCIPAL specifies the principal for HiveServer2 when HIVE_SERVER2_AUTHENTICATION is set to KERBEROS. Note that HIVE_SERVER2_KERBEROS_KEYTAB, which specifies the keytab file for HiveServer2, is not used for running Beeline.
- HIVE_CLIENT_HEAPSIZE specifies the heap size (in MB) for Beeline.
- LOG_LEVEL specifies the logging level.
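To make the role of these variables concrete, the following sketch shows one way they could be combined into a JDBC connection string. The function build_jdbc_url is hypothetical and is not taken from run-beeline.sh, whose actual logic may differ. It assumes the standard Hive JDBC form jdbc:hive2://host:port/ followed by semicolon-separated parameters, with principal= added only for Kerberos.

```shell
# Hypothetical helper (an assumption, not the logic of run-beeline.sh):
# assemble a Hive JDBC URL from the variables set in env.sh.
build_jdbc_url() {
  url="jdbc:hive2://${HIVE_SERVER2_HOST}:${HIVE_SERVER2_PORT}/"
  # Add the HiveServer2 principal only when Kerberos authentication is used.
  if [ "${HIVE_SERVER2_AUTHENTICATION:-NONE}" = "KERBEROS" ]; then
    url="${url};principal=${HIVE_SERVER2_KERBEROS_PRINCIPAL}"
  fi
  # Append the optional SSL/HTTP settings if any are given.
  if [ -n "${HIVE_SERVER2_JDBC_OPTS:-}" ]; then
    url="${url};${HIVE_SERVER2_JDBC_OPTS}"
  fi
  printf '%s\n' "$url"
}

# Values copied from the env.sh example above.
HIVE_SERVER2_HOST=10.1.91.41
HIVE_SERVER2_PORT=9852
HIVE_SERVER2_AUTHENTICATION=KERBEROS
HIVE_SERVER2_KERBEROS_PRINCIPAL=hive/indigo20@RED
HIVE_SERVER2_JDBC_OPTS=""
build_jdbc_url
```

Printing the URL before connecting is a cheap way to check that HIVE_SERVER2_JDBC_OPTS contains the intended SSL or HTTP settings.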
In order to start a Beeline connection, execute kubernetes/hive/hive/run-beeline.sh
with the following options:
--hiveconf <key>=<value> # Add a configuration key/value; may be repeated at the end.
<Beeline option> # Add a Beeline option; may be repeated at the end.
The user can also execute Beeline directly after setting the environment variables JAVA_HOME and HADOOP_HOME.
Here is an example.
$ export JAVA_HOME=/usr/lib/jdk1.8.0_231/
$ export HADOOP_HOME=/home/hive/hadoop-3.3.1
$ kubernetes/hive/hive/apache-hive/bin/beeline -u "jdbc:hive2://orange1:9852/;;;;" -n hive -p hive
$ kubernetes/hive/hive/apache-hive/bin/beeline -u "jdbc:hive2://orange1:10001/;;;transportMode=http;httpPath=/cliservice" -n hive -p hive
$ kubernetes/hive/hive/apache-hive/bin/beeline -u "jdbc:hive2://orange1:9852/;;;ssl=true;sslTrustStore=/home/hive/mr3-run/kubernetes/beeline-ssl.jks;trustStorePassword=beelinepasswd1" -n hive -p hive
$ kubernetes/hive/hive/apache-hive/bin/beeline -u "jdbc:hive2://orange1:10001/;;;ssl=true;sslTrustStore=/home/hive/mr3-run/kubernetes/beeline-ssl.jks;trustStorePassword=beelinepasswd1;transportMode=http;httpPath=/cliservice" -n hive -p hive
Using Kerberos
If HiveServer2 uses Kerberos, Beeline uses the Kerberos ticket provided by the user in order to authenticate itself to HiveServer2.
Hence the Kerberos ticket should be valid at the time of executing the script.
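Since the ticket must be valid when the script starts, it can help to check the ticket cache first. The sketch below assumes the MIT Kerberos klist, whose -s option runs silently and exits with a non-zero status if the credentials cache cannot be read or has expired. Other klist implementations may not support -s. The function name have_valid_ticket is hypothetical.

```shell
# Hypothetical pre-flight check (assumes MIT Kerberos klist with the -s option).
# Returns 0 only if klist is available and reports a readable, unexpired cache.
have_valid_ticket() {
  command -v klist >/dev/null 2>&1 && klist -s
}

if have_valid_ticket; then
  echo "Kerberos ticket found"
else
  echo "No valid Kerberos ticket; run kinit first" >&2
fi
```

A wrapper around run-beeline.sh could call have_valid_ticket and stop early instead of waiting for the connection attempt to fail.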
If HiveServer2 does not use Kerberos, the script reads the environment variable USER
for both the user name and the password.
In order to override them, the user can supply Beeline options, as in kubernetes/hive/hive/run-beeline.sh -n username_foo -p password_bar.
When HiveServer2 uses Kerberos, Beeline may fail with org.ietf.jgss.GSSException
even if a valid Kerberos ticket is available:
javax.security.sasl.SaslException: GSS initiate failed
...
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) ~[?:1.8.0_112]
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) ~[?:1.8.0_112]
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187) ~[?:1.8.0_112]
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) ~[?:1.8.0_112]
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) ~[?:1.8.0_112]
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) ~[?:1.8.0_112]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192) ~[?:1.8.0_112]
In such a case, setting the Java property javax.security.auth.useSubjectCredsOnly to false may work.
For example, the user can execute the following line before running Beeline:
$ export HADOOP_OPTS="$HADOOP_OPTS -Djavax.security.auth.useSubjectCredsOnly=false"
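If the export line is placed in a file that is sourced more than once, the same flag accumulates in HADOOP_OPTS. The following sketch appends the flag only when it is not already present. The helper name add_hadoop_opt is hypothetical and is not part of the MR3 release.

```shell
# Hypothetical helper: append a JVM flag to HADOOP_OPTS only once.
add_hadoop_opt() {
  case " ${HADOOP_OPTS:-} " in
    *" $1 "*) ;;  # flag already present, nothing to do
    *) HADOOP_OPTS="${HADOOP_OPTS:+$HADOOP_OPTS }$1" ;;
  esac
  export HADOOP_OPTS
}

add_hadoop_opt "-Djavax.security.auth.useSubjectCredsOnly=false"
```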
Displaying progress bars
In order to display progress bars in Beeline output, update hive-site.xml as follows:
- hive.server2.logging.operation.enabled should be set to true.
- hive.server2.logging.operation.log.location should be set to a path for which the user has write permission.
- In Hive 3 on MR3, hive.async.log.enabled should be set to true.
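As an illustration of these settings, the sketch below prints the corresponding property entries in the usual hive-site.xml layout so that they can be pasted into the file. The helper hive_property is hypothetical, and the path /tmp/hive/operation_logs is only a placeholder for a directory writable by the user.

```shell
# Hypothetical helper: print one <property> entry for hive-site.xml.
hive_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

hive_property hive.server2.logging.operation.enabled true
# Placeholder path (an assumption); use any directory the user can write to.
hive_property hive.server2.logging.operation.log.location /tmp/hive/operation_logs
# Needed in Hive 3 on MR3, per the list above.
hive_property hive.async.log.enabled true
```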