With SSL Encryption
This page explains the additional steps for using SSL (Secure Sockets Layer) encryption in Hive on MR3. For simplicity, secure connections to the database servers for Metastore and Ranger are not enabled. See SSL Encryption for details.
basicsEnv: basics.T
We use secure connection to S3-compatible storage with HTTPS.
vi run.ts
const basicsEnv: basics.T = {
  s3aEndpoint: "https://orange0:9000",
  s3aEnableSsl: true,
The user should have a certificate for connecting to the storage.
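As a sanity check on the two fields above, the following minimal sketch flags the case where SSL is enabled but the endpoint does not use HTTPS. The interface `S3aSettings` and the function `s3aSslMismatch` are hypothetical names introduced only for illustration; they are not part of Hive on MR3.

```typescript
// Hypothetical helper, not part of Hive on MR3.
// Only the two fields shown in the excerpt above are modeled.
interface S3aSettings {
  s3aEndpoint: string;
  s3aEnableSsl: boolean;
}

// Returns a message if SSL is enabled but the endpoint is not HTTPS,
// and null otherwise.
function s3aSslMismatch(settings: S3aSettings): string | null {
  // URL.protocol includes the trailing colon, e.g. "https:".
  const protocol = new URL(settings.s3aEndpoint).protocol;
  if (settings.s3aEnableSsl && protocol !== "https:") {
    return `s3aEnableSsl is true but s3aEndpoint uses ${protocol}//`;
  }
  return null;
}

const problem = s3aSslMismatch({
  s3aEndpoint: "https://orange0:9000",
  s3aEnableSsl: true,
});
console.log(problem === null ? "S3A settings look consistent" : problem);
```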
metastoreEnv: metastore.T
We store the password of the MySQL server for Metastore in a KeyStore file to be created later. Internally, the configuration key javax.jdo.option.ConnectionPassword in hive-site.xml is set to _.
vi run.ts
const metastoreEnv: metastore.T = {
  userName: "root",
  password: "_",
hiveEnv: hive.T
We enable secure connection to public HiveServer2.
vi run.ts
const hiveEnv: hive.T = {
  enableSsl: true,
Setting enableSsl to true does not enable secure connection to internal HiveServer2, Metastore, and Ranger, which all run only inside the Kubernetes cluster. To enable secure connection to these components as well (which is usually unnecessary, e.g., because all these components run on the same node), the user should update the source code.
vi server/api/hive.ts
export interface T {
...
enableSslInternal: true;
vi server/validate/hive.ts
export function initial(): T {
...
enableSslInternal: false
With the default Docker image for Superset, connecting securely to internal HiveServer2 does not work.
workerEnv: worker.T
We enable secure shuffle in MR3 using SSL mode. Then all the ContainerWorker Pods for Hive (but not for Spark) communicate securely.
vi run.ts
enableShuffleSsl: true
Enabling secure shuffle is usually unnecessary because ContainerWorker Pods are not reachable from outside the Kubernetes cluster. Besides, it incurs a noticeable performance overhead.
secretEnv: secret.T
Create certificates and secrets by following the instructions in Creating certificates and secrets for SSL.
We set the ssl and shuffleSsl fields using the output files of generate-ssl.sh and the password set in PASSWORD.
vi run.ts
const secretEnv: secret.T = {
  ssl: {
    keystore: "hivemr3-ssl-certificate.jceks",
    truststore: "hivemr3-ssl-certificate.jks",
    password: "MySslPassword123",
    keystoreData: fs.readFileSync("hivemr3-ssl-certificate.jceks").toString("base64"),
    truststoreData: fs.readFileSync("hivemr3-ssl-certificate.jks").toString("base64")
  },
  shuffleSsl: {
    keystore: "mr3-keystore.jks",
    truststore: "mr3-truststore.jks",
    keystoreData: fs.readFileSync("mr3-keystore.jks").toString("base64"),
    truststoreData: fs.readFileSync("mr3-truststore.jks").toString("base64")
  },
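The four readFileSync calls above follow the same pattern, so they can be factored into a small helper that also fails early with a clear message when a file is missing. The sketch below is only an illustration: `readBase64` and `storePair` are hypothetical names, not part of Hive on MR3, and the sketch assumes the output files of generate-ssl.sh are in the current directory.

```typescript
import * as fs from "fs";

// Hypothetical helper: read a keystore or truststore file and
// return its contents encoded in base64.
function readBase64(path: string): string {
  if (!fs.existsSync(path)) {
    throw new Error(`file not found: ${path} (has generate-ssl.sh been run?)`);
  }
  return fs.readFileSync(path).toString("base64");
}

// Hypothetical helper: build the four fields shared by ssl and shuffleSsl.
function storePair(keystore: string, truststore: string) {
  return {
    keystore,
    truststore,
    keystoreData: readBase64(keystore),
    truststoreData: readBase64(truststore),
  };
}

// Example use inside run.ts (shown as a comment because the files
// must exist when the code runs):
//
//   ssl: {
//     ...storePair("hivemr3-ssl-certificate.jceks", "hivemr3-ssl-certificate.jks"),
//     password: "MySslPassword123",
//   },
//   shuffleSsl: storePair("mr3-keystore.jks", "mr3-truststore.jks"),
```

The resulting values are the same as in the explicit version; the only difference is that a missing file is reported by name before any field is assigned.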
Configuring Ranger
In the Ranger service, fill the JDBC URL field with:
jdbc:hive2://hiveserver2-internal.hivemr3.svc.cluster.local:9852/;principal=hive/hiveserver2-internal.hivemr3.svc.cluster.local@PL;
Note that we use internal HiveServer2, which does not use a secure connection by default.
Running queries
To send queries to public HiveServer2, the user should use the following JDBC URL:
jdbc:hive2://orange1:9852/;principal=hive/orange1@PL;ssl=true;sslTrustStore=/path/to/beeline-ssl.jks;trustStorePassword=beelinepassword;
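The URL above differs from the one given to Ranger only in the host and the three SSL parameters. A minimal sketch that assembles both forms from their parts is shown below; `JdbcUrlOptions` and `buildJdbcUrl` are hypothetical names introduced for illustration, and the host, principal, TrustStore path, and password are the example values from this page.

```typescript
// Hypothetical helper, not part of Hive on MR3 or the Hive JDBC driver.
interface JdbcUrlOptions {
  host: string;
  port: number;
  principal: string; // Kerberos principal of HiveServer2
  ssl?: {
    trustStorePath: string;
    trustStorePassword: string;
  };
}

// Joins the session parameters with ';' and keeps the trailing ';'
// used in the URLs on this page.
function buildJdbcUrl(opts: JdbcUrlOptions): string {
  const params: string[] = [`principal=${opts.principal}`];
  if (opts.ssl) {
    params.push(
      "ssl=true",
      `sslTrustStore=${opts.ssl.trustStorePath}`,
      `trustStorePassword=${opts.ssl.trustStorePassword}`,
    );
  }
  return `jdbc:hive2://${opts.host}:${opts.port}/;${params.join(";")};`;
}

// Public HiveServer2 with SSL, using the example values from this page.
const publicUrl = buildJdbcUrl({
  host: "orange1",
  port: 9852,
  principal: "hive/orange1@PL",
  ssl: {
    trustStorePath: "/path/to/beeline-ssl.jks",
    trustStorePassword: "beelinepassword",
  },
});
console.log(publicUrl);
```

Omitting the `ssl` field yields the non-SSL form used for internal HiveServer2 in the Ranger service.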