To install Spark on MR3 on Hadoop,
first set up Spark on MR3.
Then set the following environment variables in env.sh.
$ vi env.sh
export HADOOP_HOME=${HADOOP_HOME:-/usr/lib/hadoop}
HDFS_LIB_DIR=/user/$USER/lib
HADOOP_HOME_LOCAL=$HADOOP_HOME
HADOOP_NATIVE_LIB=$HADOOP_HOME_LOCAL/lib/native
SECURE_MODE=false
USER_PRINCIPAL=spark@HADOOP
USER_KEYTAB=/home/spark/spark.keytab
MR3_TEZ_ENABLED=false
MR3_SPARK_ENABLED=true
HDFS_LIB_DIR specifies the directory on HDFS to which MR3 jar files are uploaded. Hence it is used only in non-local mode.
HADOOP_HOME_LOCAL specifies the directory of the Hadoop installation to use in local mode, in which everything runs on a single machine and Yarn is not required.
SECURE_MODE specifies whether or not the cluster is secured with Kerberos.
USER_PRINCIPAL and USER_KEYTAB specify the principal and keytab file for the user executing Spark.
MR3_TEZ_ENABLED and MR3_SPARK_ENABLED specify which internal runtime (Tez or Spark) to use in MR3.
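Before uploading anything, it can help to sanity-check these settings. The following is only a minimal sketch: check_mr3_env is a hypothetical helper, not part of the MR3 distribution. It assumes that Spark on MR3 wants MR3_SPARK_ENABLED=true with MR3_TEZ_ENABLED=false (as in the listing above), and that the keytab file matters only when SECURE_MODE=true.

```shell
# Hypothetical helper (not shipped with MR3): checks a few env.sh settings.
# Run it in a shell where the variables from env.sh are already set.
check_mr3_env() {
  _mr3_status=0

  # Assumption: for Spark on MR3, the Spark runtime is on and the Tez runtime is off.
  if [ "$MR3_SPARK_ENABLED" != "true" ] || [ "$MR3_TEZ_ENABLED" != "false" ]; then
    echo "expected MR3_SPARK_ENABLED=true and MR3_TEZ_ENABLED=false"
    _mr3_status=1
  fi

  # Assumption: the keytab file is needed only on a Kerberos-secured cluster.
  if [ "$SECURE_MODE" = "true" ] && [ ! -r "$USER_KEYTAB" ]; then
    echo "keytab file not readable: $USER_KEYTAB"
    _mr3_status=1
  fi

  return $_mr3_status
}
```

The function prints one line per problem it finds and returns a non-zero status if there is any, so it can be used as a guard, for example `check_mr3_env && echo "env.sh looks consistent"`.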
Then the user should copy all the jar files (of MR3, Spark-MR3, and Spark) to HDFS by running the following scripts.
$ mr3/upload-hdfslib-mr3.sh
$ spark/upload-hdfslib-spark.sh