On Hadoop, Spark on MR3 runs in client mode, in which the Spark driver executes not in a Yarn container but as an ordinary process on the node where Spark on MR3 is launched. Cluster mode, in which the Spark driver executes in a Yarn container, is not supported.
Running a Spark driver
To run a Spark driver (in client mode),
set the environment variables
SPARK_DRIVER_CORES and SPARK_DRIVER_MEMORY_MB to specify the values for the configuration keys
spark.driver.cores and spark.driver.memory (in MB), respectively, e.g.:
$ export SPARK_DRIVER_CORES=2
$ export SPARK_DRIVER_MEMORY_MB=13107
Then the user can execute a launch script such as
spark/run-spark-shell.sh,
which in turn executes the corresponding script (e.g. bin/spark-shell) under the directory of the Spark installation.
The scripts accept the following options.
--local: Spark on MR3 reads configuration files under the configuration directory for local mode.
--cluster (default): Spark on MR3 reads configuration files under the configuration directory for cluster mode.
--tpcds: Spark on MR3 reads configuration files under the configuration directory for TPC-DS benchmarking.
--amprocess: DAGAppMaster runs in LocalProcess mode instead of Yarn mode. See DAGAppMaster and ContainerWorker Modes for more details on LocalProcess mode.
--conf: These options, together with their arguments, are passed on to the underlying Spark script (e.g. bin/spark-shell).
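For example, the options can be combined in a single invocation. The following sketch (the host name gold0 follows the examples in this page) starts the Spark shell with the cluster configuration and runs DAGAppMaster in LocalProcess mode:

```
$ spark/run-spark-shell.sh --cluster --amprocess --conf spark.driver.host=gold0
```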
The user should use the
--conf spark.driver.host option to specify
the host name or address where the Spark driver runs.
To run multiple Spark drivers on the same node,
the user should also specify a unique port for each driver with the
--conf spark.driver.port option.
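When launching several drivers on one node, a free port can be picked programmatically before setting spark.driver.port. The snippet below is a minimal sketch, not part of Spark on MR3; it assumes python3 is available and reuses the gold0 host name from the examples in this page:

```shell
# Ask the OS for an unused TCP port by binding to port 0 (assumes python3 is installed)
DRIVER_PORT=$(python3 -c 'import socket; s = socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')
echo "spark.driver.port=$DRIVER_PORT"
# Then launch the driver with the chosen port:
# spark/run-spark-shell.sh --conf spark.driver.host=gold0 --conf spark.driver.port=$DRIVER_PORT
```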
In order to connect to an existing DAGAppMaster running in Yarn mode (instead of creating a new one),
the user should set the configuration key
spark.mr3.appid to its application ID using a
--conf option, e.g.:
$ spark/run-spark-shell.sh --conf spark.driver.host=gold0 --conf spark.mr3.appid=application_1620113436187_0291
By default, terminating a Spark driver does not delete the running DAGAppMaster.
In order to kill DAGAppMaster automatically when a Spark driver terminates,
the user can set the configuration key
spark.mr3.keep.am to false using the --conf option.
Note that setting
spark.mr3.keep.am to false is effective even when an existing DAGAppMaster is used.
$ spark/run-spark-shell.sh --conf spark.driver.host=gold0 --conf spark.mr3.keep.am=false
$ spark/run-spark-shell.sh --conf spark.driver.host=gold0 --conf spark.mr3.keep.am=false --conf spark.mr3.appid=application_1620113436187_0291
Stopping Spark on MR3
If the configuration key
spark.mr3.keep.am is set to the default value of true,
the user can manually kill DAGAppMaster by executing the command yarn application -kill <ApplicationID>.
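For example, to kill the DAGAppMaster with the application ID used in the examples above:

```
$ yarn application -kill application_1620113436187_0291
```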