On Hadoop, Spark on MR3 can run in client mode, in which the Spark driver runs not in a Yarn container but as an ordinary process on the node where Spark on MR3 is launched. (Spark on MR3 does not support cluster mode, in which the Spark driver would run in a Yarn container.)
Running the Spark driver
To run the Spark driver in client mode,
the user can execute the script spark/run-spark-shell.sh,
which in turn executes bin/spark-shell under the directory of the Spark installation.
The script accepts the following options.
--local: Spark on MR3 reads configuration files under the configuration directory for local mode.
--cluster (default): Spark on MR3 reads configuration files under the configuration directory for cluster mode.
--tpcds: Spark on MR3 reads configuration files under the configuration directory for running the TPC-DS benchmark.
--amprocess: DAGAppMaster runs in LocalProcess mode instead of Yarn mode. See DAGAppMaster and ContainerWorker Modes for more details on LocalProcess mode.
--conf (and any other options with arguments): These options are passed on to bin/spark-shell.
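For example, a hypothetical invocation might combine a configuration-directory option with a forwarded Spark property (spark.driver.memory is a standard Spark property; the particular combination here is only an illustration):

```shell
# Start the Spark shell using the cluster-mode configuration files,
# forwarding a Spark property to bin/spark-shell via --conf
$ spark/run-spark-shell.sh --cluster --conf spark.driver.memory=8g
```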
In order to connect to an existing DAGAppMaster running in Yarn mode (instead of creating a new one),
the user should set the configuration key
spark.mr3.appid to its application ID using the --conf option.
$ spark/run-spark-shell.sh --conf spark.mr3.appid=application_1620113436187_0291
By default, terminating a Spark driver does not delete the running DAGAppMaster.
In order to kill DAGAppMaster automatically when a Spark driver terminates,
the user can set the configuration key
spark.mr3.keep.am to false using the --conf option.
Note that setting
spark.mr3.keep.am to false is effective even when an existing DAGAppMaster is used.
$ spark/run-spark-shell.sh --conf spark.mr3.keep.am=false
$ spark/run-spark-shell.sh --conf spark.mr3.keep.am=false --conf spark.mr3.appid=application_1620113436187_0291
Stopping Spark on MR3
If the configuration key
spark.mr3.keep.am is set to the default value of true,
the user can manually kill DAGAppMaster by killing the corresponding Yarn application.
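Since DAGAppMaster in Yarn mode runs as a Yarn application, one way to kill it is the standard Yarn CLI; the application ID below is the example ID used earlier in this page:

```shell
# Kill the DAGAppMaster by killing its Yarn application
# (substitute the actual application ID of the running DAGAppMaster)
$ yarn application -kill application_1620113436187_0291
```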