Question

I'd like to get the unique ID of a Spark job from the place where I run it.

Via the Spark master node's web UI, I can see that ID. It's something like:

ID: app-20140429125304-0452

Is there any way to get this when creating and running a job? Maybe via the SparkContext?

Solution

Yes, exactly as you said:

sc.applicationId
res0: String = app-20150224184813-11531

(This is Spark 1.2.)

See the SparkContext API documentation (linked below).
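
If you want to use the ID from inside the driver program itself, for example to make each run's output location unique, here is a minimal Scala sketch (the app name and output path are illustrative assumptions):

import org.apache.spark.{SparkConf, SparkContext}

object AppIdExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("app-id-example"))

    // applicationId is assigned by the cluster manager once the context has started
    val appId = sc.applicationId

    // Illustrative use: tag the output directory with the run's unique ID
    val output = s"/tmp/results-$appId"
    sc.parallelize(1 to 10).saveAsTextFile(output)

    println(s"Wrote results for application $appId to $output")
    sc.stop()
  }
}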

Other tips

For those using pyspark, see this nearly identical question: How to extract application ID from the PySpark context

The answer from @vvladymyrov worked for me running pyspark in yarn-client mode.

>>> sc._jsc.sc().applicationId()
u'application_1433865536131_34483'

With the introduction of spark (org.apache.spark.sql.SparkSession) in Spark 2.0+, use:

scala> spark.sparkContext.applicationId
res1: String = app-20170228091742-0025
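
For a standalone Spark 2.0+ application (rather than the shell), a minimal sketch of the same lookup through the SparkSession builder (the app name is an illustrative assumption):

import org.apache.spark.sql.SparkSession

object SessionAppId {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("app-id-example")  // illustrative name
      .getOrCreate()

    // The application ID still lives on the underlying SparkContext
    println(s"Application ID: ${spark.sparkContext.applicationId}")

    spark.stop()
  }
}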

It depends on which language you are using.

Scala

https://spark.apache.org/docs/1.6.1/api/scala/index.html#org.apache.spark.SparkContext

sc.applicationId

Java

https://spark.apache.org/docs/1.6.2/api/java/org/apache/spark/api/java/JavaSparkContext.html

sparkContext.sc().applicationId();

Python

http://spark.apache.org/docs/1.6.2/api/python/pyspark.html#pyspark.SparkContext

sc.applicationId

It can also depend on the Spark version.
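
For example, in a spark-shell where sc is already defined, you can print the Spark version and the application ID together to confirm which API you are on (a minimal sketch):

println(s"Spark ${sc.version}: application ID is ${sc.applicationId}")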

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow