What is the precedence in class loading when both the uber jar of my Spark application and the contents of the --jars option of my spark-submit shell command contain similar dependencies?
I ask this from a third-party library integration standpoint. If I set --jars to use a third-party library at version 2.0 and the uber jar passed to this spark-submit script was assembled using version 2.1, which class is loaded at runtime?
At present, I am thinking of keeping my dependencies on HDFS and adding them to the --jars option of spark-submit, while asking users (via end-user documentation) to set the scope of this third-party library to 'provided' in their Spark application's Maven pom file.
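For illustration, the submit command I have in mind would look roughly like this (the HDFS path, class name, and jar names below are made up):

spark-submit \
  --class com.example.MyApp \
  --master yarn \
  --jars hdfs:///libs/thirdparty-lib-2.0.jar \
  my-uber-app.jar

with the uber jar (my-uber-app.jar) built with the third-party library marked as 'provided' so it is not bundled at all.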
You can also add jars using the spark-submit option --jars; with this option you can add a single JAR or multiple JARs, comma-separated.
One is set through spark-submit and the other via code; choose whichever suits you better. One important thing to note is that using either of these options does not by itself add the JAR file to your driver/executor classpath; you'll need to explicitly add them using the extraClassPath configuration on both.
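As a minimal sketch, assuming a hypothetical jar at /path/to/file1.jar, the extraClassPath settings can be passed alongside --jars like this:

spark-submit \
  --jars /path/to/file1.jar \
  --conf spark.driver.extraClassPath=/path/to/file1.jar \
  --conf spark.executor.extraClassPath=file1.jar \
  ...

(On YARN, jars distributed via --jars are copied into each executor's working directory, which is why the executor entry here is just the file name; adjust the paths for your deployment.)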
Use the --jars option. To add JARs to a Spark job, the --jars option can be used to include JARs on the Spark driver and executor classpaths. If multiple JAR files need to be included, separate them with commas. The following is an example: spark-submit --jars /path/to/jar/file1,/path/to/jar/file2 ...
Spark JAR files let you package a project into a single file so it can be run on a Spark cluster. A lot of developers write Spark code in browser-based notebooks because they're unfamiliar with JAR files.
This is somewhat controlled with the params spark.driver.userClassPathFirst and spark.executor.userClassPathFirst. If these are set to true (the default is false), then, from the docs:
(Experimental) Whether to give user-added jars precedence over Spark's own jars when loading classes in the driver. This feature can be used to mitigate conflicts between Spark's dependencies and user dependencies. It is currently an experimental feature. This is used in cluster mode only.
I wrote some of the code that controls this, and there were a few bugs in the early releases, but if you're using a recent Spark release it should work (although it is still an experimental feature).
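If you want to turn this on from spark-submit, a minimal sketch (jar names are hypothetical) would be:

spark-submit \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --jars hdfs:///libs/thirdparty-lib-2.0.jar \
  my-uber-app.jar

With both set to true, classes from user-added jars are tried before Spark's own jars; with the defaults (false), Spark's jars take precedence for any overlapping classes.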