I'm trying to run a small Spark application and am getting the following exception:
Exception in thread "main" java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:262)
at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
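For context, the failure fires as soon as partitions are computed for an RDD read through the new Hadoop input API. A minimal sketch of the kind of application that hits this code path (the object name and input path are hypothetical):

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object Repro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("Repro").setMaster("local[*]"))
    // Computing partitions calls CombineFileInputFormat.getSplits, which in
    // turn calls FileInputFormat.listStatus -- the frame that constructs the
    // Guava Stopwatch and throws the IllegalAccessError.
    val rdd = sc.newAPIHadoopFile[LongWritable, Text, CombineTextInputFormat]("input.txt")
    println(rdd.count())
    sc.stop()
  }
}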
The relevant Gradle dependencies section:
compile('org.apache.spark:spark-core_2.10:1.3.1')
compile('org.apache.hadoop:hadoop-mapreduce-client-core:2.6.2') {force = true}
compile('org.apache.hadoop:hadoop-mapreduce-client-app:2.6.2') {force = true}
compile('org.apache.hadoop:hadoop-mapreduce-client-shuffle:2.6.2') {force = true}
compile('com.google.guava:guava:19.0') { force = true }
Version 2.6.2 of hadoop-mapreduce-client-core can't be used together with Guava's newer versions (I tried 17.0 - 19.0), since Guava's Stopwatch constructor can't be accessed, causing the IllegalAccessError above. (Guava deprecated the public Stopwatch constructors in 15.0 and made them package-private in 17.0, while Hadoop 2.6.x's FileInputFormat still calls new Stopwatch() directly.)

Using hadoop-mapreduce-client-core's latest version, 2.7.2 (in which that method no longer uses Guava's Stopwatch but org.apache.hadoop.util.StopWatch instead), solved the problem, together with two additional dependencies that were required:
compile('org.apache.hadoop:hadoop-mapreduce-client-core:2.7.2') {force = true}
compile('org.apache.hadoop:hadoop-common:2.7.2') {force = true} // required for org.apache.hadoop.util.StopWatch
compile('commons-io:commons-io:2.4') {force = true} // required for org.apache.commons.io.Charsets that is used internally
Note: there are two Maven artifacts providing an org.apache.commons.io package: commons-io:commons-io (the one we want here) and org.apache.commons:commons-io (an old one, from 2007). Make sure to include the correct one.
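To confirm which jar a class is actually loaded from after forcing versions, one quick check (a sketch; run it from the project's Scala REPL or a scratch test) is to ask the classloader:

import org.apache.hadoop.mapreduce.lib.input.FileInputFormat

// Should print a path ending in hadoop-mapreduce-client-core-2.7.2.jar.
println(classOf[FileInputFormat[_, _]].getProtectionDomain.getCodeSource.getLocation)
// Throws ClassNotFoundException if hadoop-common 2.7.2 (which provides the
// replacement StopWatch) is missing from the classpath.
println(Class.forName("org.apache.hadoop.util.StopWatch"))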
I just changed my Guava version from 19.0 to 15.0 and it worked. I am currently using Spark 2.2:
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>15.0</version>
</dependency>
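If a transitive dependency elsewhere in the build still wins over this pin, the standard Maven mechanism is a dependencyManagement entry (a sketch, not something this answer strictly requires):

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>15.0</version>
        </dependency>
    </dependencies>
</dependencyManagement>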
We just experienced the same situation using IntelliJ and Spark.
When using
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1"
com.google.guava 20.0 and hadoop-client 2.6.5 are downloaded as transitive dependencies.
The quickest solution is to force the Guava library to version 15.0 (SBT):
dependencyOverrides += "com.google.guava" % "guava" % "15.0"
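Put together, a minimal build.sbt for this setup might look like the sketch below (the dependencyOverrides line is the only addition this answer requires; Spark 2.3.1 builds against Scala 2.11). sbt's built-in evicted task will then report which dependency versions were evicted during resolution:

scalaVersion := "2.11.12"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1"

// Pin Guava below 17.0 so that hadoop-client 2.6.5's new Stopwatch() call
// in FileInputFormat remains accessible at runtime.
dependencyOverrides += "com.google.guava" % "guava" % "15.0"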
I had this problem with Spark 1.6.1 because one of our additional dependencies evicted Guava 14.0.1 and replaced it with 18.0. Spark 1.6.1's base dependency on hadoop-client is version 2.2; see the [Maven Repo](https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.6.1).
The solution that worked for me was to add the following to the sbt libraryDependencies:

"org.apache.hadoop" % "hadoop-client" % "2.7.2"
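In context, that line sits next to the Spark dependency (a sketch; versions as in this answer):

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1",
  // Bumps the transitive hadoop-client 2.2 declared by spark-core, so the
  // FileInputFormat on the classpath no longer calls Guava's Stopwatch.
  "org.apache.hadoop" % "hadoop-client" % "2.7.2"
)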