IllegalAccessError to guava's StopWatch from org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus

I'm trying to run a small Spark application and am getting the following exception:

Exception in thread "main" java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:262)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)

The relevant Gradle dependencies section:

compile('org.apache.spark:spark-core_2.10:1.3.1')
compile('org.apache.hadoop:hadoop-mapreduce-client-core:2.6.2') {force = true}
compile('org.apache.hadoop:hadoop-mapreduce-client-app:2.6.2') {force = true}
compile('org.apache.hadoop:hadoop-mapreduce-client-shuffle:2.6.2') {force = true}
compile('com.google.guava:guava:19.0') { force = true }
Asked by Lika on Apr 05 '16

4 Answers

Version 2.6.2 of hadoop-mapreduce-client-core can't be used together with newer Guava versions (I tried 17.0 - 19.0), because Guava's Stopwatch constructor can't be accessed, which causes the IllegalAccessError above.
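
For context, here is a minimal Java sketch (not part of the original answer; the class name is illustrative) of the API difference behind the error: Hadoop 2.6.x's FileInputFormat was compiled against an old Guava in which the Stopwatch constructor was public, while Guava 17.0 and later make it package-private in favour of factory methods, so the old bytecode fails at runtime.

import com.google.common.base.Stopwatch;

public class StopwatchApiSketch {
    public static void main(String[] args) throws InterruptedException {
        // Hadoop 2.6.x was compiled against an old Guava where the equivalent of
        //     Stopwatch sw = new Stopwatch().start();
        // was legal; from Guava 17.0 the constructor is package-private, hence the
        // IllegalAccessError when a newer Guava jar is on the classpath.

        // Code written against Guava 15.0+ uses the factory methods instead:
        Stopwatch sw = Stopwatch.createStarted();
        Thread.sleep(50);
        System.out.println("elapsed: " + sw); // prints something like "50.00 ms"
    }
}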

Using hadoop-mapreduce-client-core's latest version, 2.7.2 (in which the method above no longer uses Guava's Stopwatch but org.apache.hadoop.util.StopWatch instead), solved the problem, together with two additional dependencies that were required:

compile('org.apache.hadoop:hadoop-mapreduce-client-core:2.7.2') {force = true}
compile('org.apache.hadoop:hadoop-common:2.7.2') {force = true} // required for org.apache.hadoop.util.StopWatch
compile('commons-io:commons-io:2.4') {force = true} // required for org.apache.commons.io.Charsets that is used internally

Note: there are two artifacts that provide an org.apache.commons.io package: commons-io:commons-io (the one needed here) and org.apache.commons:commons-io (an old one, from 2007). Make sure to include the correct one.
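
As an aside, if you would rather not repeat {force = true} on every dependency, the same pinning can be expressed once with Gradle's resolutionStrategy. This is a sketch of an equivalent configuration, not something tested by the original poster:

configurations.all {
    resolutionStrategy {
        // pin the Hadoop and commons-io versions regardless of what transitive dependencies request
        force 'org.apache.hadoop:hadoop-mapreduce-client-core:2.7.2'
        force 'org.apache.hadoop:hadoop-common:2.7.2'
        force 'commons-io:commons-io:2.4'
    }
}

Either way, running gradle dependencies lets you confirm which versions actually end up on the compile classpath.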

Answered by Lika


I just changed my Guava version from 19.0 to 15.0 and it worked. I am currently using Spark 2.2.

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>15.0</version>
</dependency>
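
If downgrading does not seem to take effect, it is worth checking which Guava version Maven actually resolves. A quick check (a sketch, assuming the standard maven-dependency-plugin) is:

mvn dependency:tree -Dincludes=com.google.guava:guava

Anything other than a single 15.0 entry means another dependency is still pulling in a newer Guava.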
Answered by pranaygoyal02


We just experienced the same situation using IntelliJ and Spark.

When using

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1"

com.google.guava 20.0 and hadoop-client 2.6.5 are downloaded.

The quickest solution is to force the Guava library to version 15.0 (in SBT):

dependencyOverrides += "com.google.guava" % "guava" % "15.0"
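
For completeness, a minimal build.sbt sketch under these assumptions (the project name is illustrative, and Spark 2.3.x artifacts are published for Scala 2.11):

name := "spark-guava-override"   // illustrative project name
scalaVersion := "2.11.12"        // Spark 2.3.x is published for Scala 2.11

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-sql"  % "2.3.1"

// pin Guava below 17.0 so Hadoop 2.6.x's Stopwatch constructor call stays legal
dependencyOverrides += "com.google.guava" % "guava" % "15.0"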
Answered by Carlos David Peña


I had this problem with Spark 1.6.1 because one of our additional dependencies evicted Guava 14.0.1 and replaced it with 18.0. Spark's base dependency is hadoop-client 2.2; see the Maven repo: https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.6.1

The solution that worked for us is to add the following to the SBT libraryDependencies: "org.apache.hadoop" % "hadoop-client" % "2.7.2"
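
To see which of your dependencies is doing the evicting in your own build, sbt's built-in evicted task is a quick check (run from the project root):

sbt evicted

It reports dependency evictions during resolution, which should surface the Guava replacement.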

Answered by ekrich