Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Running spark job using Yarn giving error:com.google.common.util.concurrent.Futures.withFallback

I am trying run spark job using yarn,but getting below error

java.lang.NoSuchMethodError: com.google.common.util.concurrent.Futures.withFallback(Lcom/google/common/util/concurrent/ListenableFuture;Lcom/google/common/util/concurrent/FutureFallback;Ljava/util/concurrent/Executor;)Lcom/google/common/util/concurrent/ListenableFuture;
at com.datastax.driver.core.Connection.initAsync(Connection.java:176)
at com.datastax.driver.core.Connection$Factory.open(Connection.java:721)
at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:248)
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:194)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:82)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1307)
at com.datastax.driver.core.Cluster.init(Cluster.java:159)
at com.datastax.driver.core.Cluster.connect(Cluster.java:249)
at com.figmd.processor.ProblemDataloader$ParseJson.call(ProblemDataloader.java:46)
at com.figmd.processor.ProblemDataloader$ParseJson.call(ProblemDataloader.java:34)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:140)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:140)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:618)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:618)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

cluster Details: Spark 1.2.1,hadoop 2.7.1 I have provided class path using spark.driver.extraClassPath. hadoop user has access to that class path as well.But I think yarn is not getting the JAR's on that classpath. I am not able to reach root cause of it.Any help will be appreciated.

Thanks.

like image 817
Abhinandan Satpute Avatar asked May 29 '26 18:05

Abhinandan Satpute


1 Answers

I faced the same problem, and the solution was shade guava to avoid classpath collision.

If you're using sbt assembly to build your jar, you can just add this to your build.sbt:

assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.**" -> "shadeio.@1").inAll
)

I wrote a blog post which describes my process to arrive to this solution: Making Hadoop 2.6 + Spark-Cassandra Driver Play Nice Together.

Hope it helps!

like image 170
arjones Avatar answered Jun 01 '26 18:06

arjones



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!