saveToCassandra with spark-cassandra connector throws java.lang.ClassCastException

When trying to save data to Cassandra (in Scala), I get the following exception:

java.lang.ClassCastException: com.datastax.driver.core.DefaultResultSetFuture cannot be cast to com.google.common.util.concurrent.ListenableFuture

Please note that I do not get this error every time; it comes up randomly once in a while, which makes it more dangerous in production.

I am using YARN and I have shaded com.google.** to avoid the Guava symbol clash.

Here's the code snippet:

import com.datastax.spark.connector._  // brings saveToCassandra into scope

rdd.saveToCassandra(keyspace, "movie_attributes", SomeColumns("movie_id", "movie_title", "genre"))

Any help would be much appreciated.

UPDATE: Adding details from the pom file as requested:

<dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.10</artifactId>
    <version>1.5.0</version>
</dependency>
<dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector-java_2.10</artifactId>
    <version>1.5.0</version>
</dependency>

**Shading Guava**

<relocation> <!-- Conflicts between Cassandra Java driver and YARN -->
    <pattern>com.google</pattern>
    <shadedPattern>oryx.com.google</shadedPattern>
    <includes>
        <include>com.google.common.**</include>
    </includes>
</relocation>

Spark version: 1.5.2
Cassandra version: 2.2.3

asked May 18 '16 by neeraj baji

1 Answer

Almost everyone who works with C* and Spark has seen this type of error. The root cause is explained here.

The C* driver depends on a relatively new version of Guava, while Spark depends on an older one. With connector versions before 1.6.2, you have to explicitly embed the C* driver and Guava with your application.
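As a minimal sketch of that pre-1.6.2 approach (the driver and Guava versions below are assumptions for illustration, not taken from the question), you would pin both explicitly in the pom next to the connector, then relocate Guava just as the question already does:

<!-- Sketch only: pin one known driver/Guava pair so the fat jar
     embeds a single version of each. Version numbers are assumed. -->
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>3.0.0</version>
</dependency>
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>16.0.1</version>
</dependency>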

Since 1.6.2 and 2.0.0-M3, the connector ships with the correct C* driver and a shaded Guava by default, so you should be fine with just the connector artifact included in your project.
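For example, replacing the 1.5.0 dependencies from the question with the shaded 1.6.2 connector (same coordinates, bumped version; with 1.6.x the main artifact alone should suffice):

<dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.10</artifactId>
    <version>1.6.2</version>
</dependency>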

Things get tricky if your Spark application uses other libraries that depend on the C* driver. Then you will have to manually include the un-shaded version of the connector, the correct C* driver, and a shaded Guava, and deploy a fat jar; you essentially build your own connector package. In that case, you can no longer use --packages to launch your Spark application.
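A rough sketch of that fat-jar route with maven-shade-plugin (the plugin version and relocation prefix here are illustrative; adapt them to your build):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.4.3</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <relocations>
                    <!-- Relocate Guava so the driver's copy cannot collide
                         with the one Spark/YARN puts on the classpath -->
                    <relocation>
                        <pattern>com.google.common</pattern>
                        <shadedPattern>shaded.com.google.common</shadedPattern>
                    </relocation>
                </relocations>
            </configuration>
        </execution>
    </executions>
</plugin>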

tl;dr

Use connector 1.6.2/2.0.0-M3 or above; 99% of the time you should be OK.

answered Oct 14 '22 by treehouse