
java.lang.UnsupportedOperationException: 'Writing to a non-empty Cassandra Table is not allowed'

I have a scenario where I receive streaming data that is processed by my Spark Streaming program, and the output for each interval is appended to my existing Cassandra table.

Currently my Spark Streaming program generates a DataFrame which I need to save to my Cassandra table. The problem I'm facing is that I'm not able to append rows to my existing Cassandra table when I use the command below:

dff.write.format("org.apache.spark.sql.cassandra").options(Map("table" -> "xxx", "yyy" -> "retail")).save()

I read in the following link http://rustyrazorblade.com/2015/08/migrating-from-mysql-to-cassandra-using-spark/ that the author passed mode="append" into the save method, but when I do the same it throws a syntax error.

I also wasn't able to understand what I need to fix from the link below: https://groups.google.com/a/lists.datastax.com/forum/#!topic/spark-connector-user/rlGGWQF2wnM

I need help fixing this issue. I'm writing my Spark Streaming jobs in Scala.

bigdata123 asked Feb 11 '16 06:02


1 Answer

I think you have to do it the following way:

import org.apache.spark.sql.SaveMode

dff.write.format("org.apache.spark.sql.cassandra").mode(SaveMode.Append).options(Map("table" -> "xxx", "yyy" -> "retail")).save()

The way Cassandra handles data means that every write is effectively an 'upsert': remember that an insert will overwrite any stored row whose primary key is the same as the primary key of the inserted record. Cassandra is a 'write-fast' database, so it does not check whether the data already exists before writing.
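To make the upsert behaviour concrete, here is a minimal in-memory sketch in plain Scala. It does not use Spark or the Cassandra connector. The `Sale` case class, the `UpsertSketch` object and the choice of `id` as the primary key are all hypothetical names picked for illustration, and the model is simplified to replacing whole rows.

```scala
// Toy model of a table: rows stored in a Map keyed by primary key.
// This is an in-memory simulation only, not the connector API.
final case class Sale(id: Int, store: String, amount: Double)

object UpsertSketch {
  // Appending a batch behaves like an upsert: a row whose key already
  // exists replaces the stored row, and rows with new keys are added.
  def append(table: Map[Int, Sale], batch: Seq[Sale]): Map[Int, Sale] =
    batch.foldLeft(table)((acc, row) => acc.updated(row.id, row))

  def main(args: Array[String]): Unit = {
    val existing = Map(1 -> Sale(1, "retail", 10.0))
    val batch = Seq(Sale(1, "retail", 25.0), Sale(2, "retail", 7.5))
    val result = append(existing, batch)

    println(result.size)      // prints 2: key 1 was overwritten, key 2 was added
    println(result(1).amount) // prints 25.0: the newer row for key 1 won
  }
}
```

In other words, appending a streaming batch whose rows reuse existing primary keys will not produce duplicates or an error. The newer values simply replace the stored ones, so the primary key should be chosen so that rows you want to keep separate get distinct keys.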

TheMP answered Oct 16 '22 06:10