What is the most efficient way to insert multiple rows into a Cassandra column family? Is it possible to do this in a single call?
Right now my approach is to add insertions for multiple columns and then execute, so a single call persists one row. I am looking for a strategy that lets me do a batch insert.
In Cassandra, BATCH is used to execute multiple modification statements (INSERT, UPDATE, DELETE) together in a single request. It is very useful when you have to update some columns as well as delete some existing ones.
Cassandra allows 2 billion columns per row.
CQL contains a BEGIN BATCH ... APPLY BATCH statement that allows you to group multiple inserts, so that a developer can create and execute a series of requests (see http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0).
The following worked for me (Java):
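// Prepare one batch of three INSERTs. Each fragment ends with a space so the
// concatenated CQL does not run together (for example "BEGIN BATCHINSERT").
PreparedStatement ps = session.prepare(
"BEGIN BATCH " +
"INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?); " +
"INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?); " +
"INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?); " +
"APPLY BATCH");
// Bind all twelve values in order, four per INSERT.
session.execute(ps.bind(uid, mid1, title1, body1, uid, mid2, title2, body2, uid, mid3, title3, body3));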
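The batch text above is fixed at three rows. As a rough sketch, the same string could be generated for any number of rows with a small helper. `BatchCql` and `build` are hypothetical names, not part of the Cassandra driver, and the result would still have to be passed to session.prepare and bound with four values per row.

```java
// Hypothetical helper (not a driver API): builds the CQL text of a batch.
final class BatchCql {
    // Wraps `rows` copies of one INSERT in BEGIN BATCH ... APPLY BATCH,
    // putting a newline between fragments so they never run together.
    static String build(String insert, int rows) {
        if (rows < 1) {
            throw new IllegalArgumentException("rows must be >= 1");
        }
        StringBuilder cql = new StringBuilder("BEGIN BATCH\n");
        for (int i = 0; i < rows; i++) {
            cql.append("  ").append(insert).append(";\n");
        }
        return cql.append("APPLY BATCH").toString();
    }

    public static void main(String[] args) {
        String insert =
            "INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?)";
        // Prints the batch text for three rows.
        System.out.println(build(insert, 3));
    }
}
```

Generating the text this way keeps the whitespace between statements in one place, which is where hand-written concatenation tends to go wrong.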
If you don't know in advance which statements you want to execute, you can use the following syntax (Scala):
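// Prepare the INSERT once; each bind() call on the PreparedStatement returns a new BoundStatement.
val statement: PreparedStatement = session.prepare("INSERT INTO people (name,age) VALUES (?,?)")
val batchStmt = new BatchStatement()
// Re-binding one shared BoundStatement would add the same object twice, so both entries would carry the last values.
batchStmt.add(statement.bind("User A", "10"))
batchStmt.add(statement.bind("User B", "12"))
session.execute(batchStmt)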
Note: BatchStatement can only hold up to 65535 statements. I learned that the hard way. :-)
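One way to stay under a per-batch limit is to split the work into chunks and send one BatchStatement per chunk. The sketch below shows only the splitting step, in plain Java with no driver dependency. `BatchChunks`, `split`, and the chunk size of 2 in the demo are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a driver API): splits a list into fixed-size chunks.
final class BatchChunks {
    // Returns consecutive sublists of at most maxPerBatch elements each,
    // preserving the original order. The last chunk may be shorter.
    static <T> List<List<T>> split(List<T> items, int maxPerBatch) {
        if (maxPerBatch < 1) {
            throw new IllegalArgumentException("maxPerBatch must be >= 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c", "d", "e");
        // Each chunk would be added to its own batch and executed separately.
        for (List<String> chunk : split(rows, 2)) {
            System.out.println(chunk);
        }
    }
}
```

In real code each chunk would hold bound statements rather than strings, and the loop body would add them to a fresh BatchStatement before calling session.execute.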