Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

RPC timeout error while exporting data from CQL

Tags:

cassandra

cql3

I am trying to export data from cassandra using CQL client. A column family has about 100000 rows in it. when i am copying dta into csv file using COPY TO command i get following rpc_time out error.

copy mycolfamily to '/root/mycolfamily.csv'
Request did not complete within rpc_timeout.

I am running in:

[cqlsh 3.1.6 | Cassandra 1.2.8 | CQL spec 3.0.0 | Thrift protocol 19.36.0]

How can I increase RPC timeout limit?

I tried adding rpc_timeout_in_ms: 20000 (defalut is 10000) in my conf/cassandra.yaml file. but while restarting cassandra I get:

[root@user ~]# null; Can't construct a java object for tag:yaml.org,2002:org.apache.cassandra.config.Config; exception=Cannot create property=rpc_timeout_in_ms for JavaBean=org.apache.cassandra.config.Config@71bfc4fc; Unable to find property 'rpc_timeout_in_ms' on class: org.apache.cassandra.config.Config
Invalid yaml; unable to start server.  See log for stacktrace.
like image 268
devThoughts Avatar asked Sep 18 '13 12:09

devThoughts


3 Answers

The COPY command currently does the same thing with SELECT with LIMIT 99999999. So, it will eventually goes to timeout while your data is growing. Here's the export function;

https://github.com/apache/cassandra/blob/trunk/bin/cqlsh#L1524

I'm doing the same export on production. What I'm doing is the following;

  • make select * from table where timeuuid = someTimeuuid limit 10000
  • write the result set to a csv file w/ >> mode
  • make the next selects with respect to the last timeuuid

You can pipe command in cqlsh by the following cqlsh command

echo "{$cql}" | /usr/bin/cqlsh -u user -p password localhost 9160 > file.csv

like image 172
aacanakin Avatar answered Nov 03 '22 00:11

aacanakin


You can use Auto pagination by specifying fetch size in Datastax Java driver.

Statement stmt = new SimpleStatement("SELECT id FROM mycolfamily;"); 
stmt.setFetchSize(500); 
session.execute(stmt); 
for (Row r:result.all()){
    //write to file
}
like image 2
akshat thakar Avatar answered Nov 03 '22 01:11

akshat thakar


I have encountered the same problem a few minutes ago then I have found CAPTURE and it worked:

First start capturing on cqlsh and then run your query with some limiting of your choice.

http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/capture_r.html

like image 1
safato Avatar answered Nov 03 '22 00:11

safato