
Cassandra InvalidRequestException(why:[MyKeyspace][MyColumnFamily][6675...6c74] = [6c86......e65720] failed validation (String didn't validate.))

I am using Cassandra with Hadoop for input and output. During the output reduce job, I got an error:

2011-08-10 03:54:04,326 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: InvalidRequestException(why:[MyKeyspace][MyColumnFamily][66756c6c74657874] = [6c696e6bb66e68656974207a756d.................65697465726520536f6e67746578746520] failed validation (String didn't validate.))
at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19045)
at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035)
at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009)
at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter$RangeClient.run(ColumnFamilyRecordWriter.java:285)
2011-08-10 03:54:04,339 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2011-08-10 03:54:04,340 WARN org.apache.hadoop.io.UTF8: truncating long string: 267364 chars, starting with java.io.IOException:

According to the log, this does not happen at the beginning, but only after 8981 keys have been merged, sorted, and processed successfully. It fails on the 8982nd key.

I have searched Google and Stack Overflow but found nothing.

Column family is like this:

create column family MyColumnFamily with comparator = UTF8Type and
key_validation_class=UTF8Type and
column_metadata = 
[
{column_name: column1, validation_class: UTF8Type, index_type: 0},
{column_name: column2, validation_class: UTF8Type, index_type: 0},
{column_name: column3, validation_class: UTF8Type, index_type: 0}
];

Thank you in advance!

Anton asked Aug 10 '11 09:08


1 Answer

That means one of your column values was not actually a valid UTF8-encoded string. The first hex string in the message is the column name in bytes, and the second is the bytes that couldn't be decoded.
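
To see which byte is the problem, the hex strings from the error can be decoded by hand. Below is a minimal Python sketch that uses only the truncated prefix of the value shown in the log; the helper name `first_invalid_utf8` is made up for illustration and is not part of Cassandra or Hadoop.

```python
from typing import Optional, Tuple


def first_invalid_utf8(hex_bytes: str) -> Optional[Tuple[int, int]]:
    """Return (offset, byte) of the first byte that breaks UTF-8 decoding,
    or None if the whole byte string is valid UTF-8.

    Hypothetical helper for inspecting the hex dumps in the error message.
    """
    raw = bytes.fromhex(hex_bytes)
    try:
        raw.decode("utf-8")  # strict decoding raises on invalid input
        return None
    except UnicodeDecodeError as err:
        return err.start, raw[err.start]


# First hex string from the error: the column name.
column_name = bytes.fromhex("66756c6c74657874").decode("utf-8")

# Second hex string: only the prefix before the elided part of the log.
value_prefix = "6c696e6bb66e68656974207a756d"

print(column_name)
print(first_invalid_utf8(value_prefix))
```

With these inputs the column name decodes to `fulltext`, and the value prefix fails at offset 4 on byte 0xb6, which cannot start a UTF-8 sequence. That pattern often points to text that was encoded in something other than UTF-8 before being written, although the log alone does not confirm which encoding was used.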

jbellis answered Oct 04 '22 19:10