
Converting Cassandra blob type to string

I have an old column family which has a column named "value" which was defined as a blob data type. This column usually holds two numbers separated with an underscore, like "421_2".

When I use the Python DataStax driver and execute the query, the results come back with that field parsed as a string:

In [21]: session.execute(q)
Out[21]: 
[Row(column1=4776015, value='145_0'),
 Row(column1=4891778, value='114_0'),
 Row(column1=4891780, value='195_0'),
 Row(column1=4893662, value='105_0'),
 Row(column1=4893664, value='115_0'),
 Row(column1=4898493, value='168_0'),
 Row(column1=4945162, value='148_0'),
 Row(column1=4945163, value='131_0'),
 Row(column1=4945168, value='125_0'),
 Row(column1=4945169, value='211_0'),
 Row(column1=4998426, value='463_0')]

When I use the Java driver I get a com.datastax.driver.core.Row object back. When I try to read the value field with, for example, row.getString("value"), I get the expected InvalidTypeException: Column value is of type blob. It seems the only way to read the field is via row.getBytes("value"), which returns a java.nio.HeapByteBuffer object.

The problem is, I can't seem to convert this object to a string in an easy fashion. Googling yielded two answers from 2012 that suggest the following:

String string_value = new String(result.getBytes("value"), "UTF-8");

But such a String constructor doesn't seem to exist anymore. So, my questions are:

  1. How do I convert a HeapByteBuffer into a string?
  2. How come the Python driver converted the blob easily and the Java one did not?

Side note: I could debug the Python driver, but currently that seems like too much work for something that should be trivial (and the fact that no one has asked about it suggests I'm missing something simple here).

idoda asked Aug 04 '15 17:08



3 Answers

Another, easier way is to change the CQL statement:

select column1, blobastext(value) from YourTable where key = xxx

The second column will then be of type String.

popcorny answered Nov 19 '22 21:11



You can also get direct access to the Java driver's serializers. This way you don't have to deal with low-level details, and it also works for other types.

Driver 2.0.x:

String s = (String)DataType.text().deserialize(byteBuffer);

Driver 2.1.x:

ProtocolVersion protocolVersion = cluster.getConfiguration().getProtocolOptions().getProtocolVersion();
String s = (String)DataType.text().deserialize(byteBuffer, protocolVersion);

Driver 2.2.x:

ProtocolVersion protocolVersion = cluster.getConfiguration().getProtocolOptions().getProtocolVersion();
String s = TypeCodec.VarcharCodec.instance.deserialize(byteBuffer, protocolVersion);
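For comparison, the low-level route that the serializers hide can be sketched with the JDK alone. This is only an illustration, assuming the blob holds UTF-8 text (true for ASCII values such as "421_2"). The class and method names here (BlobToString, blobToString) are made up for the example and are not part of the driver, and the buffer is built by hand as a stand-in for what row.getBytes("value") returns.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BlobToString {

    // Decodes the readable bytes of a buffer as UTF-8 text.
    // Assumption: the blob was originally written as UTF-8.
    static String blobToString(ByteBuffer blob) {
        // duplicate() shares the bytes but has its own position and limit,
        // so decoding does not move the caller's buffer.
        return StandardCharsets.UTF_8.decode(blob.duplicate()).toString();
    }

    public static void main(String[] args) {
        // Hand-built stand-in for row.getBytes("value"); no Cassandra connection needed.
        ByteBuffer blob = ByteBuffer.wrap("421_2".getBytes(StandardCharsets.UTF_8));
        System.out.println(blobToString(blob));
    }
}
```

Decoding the buffer directly, rather than calling array() on it, avoids picking up extra bytes when the buffer is a view into a larger backing array.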
Olivier Michallat answered Nov 19 '22 20:11



For version 3.1.4 of the DataStax Java driver, the following will convert a blob to a string:

ProtocolVersion proto = cluster.getConfiguration().getProtocolOptions().getProtocolVersion();

String deserialize = TypeCodec.varchar().deserialize(row.getBytes(i), proto);
Jenna answered Nov 19 '22 20:11
