How to create complex structure in Cassandra with CQL3

I have a problem representing a complex data structure in Cassandra. A JSON example of the data:

{
  "A": {
    "A_ID": "1111",
    "field1": "value1",
    "field2": "value2",
    "field3": [
      {
        "id": "id1",
        "name": "name1",
        "segment": [
          {
            "segment_id": "segment_id_1",
            "segment_name": "segment_name_1",
            "segment_value": "segment_value_1"
          },
          {
            "segment_id": "segment_id_2",
            "segment_name": "segment_name_2",
            "segment_value": "segment_value_2"
          },
          ...
        ]
      },
      {
        "id": "id2",
        "name": "name2",
        "segment": [
          {
            "segment_id": "segment_id_3",
            "segment_name": "segment_name_3",
            "segment_value": "segment_value_3"
          },
          {
            "segment_id": "segment_id_4",
            "segment_name": "segment_name_4",
            "segment_value": "segment_value_4"
          },
          ...
        ]
      },
      ...
    ]
  }
}

Only one query will be used: find by A_ID.

I think this data should be stored in one table (column family), without serialization/deserialization steps, for better efficiency. How can I do this if CQL does not support nested maps and lists?

Sergey Mikitko asked Nov 12 '13 10:11


2 Answers

Cassandra 2.1 adds support for nested structures: https://issues.apache.org/jira/browse/CASSANDRA-5590

The downside to "just store it as a JSON/Protobuf/Avro/etc. blob" is that you have to read and rewrite the entire blob to update any field. So at the very least you should pull your top-level fields into Cassandra columns, leveraging collections as appropriate.
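To make the "pull fields out of the blob" idea concrete, here is a minimal Python sketch that splits the question's document into top-level columns plus one flat row per segment. The function name `split_document` and the row keys (`a_id`, `item_id`, `item_name`) are hypothetical, chosen only for illustration. The answer does not prescribe this exact layout, and the sketch never talks to Cassandra.

```python
def split_document(doc):
    """Split the nested document into top-level columns and flat segment rows."""
    a = doc["A"]
    # Top-level scalar fields become ordinary columns keyed by A_ID.
    columns = {"a_id": a["A_ID"], "field1": a["field1"], "field2": a["field2"]}
    # Each nested segment becomes one flat row; the triple
    # (a_id, item_id, segment_id) identifies it (hypothetical key names).
    rows = []
    for item in a["field3"]:
        for seg in item["segment"]:
            rows.append({
                "a_id": a["A_ID"],
                "item_id": item["id"],
                "item_name": item["name"],
                "segment_id": seg["segment_id"],
                "segment_name": seg["segment_name"],
                "segment_value": seg["segment_value"],
            })
    return columns, rows


# A shortened version of the document from the question.
example = {
    "A": {
        "A_ID": "1111",
        "field1": "value1",
        "field2": "value2",
        "field3": [
            {"id": "id1", "name": "name1", "segment": [
                {"segment_id": "segment_id_1", "segment_name": "segment_name_1",
                 "segment_value": "segment_value_1"},
                {"segment_id": "segment_id_2", "segment_name": "segment_name_2",
                 "segment_value": "segment_value_2"},
            ]},
            {"id": "id2", "name": "name2", "segment": [
                {"segment_id": "segment_id_3", "segment_name": "segment_name_3",
                 "segment_value": "segment_value_3"},
            ]},
        ],
    }
}

columns, rows = split_document(example)
```

With a layout like this, changing one segment value touches a single row instead of rewriting the whole document, and every row still carries the A_ID needed for the "find by A_ID" query.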

jbellis answered Sep 22 '22 12:09


Since you will be using it purely as a key/value store, you could store it as JSON or, to save space, as something like BSON or even Protobuf.

Personally, I would store it as a Protobuf record, since Protobuf does not store the field names, which repeat many times in your case.
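The cost of repeated field names can be illustrated with a rough Python sketch. This is not Protobuf (which writes small numeric field tags rather than names). It only compares JSON with names against a positional JSON encoding whose field order is fixed by an agreed schema. `FIELDS` and `decode` are hypothetical names used for the illustration.

```python
import json

# Two segments from the question's document.
segments = [
    {"segment_id": "segment_id_1", "segment_name": "segment_name_1",
     "segment_value": "segment_value_1"},
    {"segment_id": "segment_id_2", "segment_name": "segment_name_2",
     "segment_value": "segment_value_2"},
]

# Encoding with field names: every record repeats all three names.
named = json.dumps(segments, separators=(",", ":")).encode("utf-8")

# Positional encoding: the field order lives in the schema, not in the data.
FIELDS = ("segment_id", "segment_name", "segment_value")
positional = json.dumps(
    [[s[f] for f in FIELDS] for s in segments], separators=(",", ":")
).encode("utf-8")


def decode(blob):
    """Rebuild the named records from the positional encoding."""
    return [dict(zip(FIELDS, values)) for values in json.loads(blob)]


print(len(named), len(positional))
```

The positional form is smaller because the names are written zero times instead of once per record, and the saving grows with the number of segments. The trade-off is that reader and writer must agree on the schema.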

abatyuk answered Sep 19 '22 12:09