I want to use Cassandra in a .NET application. My objective is to store some data in a column family, but each row of data will have a different schema.
A very simple example: I want a 'Toys' column family to store the following objects (notice how their properties differ completely, apart from the id property):
Toy object 1 { "id":"1", "name":"Car", "number_of_doors":4, "likes":3}
Toy object 2 { "id":"2", "type":"Plane", "flying_range":"100m"}
Toy object 3 { "id":"3", "category":"Train", "number_of_carriages":10}
From my initial understanding and use of the DataStax C# driver, I would always have to alter the table (column family) to add new columns, which does not sit right with me. I would like each row to have its own schema. The Thrift API might be able to solve this, but HectorSharp seems to be all but dead.
There is a question similar to my requirement, but it doesn't have the answer I want:
Cassandra for a schemaless db, 10's of millions order tables and millions of queries per day
Am I barking up the wrong tree by expecting each row to have its own schema, or is there a way to do this using Cassandra and C#?
Thanks in advance for your answers.
Older versions of Cassandra were schema-less, meaning that there was no definition anywhere of what a row could contain. What you need can now be partially done with a map on Cassandra 2.1:
CREATE TABLE toys (
id text PRIMARY KEY,
toy map<text, text>
);
Put some data ...
INSERT INTO toys (id, toy) VALUES ( '1', {'name':'Car', 'number_of_doors':'4', 'likes':'3'});
INSERT INTO toys (id, toy) VALUES ( '2', {'type':'Plane', 'flying_range':'100m'});
INSERT INTO toys (id, toy) VALUES ( '3', {'category':'Train', 'number_of_carriages':'10'});
Table content ...
id | toy
----+-------------------------------------------------------
3 | {'category': 'Train', 'number_of_carriages': '10'}
2 | {'flying_range': '100m', 'type': 'Plane'}
1 | {'likes': '3', 'name': 'Car', 'number_of_doors': '4'}
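Because the column is declared as map<text, text>, every value has to be converted to text before it is written (note that 4 and 3 became '4' and '3' above). A minimal client-side sketch of that conversion follows, written in Python only for brevity. The helper names to_text_map, cql_quote and to_insert are hypothetical, and building the statement as a string is for illustration only; a real application would more likely bind the map as a parameter through its driver.

```python
def to_text_map(toy):
    """Split a toy dict into (id, attributes), with every attribute value as text."""
    attrs = {key: str(value) for key, value in toy.items() if key != "id"}
    return str(toy["id"]), attrs


def cql_quote(text):
    # CQL string literals escape a single quote by doubling it.
    return "'" + text.replace("'", "''") + "'"


def to_insert(toy):
    """Build the INSERT statement for the toys table defined above."""
    toy_id, attrs = to_text_map(toy)
    entries = ", ".join(
        f"{cql_quote(key)}:{cql_quote(value)}" for key, value in attrs.items()
    )
    return f"INSERT INTO toys (id, toy) VALUES ({cql_quote(toy_id)}, {{{entries}}});"


toys = [
    {"id": "1", "name": "Car", "number_of_doors": 4, "likes": 3},
    {"id": "2", "type": "Plane", "flying_range": "100m"},
    {"id": "3", "category": "Train", "number_of_carriages": 10},
]

for toy in toys:
    print(to_insert(toy))
```

Each toy keeps its own set of keys, and only the id is a real column, so no ALTER TABLE is needed when a new property appears.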
We can now create an index on keys ...
CREATE INDEX toy_idx ON toys (KEYS(toy));
... and perform queries on Map keys ...
SELECT * FROM toys WHERE toy CONTAINS KEY 'name';
id | toy
----+-------------------------------------------------------
1 | {'likes': '3', 'name': 'Car', 'number_of_doors': '4'}
Now you can update or delete map entries as you would with normal columns, without reading before writing:
DELETE toy['name'] FROM toys WHERE id='1';
UPDATE toys SET toy = toy + {'name': 'anewcar'} WHERE id = '1';
SELECT * FROM toys;
id | toy
----+-----------------------------------------------------------
3 | {'category': 'Train', 'number_of_carriages': '10'}
2 | {'flying_range': '100m', 'type': 'Plane'}
1 | {'likes': '3', 'name': 'anewcar', 'number_of_doors': '4'}
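Since every value in the map comes back as text, the application has to restore any types itself when it reads a row. A rough sketch of that step, again in Python for brevity, assuming that values which parse as integers should become integers and everything else should stay text (restore_types is a hypothetical helper, not part of any driver):

```python
def restore_types(attrs):
    """Return a copy of a text-to-text map with integer-looking values as int."""
    restored = {}
    for key, value in attrs.items():
        try:
            restored[key] = int(value)
        except ValueError:
            # Not an integer (for example '100m'); keep the original text.
            restored[key] = value
    return restored


row = {"likes": "3", "name": "anewcar", "number_of_doors": "4"}
print(restore_types(row))
```

Guessing types this way is fragile, which is one reason to keep the map for genuinely free-form attributes only.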
A few limitations
I personally consider extensive use of this approach to be an anti-pattern.
HTH, Carlo
To add to Carlo's answer: