I'm relatively new to NoSQL, but I've done a fair bit of toying with relational databases.
We are evaluating Cassandra for use in an environment where our data model might need to evolve fairly aggressively. I've seen it written in multiple places that Cassandra can store "structured, semi-structured and unstructured" data.
I understand the structured claim. It's obvious: a table has defined columns.
I think I understand the semi-structured claim. A row does not need to populate all columns.
But I'm not clear on the unstructured claim. Certainly you could store everything as a key-value blob but you'd have no means of searching by value (efficiently).
I've failed to find any resource on the net that describes best practices using unstructured data with Cassandra. Ideally, for our application semi-structured data would be sufficient; but I want to understand the unstructured claim in the event that it can add value for us.
Thanks.
At best, Cassandra can make semi-structured data searchable, and only through clustering keys and secondary indexes. Clustering keys are definitely an efficient way to search semi-structured data.
Searching secondary-indexed data without specifying the partition key is not efficient. There are a few solutions that help here, namely DSE Search (Solr with Cassandra) and Stargate. Both may also help when one of the columns is unstructured text.
Otherwise, storing unstructured data in Cassandra isn't a great idea, as it may not be searchable without a key.
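
To make the difference concrete, here is a rough in-memory sketch of the two access patterns. It is a toy model, not how Cassandra is implemented: `ToyTable`, `slice`, `scan_by_value` and the sensor data are all made-up names for illustration. Rows within a partition are kept sorted by clustering key, so a range lookup inside one partition is a binary search, while a lookup by column value with no partition key has to walk every partition.

```python
from bisect import bisect_left, bisect_right


class ToyTable:
    """Toy model only: partition key -> rows sorted by clustering key."""

    def __init__(self):
        # partition key -> (sorted clustering keys, rows in the same order)
        self._partitions = {}

    def insert(self, partition_key, clustering_key, **columns):
        keys, rows = self._partitions.setdefault(partition_key, ([], []))
        idx = bisect_left(keys, clustering_key)
        if idx < len(keys) and keys[idx] == clustering_key:
            rows[idx].update(columns)  # same primary key: merge the columns
        else:
            keys.insert(idx, clustering_key)
            rows.insert(idx, dict(columns))

    def slice(self, partition_key, start, end):
        """Range lookup inside one partition (binary search on sorted keys)."""
        keys, rows = self._partitions.get(partition_key, ([], []))
        lo = bisect_left(keys, start)
        hi = bisect_right(keys, end)
        return list(zip(keys[lo:hi], rows[lo:hi]))

    def scan_by_value(self, column, value):
        """Lookup by value with no partition key: visits every row."""
        return [
            (pk, ck)
            for pk, (keys, rows) in self._partitions.items()
            for ck, row in zip(keys, rows)
            if row.get(column) == value
        ]


table = ToyTable()
# Semi-structured: each row populates only the columns it has.
table.insert("sensor-1", 10, temp=20.5)
table.insert("sensor-1", 20, temp=21.0, note="door open")
table.insert("sensor-1", 30, humidity=40)
table.insert("sensor-2", 15, temp=21.0)

in_range = table.slice("sensor-1", 10, 20)    # touches one partition
matches = table.scan_by_value("temp", 21.0)   # touches all partitions
```

The cost of `slice` depends only on the size of one partition, whereas `scan_by_value` grows with the whole table, which is roughly why a query without the partition key gets expensive as the data grows.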