There are two ways to support wide rows in CQL3: one is to use composite keys, and the other is to use collections such as Map, List and Set. The composite-key approach can hold millions of columns (transposed to rows), and it solves some of our use cases.
However, if we use collections, I want to know whether there is a limit on the number of items or the amount of data a collection can store (earlier, with Thrift, C* supported up to 2 billion columns in a row).
Apart from the performance issue, there is a protocol issue that limits the number of items you can access in a collection to 65536.
http://mail-archives.apache.org/mod_mbox/cassandra-user/201305.mbox/%3CCAENxBwx6pcSA=cWn=dKW_52K5odw5F3Xigj-zn_4BwFth+4ruA@mail.gmail.com%3E
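To make the 65536 figure concrete, here is a minimal Python sketch. It assumes the limit comes from the element count being written as a 2-byte unsigned integer on the wire; that encoding is an assumption on my part, not something stated in this thread, and the helper name encode_count is made up for illustration. A 16-bit field has 65536 distinct values (0 to 65535), so a larger count simply cannot be represented.

```python
import struct

# Assumption: the collection's element count travels as a 2-byte,
# big-endian unsigned integer. Largest value such a field can hold:
MAX_COUNT = 2**16 - 1


def encode_count(n: int) -> bytes:
    """Pack an element count into a hypothetical 2-byte length field.

    Raises struct.error when n does not fit in 16 unsigned bits.
    """
    return struct.pack(">H", n)


def fits(n: int) -> bool:
    """Return True if a count of n elements can be encoded at all."""
    try:
        encode_count(n)
        return True
    except struct.error:
        return False
```

With these definitions, fits(65535) is true and fits(65536) is false, which is the kind of ceiling the protocol issue describes.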
It is strongly recommended to store only a limited amount of data in collections and maps.
The reasons:
Collections and maps are always fetched in their entirety. You cannot "slice" a collection, so putting lots of data in collections/maps will hurt read performance.
The CQL3 implementation of lists is not performant for insertion/removal in the middle of the list. Append/prepend operations are quite fast, but inserting or removing the element at index i requires a read-before-write: basically, part of the list is rewritten because the elements need to be shifted to the correct index.
Insertion/removal for Set and Map is more performant, since they use the column key for storage/sorting/indexing.
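The difference between these operations can be sketched with a toy model in Python. This is not Cassandra's storage engine: the class ToyPartition, its method names and the integer counter used for list positions are all hypothetical, and the sketch models only the read-before-write aspect (it does not model any shifting of elements). It assumes that set elements and map keys act as the cell key, while list cells get an internally generated, increasing key.

```python
from itertools import count


class ToyPartition:
    """Hypothetical toy store of cells; counts reads of existing cells."""

    def __init__(self):
        self.cells = {}        # cell key -> value
        self.reads = 0         # how often existing cells had to be read
        self._clock = count()  # stand-in for generated list position keys

    def set_add(self, element):
        # Set: the element itself is the cell key, so this is a blind write.
        self.cells[("set", element)] = None

    def map_put(self, key, value):
        # Map: the map key is the cell key, again a blind write.
        self.cells[("map", key)] = value

    def list_append(self, value):
        # Append: a fresh, larger position key is generated; nothing is read.
        self.cells[("list", next(self._clock))] = value

    def _list_keys(self):
        # Position keys of the list cells, in list order.
        return sorted(k for k in self.cells if k[0] == "list")

    def list_remove_at(self, index):
        # Removal by index: the existing cells must be read first to learn
        # which position key currently sits at `index`.
        self.reads += 1
        del self.cells[self._list_keys()[index]]

    def list_values(self):
        # Reads back the whole list, mirroring "fetched in their entirety".
        return [self.cells[k] for k in self._list_keys()]
```

In this model, any number of set_add, map_put and list_append calls leaves reads at 0, while every list_remove_at call increments it, which is the cost the answer warns about.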
Now, to answer your question of whether there is a hard limit on the number of elements in a collection/map: yes, it is limited to 65536, as mentioned above by GlynD. (This corrects my earlier answer, which said that technically there is no limit other than the classic 2 billion limit that already existed in Thrift.)
The related JIRA issue is CASSANDRA-5428.