There are probably a lot of similar questions, but they don't answer my scenario (or at least I'm not able to get the point).
I have, let's say, a table in HBase with 4 column families. The main reason is that each column family has a different (very different) VERSIONS attribute.
None of the columns in any family stores big data (such as full texts); the data averages 1 KB (long identifiers, some short strings, integers and so on).
I need to access data in several ways: scan and get by column family, get all cells of a given row by version (specific version or a range), and last but not least: get the latest version of all columns of a given row.
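To make these access patterns concrete, here is a minimal toy model in Python. It is not the HBase client API, only an in-memory sketch of the row layout (family, then qualifier, then timestamp) under my own assumptions. The class `ToyRow`, its method names, and the family names `meta` and `history` are hypothetical. The time range is treated as start-inclusive and end-exclusive, and each family trims itself to its own VERSIONS count on write, which only approximates what HBase does.

```python
from collections import defaultdict


class ToyRow:
    """Toy model of one row: family -> qualifier -> {timestamp: value}."""

    def __init__(self, max_versions):
        # max_versions maps each (hypothetical) family name to its VERSIONS setting.
        self.max_versions = dict(max_versions)
        # One separate container per family, loosely analogous to one store per family.
        self.cells = {fam: defaultdict(dict) for fam in max_versions}

    def put(self, family, qualifier, timestamp, value):
        versions = self.cells[family][qualifier]
        versions[timestamp] = value
        # Drop everything older than the newest N timestamps for this family.
        keep = self.max_versions[family]
        for old_ts in sorted(versions, reverse=True)[keep:]:
            del versions[old_ts]

    def get_family(self, family):
        """Latest value of every column in a single family."""
        return {q: v[max(v)] for q, v in self.cells[family].items()}

    def get_versions(self, min_ts, max_ts):
        """All cells with min_ts <= timestamp < max_ts, across all families."""
        out = {}
        for fam, cols in self.cells.items():
            for q, versions in cols.items():
                hits = {ts: val for ts, val in versions.items()
                        if min_ts <= ts < max_ts}
                if hits:
                    out[(fam, q)] = hits
        return out

    def get_latest(self):
        """Latest version of every column; note that it visits every family."""
        return {(fam, q): v[max(v)]
                for fam, cols in self.cells.items()
                for q, v in cols.items()}
```

In this sketch, `get_family` touches one per-family container, while `get_versions` and `get_latest` have to walk all of them, which is the part of the read path my question is about.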
So what are, in this scenario, the disadvantages of having 4 column families? Are reads less efficient because (when the row is not in memory) they operate on different store files?
According to the Apache HBase wiki, HBase will face performance issues with more than 2 or 3 column families.