Does HBase impose a maximum size per row which is common to all distributions (and thus not an artifact of implementation), either in terms of bytes-stored or in terms of number of cells?
If so:
What is the limit?
What is the reason the limit exists?
Where is the limit documented?
If not:
Is documentation (or results of a test) available demonstrating the ability of HBase to handle rows in excess of 2GB? 4GB?
In either case:
Is there a practical or "best practice" maximum under which HBase API users should keep row sizes in order to avoid severe performance degradation? If so, what kind of performance degradation can occur if that guidance is discarded?
A row must fit entirely within one region in order to be assigned to a region server and replicated. The maximum region file size is configurable via "hbase.hregion.max.filesize".
This page says the default maximum is 10 GB: http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
This page says it can be set as high as 100 GB:
"To disable automatic splitting, set hbase.hregion.max.filesize to a very large value, such as 100 GB. It is not recommended to set it to its absolute maximum value of Long.MAX_VALUE." http://hbase.apache.org/book.html#important_configurations
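For illustration, the setting quoted above could be written in hbase-site.xml roughly as follows. This is only a sketch: the 100 GB figure is the example value from the HBase book, expressed in bytes (100 × 1024³), and whether it suits a particular cluster is an assumption you would need to check yourself.

```xml
<!-- hbase-site.xml: sketch only; the value is the 100 GB example from the HBase book -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- 100 GB in bytes = 100 * 1024 * 1024 * 1024 -->
  <value>107374182400</value>
</property>
```

Raising this value only changes when a region becomes eligible for splitting; it does not by itself show that very large rows will perform well.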