Column families for the same row belong to the same RegionServer. So the question here is: will a RegionServer store different column families on different machines?
An HBase table is made of column families, which are the logical and physical groupings of columns. The columns in one family are stored separately from the columns in another family. If you have data that is not often queried, assign that data to a separate column family.
Technically, HBase can manage more than three or four column families. However, you need to understand how column families work to make the best use of them.
HBase is a column-oriented database, and its tables are sorted by row. The table schema defines only the column families; the data inside them is stored as key-value pairs. A table can have multiple column families, and each column family can have any number of columns. Subsequent column values are stored contiguously on disk.
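To make the data model above a bit more concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the HBase client API: the class name `ToyTable`, its methods, and the family names in the usage note are all made up for illustration. It only shows that the schema fixes the families, that columns inside a family are free-form, and that each family is kept in its own store.

```python
class ToyTable:
    """Toy in-memory model of an HBase-style table (NOT the HBase API)."""

    def __init__(self, families):
        # The schema fixes only the column families, never the columns.
        self.families = frozenset(families)
        # One separate store per family: {family: {row_key: {qualifier: value}}}
        self.stores = {family: {} for family in self.families}

    def put(self, row_key, family, qualifier, value):
        # Any qualifier (column name) is allowed, but the family must exist.
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.stores[family].setdefault(row_key, {})[qualifier] = value

    def get(self, row_key, family):
        # Reads touch only the store of the requested family.
        return dict(self.stores[family].get(row_key, {}))

    def scan(self, family):
        # Rows come back ordered by row key, as in a sorted table.
        for row_key in sorted(self.stores[family]):
            yield row_key, dict(self.stores[family][row_key])
```

For example, with hypothetical families `profile` (read often) and `audit` (read rarely), a `scan("profile")` never looks at the `audit` store, which is the reason for putting rarely queried data in its own family.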
HBase stores rows of data in tables. Tables are split into chunks of rows called “regions”. Those regions are distributed across the cluster, hosted and made available to client processes by the RegionServer process.
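As a rough illustration of how a row key maps to a region, the sketch below keeps a sorted list of region start keys and looks up the region whose range covers a given key. The start keys, the server names, and the function `locate_region` are hypothetical; this is a simplified model of the idea, not how the HBase client is actually implemented.

```python
import bisect

# Hypothetical layout: region i covers [REGION_START_KEYS[i], REGION_START_KEYS[i + 1]).
# The empty string marks the start of the table.
REGION_START_KEYS = ["", "g", "n", "t"]
# Hypothetical RegionServer names hosting each region, in the same order.
REGION_SERVERS = ["rs1", "rs2", "rs1", "rs3"]


def locate_region(row_key):
    """Return (region index, server name) for the region covering row_key."""
    # The covering region is the last one whose start key is <= row_key.
    index = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    return index, REGION_SERVERS[index]
```

Because regions are contiguous ranges of sorted row keys, a lookup like this only needs the start keys; one server can host several regions, as `rs1` does here.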
Not necessarily, but at some point it will. This is part of the basic HBase architecture. If you imagine an HBase table as a spreadsheet, with its rows and columns, then a region spans multiple successive rows in one direction and all columns of one or more column families in the other. This way, the whole sheet is covered with region tiles.
Each region is stored on one or more (typically three) cluster nodes. (If you lost all nodes containing a specific region at once, you would lose all of the region's data. If you lost only one replica, HBase makes sure it is replicated to another node from the remaining copies.)
Now, when the data contained in a region grows too big, HBase automatically initiates a region split, resulting in two new regions, each containing one half of the data. Only through region splits (besides region replication) does data eventually get distributed over an HBase cluster.
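The split behaviour described above can be sketched with a small simulation. This is a toy model under stated assumptions: a region is just a sorted list of row keys, and the row-count threshold `MAX_ROWS_PER_REGION` is a made-up stand-in for the size-based check a real cluster would use. The function name `insert_row` is hypothetical as well.

```python
import bisect

# Hypothetical threshold; a row count stands in for "the region grew too big".
MAX_ROWS_PER_REGION = 4


def insert_row(regions, row_key):
    """Insert row_key into the covering region and split that region if it is too big.

    `regions` is a list of sorted lists of row keys, ordered by key range.
    Start with [[]] for a table that has a single empty region.
    """
    # The covering region is the last one whose first key is <= row_key;
    # keys smaller than everything go to the first region.
    start_keys = [region[0] for region in regions[1:]]
    index = bisect.bisect_right(start_keys, row_key)
    region = regions[index]

    if row_key not in region:
        bisect.insort(region, row_key)

    # Too big: replace the region with two new ones, each holding about half the rows.
    if len(region) > MAX_ROWS_PER_REGION:
        mid = len(region) // 2
        regions[index:index + 1] = [region[:mid], region[mid:]]
    return regions
```

Starting from a single empty region and inserting keys one by one, the table stays in one region until the threshold is crossed, and only then do new regions appear that could be moved to other servers.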
Storing data for one row in different columns of the same column family ensures that the data is stored together in one place.