I am studying Couchbase now, and I am really confused by the official description of the terms 'bucket' and 'vBucket'. Can anybody explain what exactly a bucket or a vBucket is? What's the difference? Analogies and examples would help.
A Bucket is a Couchbase-specific term that is roughly analogous to a 'database' in traditional RDBMS terms. A Bucket provides a container for your data: it groups similar data for organisational purposes, and it is also the unit of resource allocation.
To increase or decrease a bucket's memory quota, send a POST request to the URI /pools/default/buckets/newBucket (where newBucket is the name of the bucket) with the ramQuotaMB option.
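As a rough illustration of that request, here is a minimal Python sketch that only builds the POST and does not send it. The URI path and the ramQuotaMB option come from the text above; the host, the admin port 8091, the credentials, the quota value and the helper name build_quota_request are assumptions made for the example.

```python
import base64
import urllib.parse
import urllib.request


def build_quota_request(host, bucket, ram_quota_mb, user, password):
    """Build (but do not send) a POST that sets a bucket's RAM quota.

    Hypothetical helper: host, port 8091 and the credentials are
    assumptions for this sketch, not values from the original text.
    """
    bucket_path = urllib.parse.quote(bucket, safe="")
    url = f"http://{host}:8091/pools/default/buckets/{bucket_path}"
    # The option is sent as form data, e.g. ramQuotaMB=512
    body = urllib.parse.urlencode({"ramQuotaMB": ram_quota_mb}).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    # HTTP basic authentication header
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    req = build_quota_request("localhost", "newBucket", 512, "Administrator", "password")
    print(req.get_method(), req.full_url, req.data)
    # To actually apply the change you would call urllib.request.urlopen(req).
```

Building the request separately from sending it makes it easy to inspect what would go over the wire before touching a real cluster.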
A bucket is a logical keyspace of uniquely keyed documents, evenly distributed across all nodes in a cluster.
A vBucket is a subset of a bucket that is located on a single node. The union of all vBuckets is the bucket.
Imagine you have three nodes:
+----------+     +----------+     +----------+
|          |     |          |     |          |
|          |     |          |     |          |
|          |     |          |     |          |
|          |     |          |     |          |
|          |     |          |     |          |
|          |     |          |     |          |
|          |     |          |     |          |
+----------+     +----------+     +----------+
   node1            node2            node3
A bucket is a set of documents (that can be different in structure and attributes) that is distributed over all three nodes but it shares the same key space.
    +----------+     +----------+     +----------+
+---------------------------------------------------------------+
|   |          |     |          |     |          |              |
|   |          |     |          |     |          |    Bucket    |
|   |          |     |          |     |          |              |
+---------------------------------------------------------------+
    |          |     |          |     |          |
    |          |     |          |     |          |
    +----------+     +----------+     +----------+
       node1            node2            node3
Note that a key must be unique within a bucket. This differs from an RDBMS, where a key only has to be unique within a table.
The bucket is divided into 1024 segments, which are evenly distributed across all the nodes in the cluster. These segments are virtual buckets, or vBuckets. So in this case each node holds about 1024/3 vBuckets, that is, 341 or 342.
    +----------+     +----------+     +----------+
+---------------------------------------------------------------+
|   |          |     |          |     |          |              |
|   | 341 vBs  |     | 341 vBs  |     | 342 vBs  |    Bucket    |
|   |          |     |          |     |          |              |
+---------------------------------------------------------------+
    |          |     |          |     |          |
    |          |     |          |     |          |
    +----------+     +----------+     +----------+
       node1            node2            node3
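The arithmetic behind the 341 / 341 / 342 split can be sketched in a few lines of Python. This only illustrates the division; the helper name split_vbuckets is made up for the example, and which node receives the leftover vBucket is an assumption chosen to match the diagram (in a real cluster the assignment is decided by Couchbase itself).

```python
def split_vbuckets(num_vbuckets, nodes):
    """Spread num_vbuckets over nodes as evenly as possible.

    Illustrative only: the leftover vBuckets go to the last nodes,
    which is an assumption made to match the diagram above.
    """
    base, extra = divmod(num_vbuckets, len(nodes))
    counts = {node: base for node in nodes}
    # Hand out the remainder, one vBucket per node, starting from the end.
    for node in nodes[len(nodes) - extra:]:
        counts[node] += 1
    return counts


if __name__ == "__main__":
    print(split_vbuckets(1024, ["node1", "node2", "node3"]))
    # {'node1': 341, 'node2': 341, 'node3': 342}
```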
Each vBucket has its associated set of documents. When a lookup is performed, the client hashes the key of the requested document and uses the cluster map to identify the vBucket, and therefore the node, where the document is located.
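The two-step lookup (key to vBucket, vBucket to node) can be sketched like this. It is a simplified model, not Couchbase's actual implementation: the plain CRC32-modulo hash, the round-robin map and the names VBUCKET_MAP, vbucket_for_key and node_for_key are all assumptions for illustration.

```python
import zlib

NUM_VBUCKETS = 1024
NODES = ["node1", "node2", "node3"]

# Hypothetical cluster map: index = vBucket id, value = node that owns it.
# A simple round-robin assignment is assumed here.
VBUCKET_MAP = [NODES[vb % len(NODES)] for vb in range(NUM_VBUCKETS)]


def vbucket_for_key(key):
    """Step 1: hash the document key to a vBucket id (simplified hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def node_for_key(key):
    """Step 2: look the vBucket id up in the cluster map to find the node."""
    return VBUCKET_MAP[vbucket_for_key(key)]


if __name__ == "__main__":
    for key in ["user::1", "user::2", "order::1001"]:
        print(key, "-> vBucket", vbucket_for_key(key), "on", node_for_key(key))
```

Because the hash depends only on the key, every client computes the same vBucket for the same key. When nodes are added or removed, only the vBucket-to-node map has to change; the key-to-vBucket step stays the same.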
references: http://training.couchbase.com/online
A bucket is like a database in an RDBMS. It contains documents, views and some configuration. A vBucket is like a shard in an RDBMS. Every key in Couchbase is mapped by a hash function to a vBucket number, and every vBucket number is mapped to a server name. These two mappings give an even distribution of documents across multiple nodes and a fast get of a document by its ID.
You can start with Couchbase documentation, section "Architecture and Concepts" http://docs.couchbase.com/admin/admin/Concepts/concept-intro.html
For more information about buckets, see http://docs.couchbase.com/admin/admin/Concepts/concept-dataStorage.html.
For more information about vBuckets, see http://docs.couchbase.com/admin/admin/Concepts/concept-vBucket.html.
In short, a bucket is an abstraction that describes certain resources on the cluster (like RAM and disk space). From the API standpoint it is also a namespace for the documents stored in the system, similar to a database in the SQL world.