I'm wondering what is actually stored in a CouchDB database B-tree? The CouchDB: The Definitive Guide tells that a database B-tree is used for append-only operations and that a database is stored in a single B-tree (besides per-view B-trees).
So I guess the data items that are appended to the database file are revisions of documents, not the whole documents:
+---------|### ...
| |
+------|###|------+ ... ---+
| | | |
+------+ +------+ +------+ +------+
| doc1 | | doc2 | | doc1 | ... | doc1 |
| rev1 | | rev1 | | rev2 | | rev7 |
+------+ +------+ +------+ +------+
Is it true?
If it is true, then how the current revision of a document is determined based on such a B-tree?
Doesn't it mean, that CouchDB needs a separate "view" database for indexing current revisions of documents to preserve O(log n) access? Wouldn't it lead to race conditions while building such an index? (as far as I know, CouchDB uses no write locks).
The database file on disk is append-only; however the B-tree is conceptually modified in-place. When you update a document,
When the root node is written, that is effectively when the newer revision is "committed." To find a document, you start at the end of the file, get the root node, and work down to your doc id. The latest revision will always be accessible this way.
CouchDB does not store diffs. When you update a document, it appends the whole new document with a new _rev and the same _id as the old version. The old version is removed during compaction.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With