
How does CouchDB retrieve all previous revisions?

From what I understand, CouchDB's B-tree implementation uses a shadowing technique, so every update produces a new root. The following excerpt is from this PDF (which appears to implement a better algorithm than traditional shadowing):

Shadowing means that to update an on-disk page, the entire page is read into memory, modified, and later written to disk at an alternate location. When a page is shadowed its location on disk changes, this creates a need to update (and shadow) the immediate ancestor of the page with the new address. Shadowing propagates up to the file system root.
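To make the excerpt concrete, here is a minimal sketch of shadowing as path copying. It is a deliberate simplification: an in-memory binary search tree stands in for on-disk B-tree pages, and the names Node, insert and lookup are made up for illustration, not taken from CouchDB.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node; stands in for an on-disk page (assumption)."""
    key: str
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node: Optional[Node], key: str, value: str) -> Node:
    """Return a new root. Only nodes on the path to `key` are copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return Node(node.key, node.value, insert(node.left, key, value), node.right)
    if key > node.key:
        return Node(node.key, node.value, node.left, insert(node.right, key, value))
    # Same key: write a replacement node instead of modifying in place.
    return Node(key, value, node.left, node.right)


def lookup(node: Optional[Node], key: str) -> Optional[str]:
    """Walk down from a given root and return the value stored for `key`."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


root_v1 = None
for k in ("m", "c", "t"):
    root_v1 = insert(root_v1, k, "1")

# Updating "c" shadows the node for "c" and its ancestor, the root.
root_v2 = insert(root_v1, "c", "2")
```

After the update, root_v1 still resolves "c" to its old value while root_v2 sees the new one, and the untouched subtree under "t" is shared between both roots. That is the sense in which every update yields a new root.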

How does CouchDB fetch all the leaf revisions that are still available (given that some revisions are removed by compaction)? Does Couch internally store a pointer to previous revisions?

Thanks, Chang

Chang asked Jun 22 '11 13:06



2 Answers

Couch doesn't guarantee that old revisions of a document can be retrieved:

The terms version and revision might sound familiar (if you are programming without version control, drop this book right now and start learning one of the popular systems). Using new versions for document changes works a lot like version control, but there’s an important difference: CouchDB does not guarantee that older versions are kept around.

Source: CouchDB: The Definitive Guide (O'Reilly), page 40.

Why is this? Because CouchDB is not a version control system: the revision mechanism exists to manage concurrent access to the database (a writer must show that it has seen the latest revision), not to preserve history. The Definitive Guide touches on this on pages 14-15.
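A small sketch may help show what "versioning for concurrent access" means in practice. This is a toy in-memory store, not CouchDB code: TinyStore, Conflict and the way the revision string is derived are all assumptions made for the example (CouchDB computes its revision hashes differently).

```python
import hashlib
import json


class Conflict(Exception):
    """Raised when a writer supplies a revision that is not the current one."""


class TinyStore:
    """Toy document store with optimistic concurrency (hypothetical)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (rev, body); only the latest is kept

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        if rev != current_rev:
            raise Conflict(f"expected {current_rev!r}, got {rev!r}")
        generation = int(current_rev.split("-")[0]) + 1 if current_rev else 1
        digest = hashlib.md5(json.dumps(body, sort_keys=True).encode()).hexdigest()
        new_rev = f"{generation}-{digest}"
        # The previous body is overwritten: nothing here promises to keep it.
        self._docs[doc_id] = (new_rev, body)
        return new_rev


store = TinyStore()
rev1 = store.put("doc", {"count": 1})
rev2 = store.put("doc", {"count": 2}, rev=rev1)
try:
    store.put("doc", {"count": 99}, rev=rev1)  # stale revision
except Conflict as exc:
    print("rejected:", exc)
```

The revision token only has to answer one question, namely whether the writer saw the latest state. Nothing about that requires the store to hold on to older bodies, which is the point the book is making.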

Chris answered Sep 29 '22 20:09



By good luck, CouchDB committer and community leader Adam Kocoloski explained this recently on the mailing list.

Here is what he said:

"Each leaf in the ID btree [stores] a revision tree containing pointers to all available revisions of a document. Retrieving an old revision (before compaction) or a conflicting version of a document requires exactly the same number of IOs as retrieving the current one."
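As a rough model of what that quote describes, the sketch below keeps one revision tree per document, where each entry points at a stored body. It is only an illustration under assumptions: RevNode, RevTree, the body_offset field and the compact method are invented names, and the real on-disk layout is different.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RevNode:
    parent: Optional[str]
    body_offset: Optional[int]  # hypothetical file offset; None once the body is gone


@dataclass
class RevTree:
    """Per-document revision tree, as it might sit in an ID B-tree leaf."""
    nodes: Dict[str, RevNode] = field(default_factory=dict)

    def add(self, rev: str, parent: Optional[str], body_offset: int) -> None:
        self.nodes[rev] = RevNode(parent, body_offset)

    def leaves(self) -> List[str]:
        """Revisions with no children: the winner plus any conflicts."""
        parents = {n.parent for n in self.nodes.values() if n.parent is not None}
        return sorted(r for r in self.nodes if r not in parents)

    def available(self) -> List[str]:
        """Revisions whose bodies can still be read."""
        return sorted(r for r, n in self.nodes.items() if n.body_offset is not None)

    def compact(self) -> None:
        """Drop bodies of non-leaf revisions, keeping only their ids."""
        keep = set(self.leaves())
        for rev, node in self.nodes.items():
            if rev not in keep:
                node.body_offset = None


tree = RevTree()
tree.add("1-a", None, 100)
tree.add("2-b", "1-a", 200)
tree.add("2-c", "1-a", 300)  # conflicting edit of 1-a
tree.add("3-d", "2-b", 400)

before = tree.available()
tree.compact()
after = tree.available()
```

In this model every revision, old or conflicting, is one pointer hop from the tree, which matches the "same number of IOs" remark. Once compact runs, only the leaf revisions still have bodies behind them.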

If I understand correctly, shadowing is not used to conceal old document revisions at all, but rather entire revision trees that are no longer meaningful.

JasonSmith answered Sep 29 '22 18:09
