
Parallel/Redundant Replication in CouchDB

I have multiple CouchDB servers I want to keep in sync with each other, and I use these servers to share large files (e.g. >100 MB). To keep them synchronized, I have each CouchDB instance do a continuous pull replication from each other instance.

Here's an example: I have three CouchDB servers, A, B, and C, each of which runs a continuous pull replication from each of the others, like so:

------- <------------- -------
|  A  | -------------> |  B  |
-------                -------
  ^ |                   | ^
  | |                   | |
  | V                   | |
------- <---------------- |
|  C  | -------------------
-------
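For concreteness, here is a minimal sketch of how such a full mesh of continuous pull replications might be described. It only builds the JSON bodies that would be POSTed to each target's `/_replicate` endpoint; it does not contact any server. The host names and the database name `files` are hypothetical placeholders, not part of my actual setup.

```python
from itertools import permutations

# Hypothetical server URLs and database name; replace with real values.
SERVERS = {
    "A": "http://a.example.com:5984",
    "B": "http://b.example.com:5984",
    "C": "http://c.example.com:5984",
}
DB_NAME = "files"


def pull_replication_requests(servers, db_name):
    """Build one continuous pull replication per ordered (target, source) pair.

    Each entry is (url_to_post_to, json_body). The body is meant to be POSTed
    to the target's /_replicate endpoint, so the target pulls from the source.
    """
    requests = []
    for target, source in permutations(sorted(servers), 2):
        body = {
            "source": f"{servers[source]}/{db_name}",  # remote database to pull from
            "target": db_name,                          # local database on the target
            "continuous": True,                         # keep replicating as changes arrive
        }
        requests.append((f"{servers[target]}/_replicate", body))
    return requests


REPLICATIONS = pull_replication_requests(SERVERS, DB_NAME)
```

With three servers this yields six replications, two arriving at each server, which matches the arrows in the diagram above.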

Someone uploads a document to server A with a 500MB attachment. B and C both start replicating the document from A, and B finishes the replication before C does:

-------    doc         -------
|  A  |--------------->|  B  |
-------                -------
   |
   | doc
   V
-------
|  C  |
-------

My question is, will C then start replicating the same document from B (since C also has a continuous pull replication from B), while it is still transferring the document from A?

-------                -------
|  A  |                |  B  |
-------                -------
   |          doc         |
doc|    |------------------
   |    |
   V    V
  -------
  |  C  |
  -------                           

I would guess this would happen, since AFAIK, CouchDB replication doesn't actually store the replicated documents to the target (using the _bulk_docs API) until the documents (including attachments) have been fully fetched from the source[1]. I am worried about this happening since it would be redundant and a big waste of bandwidth.

[1] https://github.com/couchbaselabs/TouchDB-iOS/wiki/Replication-Algorithm

Dan S asked Jan 01 '13 23:01


1 Answer

According to a recent discussion on the CouchDB users@ list and to this document describing the replication algorithm, the replicator knows which attachments are already present on the target and does not transfer them again. If, however, the attachments are very large and both replications start before either of them has finished, the attachment will be transferred multiple times.
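The timing condition can be illustrated with a toy model. This is only a sketch under the assumption stated in the question (a document counts as present on the target only once it has been fetched in full and stored); it is not the replicator's actual code, and the function name and the durations are made up for illustration.

```python
def simulate(a_to_b_seconds, a_to_c_seconds, b_to_c_seconds):
    """Return the transfers C performs for one large document uploaded to A.

    Assumed model: C stores the document only after the whole fetch has
    completed, so a fetch still in flight does not make the document
    "present" on C. Each transfer is (source, start_time, end_time).
    """
    upload_time = 0.0
    b_has_doc_at = upload_time + a_to_b_seconds   # B finishes pulling from A
    c_done_from_a = upload_time + a_to_c_seconds  # C finishes pulling from A

    transfers = [("A", upload_time, c_done_from_a)]
    # C's pull from B notices the document as soon as B has stored it.
    if b_has_doc_at < c_done_from_a:
        # C has not stored the document yet, so it still looks missing
        # and is fetched a second time, now from B.
        transfers.append(("B", b_has_doc_at, b_has_doc_at + b_to_c_seconds))
    return transfers


# B finishes after 10 s while C still needs 30 s: C downloads twice.
redundant = simulate(10, 30, 20)
# C finishes first: by the time B has the document, C already stores it.
single = simulate(30, 10, 20)
```

In this model the duplicate transfer happens exactly when B finishes before C does, which is the scenario described in the question.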

Stefan Kögl answered Nov 02 '22 11:11