
How to trigger or check status of chained map reduce (dbcopy)

With standard CouchDB view indexes, I have flexibility and introspection into staleness vs. freshness. How do I get the analogous functionality for Cloudant's dbcopy feature?

CouchDB view query freshness

  • current index on disk, possibly stale: stale=ok
  • current index on disk, but trigger updating: stale=update_after
  • up to date index, even if that requires updating the index: leave off stale flag (a.k.a. stale=false)
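For reference, the three modes above can be sketched as a small URL builder. This is only an illustration: the helper name `view_url`, the mode names and the host are made up for the example; only the `stale` values come from the list above.

```python
from urllib.parse import urlencode

# Hypothetical mode names mapped to the `stale` values listed above.
# "fresh" means: leave the stale flag off entirely.
STALE_VALUES = {"stale": "ok", "trigger": "update_after", "fresh": None}


def view_url(base, db, ddoc, view, freshness="fresh", **params):
    """Build a view query URL for one of the three freshness modes."""
    query = dict(params)
    stale = STALE_VALUES[freshness]
    if stale is not None:
        query["stale"] = stale
    path = f"{base}/{db}/_design/{ddoc}/_view/{view}"
    return f"{path}?{urlencode(query)}" if query else path


if __name__ == "__main__":
    for mode in ("stale", "trigger", "fresh"):
        print(mode, view_url("http://localhost:5984", "db", "foo", "bar", mode))
```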

CouchDB view freshness introspection

I can compare the DB's update_seq with the design doc's update_seq, which can be obtained with update_seq=true in a view query or from GET /db/_design/foo/_info.
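A minimal sketch of that comparison, assuming the single-node CouchDB 1.x response shapes: an integer `update_seq` in the `GET /db` body and a `view_index.update_seq` field in the `_info` body. The function name `index_lag` is hypothetical, and the dicts stand in for already-parsed JSON responses.

```python
def index_lag(db_info, ddoc_info):
    """Return how many updates the view index is behind the database.

    db_info   -- parsed body of GET /db (assumed to carry an integer update_seq)
    ddoc_info -- parsed body of GET /db/_design/foo/_info
                 (assumed to carry view_index.update_seq)
    """
    return db_info["update_seq"] - ddoc_info["view_index"]["update_seq"]


if __name__ == "__main__":
    db_info = {"db_name": "db", "update_seq": 120}
    ddoc_info = {"name": "foo", "view_index": {"update_seq": 117}}
    print(index_lag(db_info, ddoc_info))
```

A lag of 0 means the index has seen every update; anything above 0 is a rough measure of staleness.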

BigCouch caveats

This is slightly clouded by BigCouch's DB partitioning and multiple servers. For example, update_seq is a composite and should only be compared within a tolerance range. stale=false might choose a different shard than stale=ok, and that shard might be more or less up to date. Although there isn't a way to get the update_seq for all nodes (or for the specific node(s) that would be chosen by stale=false queries), I can cheat by quickly issuing multiple /db/_design/foo/_info queries. It would be nice to have additional shard/partition introspection here, but the above still works for my purposes.
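Roughly what I mean by comparing within a tolerance, as a sketch. It assumes the composite update_seq is a string of the form `<number>-<opaque>` and that the numeric prefix is comparable across responses; that format is an assumption and may differ between versions. The names `seq_number` and `probably_fresh` and the default tolerance are made up for the example.

```python
def seq_number(update_seq):
    """Extract the numeric part of an update_seq.

    Accepts a plain integer (single-node CouchDB) or, by assumption,
    a "<number>-<opaque>" string (clustered composite sequence).
    """
    if isinstance(update_seq, int):
        return update_seq
    return int(str(update_seq).split("-", 1)[0])


def probably_fresh(db_seq, index_seq_samples, tolerance=10):
    """Compare the DB sequence with the best of several _info samples.

    index_seq_samples holds the update_seq values gathered by issuing
    several _info requests in quick succession.
    """
    best = max(seq_number(s) for s in index_seq_samples)
    return seq_number(db_seq) - best <= tolerance


if __name__ == "__main__":
    samples = ["100-g1AAAA", "115-g1AAAB", "112-g1AAAC"]
    print(probably_fresh("120-g1AAAD", samples, tolerance=10))
```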

Cloudant's dbcopy

dbcopy has roughly the same "eventual consistency" characteristics. Querying the docs in the chained DB is roughly analogous to querying the origin view with group=true&stale=ok, which is fine most of the time. But the documentation doesn't give any pointers on the following:

  • How can I query the current dbcopy state? E.g. Does the DB consider itself up to date or are view changes waiting their turn in the IOQ? If it's not up to date, roughly how stale is it?
  • How can I trigger or bump up the priority of the dbcopy (as in stale=update_after or stale=false)? E.g. I want something along the lines of POST /origin_db/_design/foo/_view/bar/_dbcopy that will forcibly push the reduced results to the dbcopy DB immediately (optionally updating the origin view first).
  • If the chained DB somehow gets out of sync (e.g. documents are deleted or updated directly in the DB rather than by the dbcopy mechanism or the dbcopy mechanism misses a few documents), can this be detected? How can it be corrected? Is there a dbcopy "reset button"?
nicholas a. evans asked Nov 02 '22 21:11


1 Answer

How can I query the current dbcopy state? E.g. Does the DB consider itself up to date or are view changes waiting their turn in the IOQ? If it's not up to date, roughly how stale is it?

We are looking into a better method, but at the moment the only way to get the dbcopy's current state is to compare the records in the view to the documents in the target database.
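As a rough sketch of that comparison: fetch the source view with group=true, fetch the target database's documents with include_docs=true, and diff the two. The code below works on the already-parsed rows and docs, and it assumes each dbcopy document stores the reduced row under `key` and `value` fields; that layout is an assumption to check against your own target database. The name `diff_dbcopy` is hypothetical.

```python
import json


def _canon(key):
    """Serialize a view key (string, number, list, ...) so it can index a dict."""
    return json.dumps(key, sort_keys=True)


def diff_dbcopy(view_rows, target_docs):
    """Compare reduced view rows against documents in the dbcopy target.

    view_rows   -- rows from the source view queried with group=true
    target_docs -- documents from the target DB (assumed "key"/"value" fields)
    """
    expected = {_canon(r["key"]): r["value"] for r in view_rows}
    # Skip documents without a "key" field, e.g. design docs.
    actual = {_canon(d["key"]): d["value"] for d in target_docs if "key" in d}
    return {
        "missing": sorted(k for k in expected if k not in actual),
        "extra": sorted(k for k in actual if k not in expected),
        "stale": sorted(
            k for k in expected if k in actual and expected[k] != actual[k]
        ),
    }


if __name__ == "__main__":
    rows = [{"key": "a", "value": 1}, {"key": "c", "value": 3}]
    docs = [{"_id": "x", "key": "a", "value": 1}, {"_id": "y", "key": "z", "value": 0}]
    print(diff_dbcopy(rows, docs))
```

Empty `missing`, `extra` and `stale` lists suggest the target matches the view as of the moment both were read; since the two reads are not atomic, a small difference on a busy database does not by itself prove the dbcopy is broken.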

Is there a dbcopy "reset button"?

You can re-trigger the dbcopy by forcing a rebuild of the source view. This can be done by updating the design doc so that the view signature changes - e.g. adding whitespace or comments to an existing view. This is somewhat inelegant but would result in the dbcopy being re-run.
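A minimal sketch of the "add a comment" approach: take the design doc as fetched, append a comment to the map function, and PUT it back with its current `_rev`. The helper name `bump_view_signature` and the token are made up for the example. Using a block comment rather than a `//` line comment is a defensive choice, in case the server wraps the function source in parentheses before evaluating it.

```python
import copy


def bump_view_signature(ddoc, view_name, token):
    """Return a copy of the design doc whose map source carries a new comment.

    Changing the source text changes the view signature, which forces the
    view to be rebuilt. The original dict is left untouched.
    """
    updated = copy.deepcopy(ddoc)
    view = updated["views"][view_name]
    view["map"] = f'{view["map"]} /* rebuild: {token} */'
    return updated


if __name__ == "__main__":
    ddoc = {
        "_id": "_design/foo",
        "_rev": "1-abc",
        "views": {
            "bar": {
                "map": "function(doc) { emit(doc.k, 1); }",
                "reduce": "_sum",
                "dbcopy": "target_db",
            }
        },
    }
    print(bump_view_signature(ddoc, "bar", "2022-11-09")["views"]["bar"]["map"])
```

Keep in mind that in CouchDB the index is shared by all views in a design doc, so the other views in the same design doc will most likely be rebuilt as well.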

garbados answered Nov 09 '22 06:11