
Delete all documents in a CouchDB database *except* the design documents

Tags:

couchdb

Is it possible to delete all documents in a CouchDB database, except the design documents, without creating a specific view for that?

My first approach was to query the standard _all_docs view and discard the documents whose IDs start with _design. This works but is too slow for large databases, because each document has to be requested from the database one at a time in order to get its revision.

If this is the only valid approach, I think it is much more practical to delete the whole database and recreate it from scratch, inserting the design documents again.

asked Apr 14 '12 17:04 by blueFast



1 Answer

I can think of a couple of ideas.

Use _all_docs

You do not need to fetch the documents themselves, only their IDs and revisions. By default, that is all that _all_docs returns. You can request a pretty big batch at a time (10k or 100k docs should be fine) and then delete each batch with a single _bulk_docs request.
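To make that concrete, here is a minimal Python sketch of the batch approach. It is only an illustration under some assumptions: the function names, the `db_url` parameter and the batch size are my own, and it assumes the server accepts unauthenticated requests. It reads pages of IDs and revisions from _all_docs and posts deletion stubs (`"_deleted": true`) to _bulk_docs, skipping anything whose ID starts with `_design/`.

```python
import json
import urllib.parse
import urllib.request


def deletion_stubs(rows):
    """Build _bulk_docs deletion stubs from _all_docs rows, skipping design documents."""
    return [
        {"_id": row["id"], "_rev": row["value"]["rev"], "_deleted": True}
        for row in rows
        if not row["id"].startswith("_design/")
    ]


def request_json(method, url, body=None):
    """Send a JSON request and decode the JSON response (no authentication: an assumption)."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def delete_all_but_design_docs(db_url, batch_size=10000):
    """Page through _all_docs and delete the non-design documents of each page."""
    startkey = None
    while True:
        # Ask for one extra row; its ID is where the next page starts.
        query = {"limit": batch_size + 1}
        if startkey is not None:
            query["startkey"] = json.dumps(startkey)
        url = db_url + "/_all_docs?" + urllib.parse.urlencode(query)
        rows = request_json("GET", url)["rows"]
        stubs = deletion_stubs(rows[:batch_size])
        if stubs:
            # Per-document failures (e.g. conflicts) are reported in the
            # response body, which this sketch does not inspect.
            request_json("POST", db_url + "/_bulk_docs", {"docs": stubs})
        if len(rows) <= batch_size:
            break
        startkey = rows[batch_size]["id"]
```

You would call it as something like `delete_all_but_design_docs("http://localhost:5984/db")`, where the URL is a placeholder for your own database.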

Replicate then delete

You could use an _all_docs query to get the IDs of all design documents.

GET /db/_all_docs?startkey="_design/"&endkey="_design0"

Then replicate them somewhere temporary.

POST /_replicator

{ "source":"db", "target":"db_ddocs", "create_target":true
, "user_ctx": {"roles":["_admin"]}
, "doc_ids": ["_design/ddoc_1", "_design/ddoc_2", "etc..."]
}

Now you can just delete the original database and replicate the temporary one back by swapping the "source" and "target" values.
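The whole round trip can be sketched in Python roughly as follows. Note two deliberate differences from the requests above, both my own assumptions: the sketch uses the blocking POST /_replicate endpoint instead of writing a document to _replicator, so that each replication has finished before the next step runs, and it passes full database URLs as source and target. The function names, the `server` parameter and the lack of authentication are also just illustrative.

```python
import json
import urllib.parse
import urllib.request


def replication_request(source, target, doc_ids):
    """Body for POST /_replicate that copies only the listed documents."""
    return {"source": source, "target": target,
            "create_target": True, "doc_ids": doc_ids}


def request_json(method, url, body=None):
    """Send a JSON request and decode the JSON response (no authentication: an assumption)."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def reset_keeping_design_docs(server, db, temp_db):
    """Copy the design docs aside, drop the database, then copy them back."""
    db_url = f"{server}/{db}"
    temp_url = f"{server}/{temp_db}"

    # Same key range as the _all_docs query above: only design documents.
    query = urllib.parse.urlencode(
        {"startkey": json.dumps("_design/"), "endkey": json.dumps("_design0")})
    rows = request_json("GET", f"{db_url}/_all_docs?{query}")["rows"]
    ids = [row["id"] for row in rows]

    if ids:
        request_json("POST", f"{server}/_replicate",
                     replication_request(db_url, temp_url, ids))
    request_json("DELETE", db_url)
    if ids:
        # Swap source and target; create_target recreates the original database.
        request_json("POST", f"{server}/_replicate",
                     replication_request(temp_url, db_url, ids))
        request_json("DELETE", temp_url)
    else:
        # No design documents to restore, so just recreate an empty database.
        request_json("PUT", db_url)
```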

Deleting vs "deleting"

Note, these are really apples vs. oranges techniques. By deleting a database, you are wiping out the edit history of all its documents. In other words, you cannot replicate those deletion events to any other database. When you "delete" a document in CouchDB, it stores a record of that deletion. If you replicate that database, those deletions will be reflected in the target. (CouchDB stores "tombstones" indicating the document ID, its revision history, and its deleted state.)
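One way to see those recorded deletions is the database's changes feed (GET /db/_changes), where a deleted document's entry carries `"deleted": true`. The short Python sketch below only filters an already decoded `results` list; the function name and the sample entries are made up for illustration.

```python
def tombstone_ids(changes_results):
    """Return the IDs of _changes entries that are marked as deleted."""
    return [entry["id"] for entry in changes_results if entry.get("deleted")]


# Hypothetical entries shaped like the "results" array of a _changes response.
sample = [
    {"seq": 1, "id": "doc_a", "changes": [{"rev": "1-aaa"}]},
    {"seq": 2, "id": "doc_b", "changes": [{"rev": "2-bbb"}], "deleted": True},
]
print(tombstone_ids(sample))
```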

That may or may not be important to you. The first idea is probably considered more "correct"; however, I can see the value of the second. You can hold the entire program for it in your head: it's only a few queries and you're done. No looping through _all_docs batches, no headache. Your specific situation will probably make it obvious which is better.

answered Sep 20 '22 09:09 by JasonSmith