I have a CouchDB (v0.10.0) database that is 8.2 GB in size and contains 3890000 documents.
Now, I have the following as the map function of the view:
function(doc) { emit([doc.Status], doc); }
And it takes forever to load (4 hours and still no result).
Here's some extra information that might help describe the situation:
The view is not a temp view. The view was defined before the 3890000 documents were inserted.
There isn't anything else on the server. It is an Ubuntu box with nothing but the defaults installed.
I see that my CPU is moving and working hard (sometimes shoots to 100%). The memory is moving as well but not increasing.
So my question is:
But fundamentally, CouchDB chooses a "slower" protocol because it is so universal and so standard. Your documents are on the larger side for CouchDB. Most documents are single or double-digit KB, not triple. CouchDB is encoding/decoding that JSON in one big gulp (i.e. it is not streaming from the disk.)
Basically, views are JavaScript functions stored in a document inside the database that they operate on. This special document is called a design document in CouchDB. Each design document can implement multiple views. Please consult the official CouchDB design documents documentation to learn more about how to write views.
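To make the shape of a design document concrete, here is a minimal sketch. The names "status", "by_status" and "count_by_status" are made up for illustration and do not come from the question; the point is only that one design document holds several views, each stored as function source text.

```javascript
// Hypothetical design document. The view names are illustrative assumptions.
const designDoc = {
  _id: "_design/status",      // design document ids start with "_design/"
  language: "javascript",
  views: {
    // Each view stores its map (and optional reduce) function as a string.
    by_status: {
      map: "function (doc) { emit([doc.Status], null); }"
    },
    count_by_status: {
      map: "function (doc) { emit(doc.Status, 1); }",
      reduce: "function (keys, values) { return sum(values); }"
    }
  }
};

// This JSON is what would be saved into the database as a regular document.
console.log(JSON.stringify(designDoc, null, 2));
```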
In case you want to check the status of CouchDB, you can do so using the following command: sudo status couchdb.
In order to retrieve data with CouchDB, we use a process called MapReduce to create views. A view contains rows of data that are sorted by each row's key (you might use a date as the key, for example, to sort your data by date). MapReduce is a combination of two concepts: Map and Reduce.
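The idea can be sketched with a tiny in-memory simulation. This is not CouchDB code: runView is a hypothetical helper, emit is passed in as a parameter (in CouchDB it is a global), and plain string comparison stands in for CouchDB's real key collation.

```javascript
// Simulate a view: run map over every doc, sort rows by key, then reduce per key.
function runView(docs, mapFn, reduceFn) {
  const rows = [];
  for (const doc of docs) {
    // Each emit call adds one row tagged with the id of the source document.
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  // Sort by key, breaking ties by document id (simplified collation).
  rows.sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : a.id < b.id ? -1 : a.id > b.id ? 1 : 0
  );
  // Group values by key and reduce each group to a single value.
  const grouped = new Map();
  for (const row of rows) {
    if (!grouped.has(row.key)) grouped.set(row.key, []);
    grouped.get(row.key).push(row.value);
  }
  const reduced = [...grouped].map(([key, values]) => ({ key, value: reduceFn(values) }));
  return { rows, reduced };
}

// Made-up sample documents.
const sampleDocs = [
  { _id: "a", Status: "open" },
  { _id: "b", Status: "closed" },
  { _id: "c", Status: "open" }
];

const result = runView(
  sampleDocs,
  (doc, emit) => emit(doc.Status, 1),              // map: one row per document
  (values) => values.reduce((sum, v) => sum + v, 0) // reduce: count per key
);
console.log(result.rows);
console.log(result.reduced);
```

The rows come out ordered by key, which is what makes range queries on a view cheap once the index has been built.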
Don't emit the whole doc. It's unnecessary. You can instead run your query with include_docs=true, which will let you access the document via each row's doc attribute.
When you emit the whole doc you make the index as large or larger than your entire database. :)
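A rough sketch of why this matters, again as an in-memory simulation rather than CouchDB itself. The sample documents, the indexChars helper and the database and view names in the URL are all assumptions made for illustration; serialized row length is used as a crude stand-in for index size.

```javascript
// Made-up documents with a large body field, standing in for big real documents.
const bigDocs = [
  { _id: "a", Status: "open", body: "x".repeat(1000) },
  { _id: "b", Status: "closed", body: "y".repeat(1000) }
];

// Approximate the index size as the total length of the serialized rows.
function indexChars(docs, mapFn) {
  let chars = 0;
  for (const doc of docs) {
    mapFn(doc, (key, value) => {
      chars += JSON.stringify({ id: doc._id, key, value }).length;
    });
  }
  return chars;
}

// Emitting the whole doc copies every document into the index.
const fatSize = indexChars(bigDocs, (doc, emit) => emit([doc.Status], doc));
// Emitting null keeps only the key and the document id in each row.
const leanSize = indexChars(bigDocs, (doc, emit) => emit([doc.Status], null));

// Hypothetical query path: the documents are attached at query time instead.
const viewUrl = "/mydb/_design/status/_view/by_status?include_docs=true";

console.log({ fatSize, leanSize, viewUrl });
```

With the lean version, each row still carries the document id, so include_docs=true can look the document up when the view is queried.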