Let's say I have two types of documents stored in my CouchDB database. The first has its type property set to contact and the second to phone. Contact documents have another property called name. Phone documents have the properties number and contact_id, so that they can reference a contact. This is a trivial one-to-many scenario where one contact can have N phone numbers (I know they could be embedded in a single contact document, but I need to demonstrate a one-to-many relationship across separate documents).
Raw example data with Scott having 2 phone numbers and Matt having 1 number:
{_id: "fc93f785e6bd8c44f14468828b001109", _rev: "1-fdc8d121351b0f5c6d7e288399c7a5b6", type: "phone", number: "123456", contact_id: "fc93f785e6bd8c44f14468828b00099f"}
{_id: "fc93f785e6bd8c44f14468828b000f6a", _rev: "1-b2dd90295693dc395019deec7cbf89c7", type: "phone", number: "465789", contact_id: "fc93f785e6bd8c44f14468828b00099f"}
{_id: "fc93f785e6bd8c44f14468828b00099f", _rev: "1-bd643a6b0e90c997a42d8c04c5c06af6", type: "contact", name: "Scott"}
{_id: "16309fcd03475b9a2924c61d690018e3", _rev: "1-723b7c999111b116c353a4fdab11ddc0", type: "contact", name: "Matt"}
{_id: "16309fcd03475b9a2924c61d69000aef", _rev: "3-67193f1bfa8ed21c68e3d35847e9060a", type: "phone", number: "789456", contact_id: "16309fcd03475b9a2924c61d690018e3"}
Map function:
function(doc) {
  if (doc.type == "contact") {
    emit([doc._id, 1], doc);
  } else if (doc.type == "phone") {
    emit([doc.contact_id, 0], doc);
  }
}
Reduce function:
function(keys, values) {
  var output = {};
  for (var elem in values) {
    if (values[elem].type == "contact") {
      output = {
        "ID": values[elem]._id,
        "Name": values[elem].name,
        "Type": values[elem].type,
        "Phones": []
      };
    } else if (values[elem].type == "phone") {
      output.Phones.push({
        "Number": values[elem].number,
        "Type": values[elem].type
      });
    }
  }
  return output;
}
group_level is set to 1 so that rows are grouped by the first element of the key emitted by the map function (the contact ID). Now I can get my contacts with their phones included, for example like this:
http://localhost:5984/testdb2/_design/testview/_view/tv1?group_level=1
Or search for some contact with startkey and endkey like this:
http://localhost:5984/testdb2/_design/testview/_view/tv1?group_level=1&startkey=[%22fc93f785e6bd8c44f14468828b00099f%22]&endkey=[%22fc93f785e6bd8c44f14468828b00099f%22,{}]
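The startkey and endkey values are JSON arrays that have to be URL-encoded. As a rough sketch of how such a URL could be assembled (contactQueryUrl is a hypothetical helper name, not something CouchDB provides; the view URL is the one used above):

```javascript
// Hypothetical helper: builds the grouped range query for a single contact.
// [contactId] sorts before every [contactId, x] key, and {} sorts after
// numbers in CouchDB's collation, so the range covers all rows of the contact.
function contactQueryUrl(viewUrl, contactId) {
  const startkey = JSON.stringify([contactId]);
  const endkey = JSON.stringify([contactId, {}]);
  return viewUrl + "?group_level=1" +
    "&startkey=" + encodeURIComponent(startkey) +
    "&endkey=" + encodeURIComponent(endkey);
}

const url = contactQueryUrl(
  "http://localhost:5984/testdb2/_design/testview/_view/tv1",
  "fc93f785e6bd8c44f14468828b00099f"
);
console.log(url);
```

encodeURIComponent also escapes the brackets and the comma, which is stricter than the hand-written URL above but decodes to the same keys.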
The results look exactly how I want: contacts have their phones embedded according to the one-to-many relationship. And here is the question: is this the right way to use MapReduce functions in CouchDB? Are there any notable performance issues with this approach?
Generally speaking, you use less disk space if you do not emit(..., doc), because every emitted value is stored in the view index in addition to the document itself.
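A minimal sketch of that idea, assuming the view is queried without a reduce step: emit null as the value and let CouchDB attach the documents at query time with include_docs=true. The emit stub, the rows array and the sample documents are only there so the snippet runs outside CouchDB; inside a design document only the body of map would be used.

```javascript
// Stub for local experimentation; inside CouchDB, emit() is supplied by the view server.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Same keys as the map function in the question, but with null values,
// so the document body is not copied into the view index.
function map(doc) {
  if (doc.type == "contact") {
    emit([doc._id, 1], null);
  } else if (doc.type == "phone") {
    emit([doc.contact_id, 0], null);
  }
}

// Two of the sample documents from the question (revisions omitted).
map({ _id: "fc93f785e6bd8c44f14468828b00099f", type: "contact", name: "Scott" });
map({ _id: "fc93f785e6bd8c44f14468828b001109", type: "phone", number: "123456",
      contact_id: "fc93f785e6bd8c44f14468828b00099f" });

console.log(JSON.stringify(rows));
```

Querying such a view with ?include_docs=true returns each row with a doc field holding the full document. Note that include_docs cannot be combined with a reduce, so this only fits the map-only approach.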
You may want to reconsider having a reduce function at all. It's really not necessary to get at the data you need. For example, something along the lines of the following may use less disk space and perform better if you have a huge number of records.
Also, I believe it is against the grain of CouchDB to build up more data in a reduce function than your documents contain. You're not doing that in this case but you are following a pattern that might lead you into trouble later. It's called reduce for a reason. :-)
So something like this is more the CouchDB way:
function(doc) {
  if (doc.type == "contact") {
    emit([doc._id, 0], {
      "Name": doc.name,
      "Type": doc.type
    });
  } else if (doc.type == "phone") {
    emit([doc.contact_id, 1], {
      "Number": doc.number,
      "Type": doc.type
    });
  }
}
Query it for a particular contact like so:
http://localhost:5984/testdb2/_design/testview/_view/tv1?startkey=[%22fc93f785e6bd8c44f14468828b00099f%22,0]&endkey=[%22fc93f785e6bd8c44f14468828b00099f%22,1]
Granted, you don't get results in the same JSON structure as before but I believe this performs better within CouchDB.
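If the nested structure is still wanted, the rows can be regrouped on the client. Here is a minimal sketch, assuming the rows arrive sorted by key so that each contact row (second key element 0) comes before its phone rows (second key element 1). groupContacts and the sample rows are illustrative and not part of CouchDB.

```javascript
// Hypothetical helper: folds the rows of the map-only view into one object per contact.
function groupContacts(rows) {
  const contacts = [];
  let current = null;
  for (const row of rows) {
    const contactId = row.key[0];
    const kind = row.key[1];
    if (kind === 0) {
      // Contact row: start a new group.
      current = { ID: contactId, Name: row.value.Name, Type: row.value.Type, Phones: [] };
      contacts.push(current);
    } else if (current !== null && current.ID === contactId) {
      // Phone row belonging to the contact seen just before it.
      current.Phones.push({ Number: row.value.Number, Type: row.value.Type });
    }
  }
  return contacts;
}

// Sample rows shaped like the "rows" array of a view response for the question's data.
const sampleRows = [
  { id: "16309fcd03475b9a2924c61d690018e3", key: ["16309fcd03475b9a2924c61d690018e3", 0],
    value: { Name: "Matt", Type: "contact" } },
  { id: "16309fcd03475b9a2924c61d69000aef", key: ["16309fcd03475b9a2924c61d690018e3", 1],
    value: { Number: "789456", Type: "phone" } },
  { id: "fc93f785e6bd8c44f14468828b00099f", key: ["fc93f785e6bd8c44f14468828b00099f", 0],
    value: { Name: "Scott", Type: "contact" } },
  { id: "fc93f785e6bd8c44f14468828b000f6a", key: ["fc93f785e6bd8c44f14468828b00099f", 1],
    value: { Number: "465789", Type: "phone" } },
  { id: "fc93f785e6bd8c44f14468828b001109", key: ["fc93f785e6bd8c44f14468828b00099f", 1],
    value: { Number: "123456", Type: "phone" } }
];

const grouped = groupContacts(sampleRows);
console.log(JSON.stringify(grouped, null, 2));
```

Phone rows whose contact row is missing from the result are skipped here; whether that is the right behavior depends on the application.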