Using map reduce in CouchDB to output fewer rows

Tags: database, join, map, couchdb, reduce

Let's say you have two document types, customers and orders. A customer document contains basic information such as name and address, and an order document contains all the order information for each time a customer orders something. Each stored document has a type field set to either "order" or "customer".

If I do a map function over a set of 10 customers and 30 orders it will output 40 rows. Some rows will be customers, some will be orders.

The question is, how do I write the reduce so that the order information is "stuffed" inside the rows that have the customer information? It should return 10 rows (10 customers), each containing all the relevant orders for that customer.

Basically I don't want separate records on the output, I want to combine them (orders into one customer row) and I think reduce is the way?

asked May 19 '11 21:05

Matt

1 Answer

This is called view collation and it is a very useful CouchDB technique.

Fortunately, you don't even need a reduce step. Just use map to get the customers and their orders "clumped" together.

Setup

The key is that you need a unique id for each customer, and it has to be known both from customer docs and from order docs.

Example customer:

{ "_id": "customer [email protected]"
, "type": "customer"
, "name": "Jason"
}

Example order:

{ "_id": "abcdef123456"
, "type": "order"
, "for_customer": "customer [email protected]"
}

I have conveniently used the customer ID as the document _id but the important thing is that both docs know the customer's identity.

Payoff

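The goal is a map query that, for a single customer, returns (1) first the customer info and (2) then any and all orders that customer placed. Because the map function below emits two-element array keys, an exact ?key="..." lookup on the bare customer ID will not match them; query a key range instead, such as ?startkey=["<customer id>"]&endkey=["<customer id>",{}].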

This map function would do that:

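function(doc) {
  // Second key element: a tiebreaker that sorts a customer above its orders.
  var CUSTOMER_VAL = 1;
  var ORDER_VAL    = 2;
  var key;

  // Customer docs are keyed on their own _id.
  if(doc.type === "customer") {
    key = [doc._id, CUSTOMER_VAL];
    emit(key, doc);
  }

  // Order docs are keyed on the customer they belong to.
  if(doc.type === "order") {
    key = [doc.for_customer, ORDER_VAL];
    emit(key, doc);
  }
}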

All rows will sort primarily on the customer the document is about, and the "tiebreaker" sort is either the integer 1 or 2. That makes customer docs always sort above their corresponding order docs.

["customer [email protected]", 1], ...customer doc...
["customer [email protected]", 2], ...customer's order...
["customer [email protected]", 2], ...customer's other order.
... etc...
["customer [email protected]", 1], ... different customer...
["customer [email protected]", 2], ... different customer's order
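To get back one row per customer, as the question asks, the sorted rows can be folded together on the client after the query. The following is a minimal sketch, not part of the original answer: customerRange and groupByCustomer are hypothetical helper names, the customer IDs are made up, and it assumes the rows arrive in view order with the [customer id, 1 or 2] keys emitted by the map function above.

```javascript
// Hypothetical helper: query parameters selecting one customer's rows.
// In CouchDB collation ["id"] sorts before ["id", 1], and an object ({})
// sorts after any number, so this range covers the customer and its orders.
function customerRange(customerId) {
  return "startkey=" + encodeURIComponent(JSON.stringify([customerId])) +
         "&endkey=" + encodeURIComponent(JSON.stringify([customerId, {}]));
}

// Hypothetical helper: fold collated view rows into one object per customer.
// Assumes each customer row arrives before that customer's order rows.
function groupByCustomer(rows) {
  var customers = [];
  var current = null;
  rows.forEach(function (row) {
    var customerId = row.key[0];
    var isCustomer = row.key[1] === 1;
    if (isCustomer) {
      current = { id: customerId, customer: row.value, orders: [] };
      customers.push(current);
    } else if (current && current.id === customerId) {
      current.orders.push(row.value);
    }
    // Orders whose customer doc is missing are skipped in this sketch.
  });
  return customers;
}

// Made-up rows in the shape of a view response's "rows" array.
var sampleRows = [
  { key: ["customer-a", 1], value: { type: "customer", name: "Jason" } },
  { key: ["customer-a", 2], value: { type: "order", for_customer: "customer-a" } },
  { key: ["customer-a", 2], value: { type: "order", for_customer: "customer-a" } },
  { key: ["customer-b", 1], value: { type: "customer", name: "Matt" } }
];

var grouped = groupByCustomer(sampleRows);
console.log(grouped.length);           // number of customers
console.log(grouped[0].orders.length); // orders attached to the first customer
```

Doing the fold on the client keeps the view a plain map with no reduce, which is the approach this answer recommends.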

P.S. If you follow all that: instead of 1 and 2, a better value might be null for the customer and the order timestamp for each order. In CouchDB's collation null sorts before numbers and strings, so the customer row still comes first, and its orders now follow in chronological order.
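The variant described in the P.S. might look like the sketch below. It assumes each order document carries a numeric created_at timestamp, a field name that does not appear in the original documents and is purely illustrative. The emit defined here is only a local stand-in that collects rows so the snippet runs outside CouchDB, and compareKeys is a simplified imitation of view ordering for these particular keys; inside a design document only the map function itself would be used.

```javascript
// Local stand-in for CouchDB's emit(), so the sketch runs under plain Node.
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Map function variant: null for the customer, timestamp for each order.
// "created_at" is an assumed field name, not part of the original docs.
function mapCustomerAndOrders(doc) {
  if (doc.type === "customer") {
    emit([doc._id, null], doc);
  } else if (doc.type === "order") {
    emit([doc.for_customer, doc.created_at], doc);
  }
}

// Simplified comparison covering only the keys used here: a customer ID
// string followed by null or a number. null sorts before any number,
// which matches CouchDB's collation for these types.
function compareKeys(a, b) {
  if (a[0] !== b[0]) { return a[0] < b[0] ? -1 : 1; }
  if (a[1] === null) { return b[1] === null ? 0 : -1; }
  if (b[1] === null) { return 1; }
  return a[1] - b[1];
}

// Made-up documents, deliberately out of order.
[
  { _id: "order-2", type: "order", for_customer: "customer-a", created_at: 1305839100 },
  { _id: "customer-a", type: "customer", name: "Jason" },
  { _id: "order-1", type: "order", for_customer: "customer-a", created_at: 1305752700 }
].forEach(mapCustomerAndOrders);

emitted.sort(function (x, y) { return compareKeys(x.key, y.key); });
var sortedIds = emitted.map(function (row) { return row.value._id; });
console.log(sortedIds.join(", "));
```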

answered Sep 28 '22 09:09

JasonSmith
