
What is the maximum value for a compound CouchDB key?

I'm using what seems to be a common trick for creating a join view:

// a Customer has many Orders; show them together in one view:
function(doc) {
  if (doc.Type == "customer") {
    emit([doc._id, 0], doc);
  } else if (doc.Type == "order") {
    emit([doc.customer_id, 1], doc);
  }
}

I know I can use the following query to get a single customer and all related Orders:

?startkey=["some_customer_id"]&endkey=["some_customer_id", 2]

But now I've tied my query very closely to my view code. Is there a value I can put where I put my "2" to more clearly say, "I want everything tied to this Customer"? I think I've seen

?startkey=["some_customer_id"]&endkey=["some_customer_id", {}]

But I'm not sure that {} is certain to sort after everything else.
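For reference, when either query is sent over HTTP the keys have to be serialized as JSON and then URL-escaped. A minimal sketch, assuming a hypothetical helper named customerRangeQuery (it is not part of CouchDB) and using only the standard JSON.stringify and encodeURIComponent:

```javascript
// Hypothetical helper (not part of CouchDB): builds the query string that
// selects one customer's rows from the join view above.
function customerRangeQuery(customerId, upperBound) {
  // View keys are JSON values, so serialize them before URL-escaping.
  const startkey = JSON.stringify([customerId]);
  const endkey = JSON.stringify([customerId, upperBound]);
  return "?startkey=" + encodeURIComponent(startkey) +
         "&endkey=" + encodeURIComponent(endkey);
}

// The two variants discussed above: a numeric bound and an empty object.
console.log(customerRangeQuery("some_customer_id", 2));
console.log(customerRangeQuery("some_customer_id", {}));
```

Keeping the upper bound as a parameter makes the coupling to the view code explicit: whatever is passed has to sort after every second element the view emits.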

Credit to cmlenz for the join method.

Further clarification from the CouchDB wiki page on collation:

The query startkey=["foo"]&endkey=["foo",{}] will match most array keys with "foo" in the first element, such as ["foo","bar"] and ["foo",["bar","baz"]]. However it will not match ["foo",{"an":"object"}]

So {} is late in the sort order, but definitely not last.
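The wiki's example can be reproduced with a simplified model of view collation. This is only a sketch: the type order (null, false, true, numbers, strings, arrays, objects) follows the CouchDB collation rules, but strings are compared here by UTF-16 code unit, whereas CouchDB uses ICU collation. The names typeRank, collate and inRange are made up for illustration.

```javascript
// Simplified sketch of CouchDB view collation.
// Assumption: strings are compared by code unit, not with ICU as CouchDB does.
function typeRank(v) {
  if (v === null) return 0;
  if (v === false) return 1;
  if (v === true) return 2;
  if (typeof v === "number") return 3;
  if (typeof v === "string") return 4;
  if (Array.isArray(v)) return 5;
  return 6; // plain object
}

function collate(a, b) {
  const ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra < rb ? -1 : 1;
  if (ra === 3 || ra === 4) return a < b ? -1 : a > b ? 1 : 0;
  if (ra === 5) {
    // Arrays: element by element; a shorter prefix sorts first.
    for (let i = 0; i < Math.min(a.length, b.length); i++) {
      const c = collate(a[i], b[i]);
      if (c !== 0) return c;
    }
    return Math.sign(a.length - b.length);
  }
  if (ra === 6) {
    // Objects: key/value pairs in order; fewer pairs sort first.
    const ea = Object.entries(a), eb = Object.entries(b);
    for (let i = 0; i < Math.min(ea.length, eb.length); i++) {
      const c = collate(ea[i][0], eb[i][0]) || collate(ea[i][1], eb[i][1]);
      if (c !== 0) return c;
    }
    return Math.sign(ea.length - eb.length);
  }
  return 0; // null, false and true only ever equal themselves
}

// True when startkey <= key <= endkey under the model above.
function inRange(key, startkey, endkey) {
  return collate(startkey, key) <= 0 && collate(key, endkey) <= 0;
}

console.log(inRange(["foo", "bar"], ["foo"], ["foo", {}]));
console.log(inRange(["foo", ["bar", "baz"]], ["foo"], ["foo", {}]));
console.log(inRange(["foo", { an: "object" }], ["foo"], ["foo", {}]));
```

Under this model the first two keys fall inside the range and the third does not, because a non-empty object sorts after the empty one.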

James A. Rosen asked Jul 24 '09 22:07


2 Answers

I have two thoughts.

Use timestamps

Instead of using 0 and 1 purely for their collation behavior, use the timestamp at which each record was created (assuming the records carry one), a la [doc._id, doc.created_at] for the customer and [doc.customer_id, doc.created_at] for its orders. You can then query the view with a startkey at some sufficiently early date (the epoch would probably work) and an endkey of "now", e.g. the output of date +%s. That key range should always include everything, and it has the added benefit of collating by date, which is probably what you want anyway.
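A rough sketch of the timestamp-keyed view, runnable outside CouchDB. It assumes every document has a numeric created_at field holding seconds since the epoch; the sample documents and the local emit() stand-in are made up for illustration (inside CouchDB, emit() is supplied by the view server and the range filter is done by startkey/endkey).

```javascript
// Collect emitted rows locally so the sketch runs outside CouchDB.
const rows = [];
function emit(key, value) { rows.push({ key, value }); }

// Map function keyed by [customer id, creation time].
// Assumption: created_at is a number (seconds since the epoch).
function map(doc) {
  if (doc.Type == "customer") {
    emit([doc._id, doc.created_at], doc);
  } else if (doc.Type == "order") {
    emit([doc.customer_id, doc.created_at], doc);
  }
}

// Made-up sample data.
const docs = [
  { _id: "c1", Type: "customer", created_at: 1000 },
  { _id: "o1", Type: "order", customer_id: "c1", created_at: 2000 },
  { _id: "o2", Type: "order", customer_id: "c2", created_at: 1500 },
  { _id: "o3", Type: "order", customer_id: "c1", created_at: 1500 },
];
docs.forEach((doc) => map(doc));

// Stand-in for ?startkey=["c1",0]&endkey=["c1",<now>]
const now = Math.floor(Date.now() / 1000);
const matches = rows
  .filter((r) => r.key[0] === "c1" && r.key[1] >= 0 && r.key[1] <= now)
  .sort((a, b) => a.key[1] - b.key[1]);

console.log(matches.map((r) => r.value._id));
```

Note that the customer row comes first here only because the customer was created before its orders; the key itself no longer guarantees that.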

Or just don't worry about it

You could just index by the customer_id and nothing more. This would have the nice advantage of being able to query using just key=<customer_id>. Sure, the records won't be collated when they come back, but is that an issue for your application? Unless you are expecting tons of records back, it would likely be trivial to simply pluck the customer record out of the list once you have the data retrieved by your application.

For example, in Ruby:

# split the result into the customer record(s) and the remaining orders
customers, orders = records.partition { |record| record.type == "customer" }
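The view side of this simpler approach might look like the sketch below, written so it runs outside CouchDB. The local emit() stand-in and the sample documents are assumptions for illustration; in CouchDB the filter step corresponds to querying with key=<customer_id>.

```javascript
// Collect emitted rows locally so the sketch runs outside CouchDB.
const rows = [];
function emit(key, value) { rows.push({ key, value }); }

// Map function keyed by the customer id alone.
function map(doc) {
  if (doc.Type == "customer") {
    emit(doc._id, doc);
  } else if (doc.Type == "order") {
    emit(doc.customer_id, doc);
  }
}

// Made-up sample data.
const docs = [
  { _id: "c1", Type: "customer" },
  { _id: "o1", Type: "order", customer_id: "c1" },
  { _id: "o2", Type: "order", customer_id: "c2" },
  { _id: "o3", Type: "order", customer_id: "c1" },
];
docs.forEach((doc) => map(doc));

// Stand-in for ?key="c1": every row whose key equals the customer id.
const records = rows.filter((r) => r.key === "c1").map((r) => r.value);

// Separate the customer from its orders in application code.
const customer = records.find((d) => d.Type === "customer");
const orders = records.filter((d) => d.Type === "order");

console.log(customer._id, orders.map((d) => d._id));
```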

Anyway, the timestamp approach is probably the more attractive answer for your case.

Jim Garvin answered Oct 23 '22 10:10


Rather than trying to find the greatest possible value for the second element in your array key, I would suggest instead trying to find the least possible value greater than the first:

?startkey=["some_customer_id"]&endkey=["some_customer_id\u0000"]&inclusive_end=false

user359996 answered Oct 23 '22 11:10