
Multiple key ranges as parameters to a CouchDB view

Tags:

couchdb

The underlying problem: say my documents have a category and a timestamp. If I want all documents in the "foo" category whose timestamp falls within the last two hours, it's simple:

function (doc) {
  emit([doc.category, doc.timestamp], null);
}

and then query as

GET server:5894/.../myview?startkey=["foo", |now - 2 hours|]&endkey=["foo", |now|]
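
A minimal sketch of building that request URL, assuming timestamps are stored as epoch milliseconds. The view URL and the `buildRangeQuery` name are made up for illustration. The point it shows is that each key has to be JSON-encoded first and URL-encoded second.

```javascript
// Build the view URL for one category and a trailing time window.
// Assumption: doc.timestamp is epoch milliseconds.
function buildRangeQuery(viewUrl, category, nowMs, windowMs) {
  const startkey = JSON.stringify([category, nowMs - windowMs]);
  const endkey = JSON.stringify([category, nowMs]);
  // URLSearchParams takes care of the URL-encoding of the JSON keys.
  const params = new URLSearchParams({ startkey, endkey });
  return `${viewUrl}?${params.toString()}`;
}

const TWO_HOURS_MS = 2 * 60 * 60 * 1000;
// Fixed instant so the example is reproducible; use Date.now() in practice.
const nowMs = Date.UTC(2009, 8, 23, 21, 0, 0);
// Hypothetical view URL, not taken from the question.
const url = buildRangeQuery(
  "http://localhost:5984/example_db/_design/example/_view/myview",
  "foo",
  nowMs,
  TWO_HOURS_MS
);
console.log(url);
```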

The problem comes when I want documents in category foo or bar within the last two hours. If I didn't care about time, I could just pull them directly by key through the keys collection. Unfortunately, I have no such option with ranges.

What I ended up doing in the meantime is rounding the timestamp to two-hour blocks, and then multiplexing the query out:

POST server:5894/.../myview
keys=[[foo, 0 hours], [foo, 2 hours], [bar, 0 hours], [bar, 2 hours]]

It works, but it will get messy if I want to go back a long time relative to the block size.
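
The bucketing workaround can be sketched roughly as below. It assumes the map function emits the timestamp already rounded down to its block, because a `keys` lookup only matches exact keys. The names `blockStart` and `bucketKeys` are invented for this example. The number of keys is categories times blocks, which is why it gets unwieldy for long windows.

```javascript
// Round a timestamp (epoch ms) down to the start of its block.
function blockStart(timestampMs, blockMs) {
  return Math.floor(timestampMs / blockMs) * blockMs;
}

// One [category, blockStart] key per category per block, newest block first.
function bucketKeys(categories, nowMs, blockMs, blocksBack) {
  const newest = blockStart(nowMs, blockMs);
  const keys = [];
  for (const category of categories) {
    for (let i = 0; i < blocksBack; i++) {
      keys.push([category, newest - i * blockMs]);
    }
  }
  return keys;
}

const BLOCK_MS = 2 * 60 * 60 * 1000;
// Body to POST to the view: the current block and the one before it.
const postBody = JSON.stringify({
  keys: bucketKeys(["foo", "bar"], Date.now(), BLOCK_MS, 2),
});
console.log(postBody);
```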

Is there a way to send multiple startkey/endkey pairs to a view, akin to the keys: [] array that can be posted for exact keys?

kolosy asked Sep 23 '09 21:09



3 Answers

There is a CouchDB feature request to let you do just that. I've attached a simple, no-guarantees patch against 0.10.1 to that ticket, which may work for you. It works for me and lets me do things like:

{
    "keys": [
        {
            "startkey": ["0240286524","2010","03","01"],
            "endkey": ["0240286524","2010","03","07",{}]
        },
        {
            "startkey": ["0442257276","2010","03","01"],
            "endkey": ["0442257276","2010","03","07",{}]
        }
    ]
}

in the POST body, which lets me get all the data across multiple tracking ids, for a range of dates. I call with group=true&group_level=1 to have the results grouped by tracking id. Deeper group levels would allow me to group by tracking id|year, tracking id|year|month etc.
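
As a rough sketch, a body of that shape could be generated like this. The `buildPatchedRangeBody` name is made up, and the format is the one accepted by the patch on the ticket, not by stock CouchDB. In CouchDB's view collation objects sort after strings, numbers and arrays, so the trailing {} in each endkey acts as an upper bound that still includes every key under the last day.

```javascript
// Build one {startkey, endkey} pair per tracking id.
// fromParts / toParts are [year, month, day] strings, as in the example above.
function buildPatchedRangeBody(trackingIds, fromParts, toParts) {
  return {
    keys: trackingIds.map((id) => ({
      startkey: [id, ...fromParts],
      // {} collates after any string, so keys with deeper levels
      // under the end day are still inside the range.
      endkey: [id, ...toParts, {}],
    })),
  };
}

const patchedBody = buildPatchedRangeBody(
  ["0240286524", "0442257276"],
  ["2010", "03", "01"],
  ["2010", "03", "07"]
);
console.log(JSON.stringify(patchedBody, null, 4));
```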

Multiple connections were an unscalable overhead for me as I'd be looking to make 2000 concurrently :) (No, a new view is not an option - we're already at 400GB for data plus one view!)

The issue and patch are at https://issues.apache.org/jira/browse/COUCHDB-523 .

majelbstoat answered Oct 27 '22 11:10



You're probably better off just doing two queries. CouchDB can handle multiple simultaneous queries pretty well, so spin off several processes or threads and query for foo and bar docs separately.

CouchDB does not currently support multiple range queries. ORing and ANDing keys is pretty much not doable in one query.
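
One way to sketch the "several queries at once" approach is below. It assumes a runtime with a global `fetch` (browsers, Node 18+); `queryCategories` and `makeQueryView` are hypothetical helper names. It issues one range query per category concurrently and merges the rows on the client.

```javascript
// Run one range query per category concurrently and merge the rows.
// `queryView` takes {startkey, endkey} and resolves to a parsed view
// response of the form { rows: [...] }.
async function queryCategories(queryView, categories, fromMs, toMs) {
  const responses = await Promise.all(
    categories.map((category) =>
      queryView({ startkey: [category, fromMs], endkey: [category, toMs] })
    )
  );
  // Each response holds the rows for one category; flatten into one list.
  return responses.flatMap((response) => response.rows);
}

// A possible queryView implementation on top of fetch.
function makeQueryView(viewUrl) {
  return async ({ startkey, endkey }) => {
    const params = new URLSearchParams({
      startkey: JSON.stringify(startkey),
      endkey: JSON.stringify(endkey),
    });
    const response = await fetch(`${viewUrl}?${params}`);
    if (!response.ok) {
      throw new Error(`View query failed with status ${response.status}`);
    }
    return response.json();
  };
}
```

Passing the query function in as a parameter keeps the merging logic independent of the HTTP client, so it can be exercised with a stub.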

Jeremy Wall answered Oct 27 '22 10:10



This has been added in newer versions of CouchDB. To query multiple start/end key ranges at once, send a POST request to your view with a body that looks something like this:

{
  "queries": [
    { "startkey": 10, "endkey": 11 },
    { "startkey": 16, "endkey": 18 }
  ]
}
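
Applied to the original question, the body could be built along these lines. This is a sketch with made-up function names, assuming the [category, timestamp] keys from the question's map function and epoch-millisecond timestamps. As far as the CouchDB documentation describes it, the response carries a `results` array with one entry per query, each with its own `rows`. Later releases also document a dedicated `queries` endpoint under the view, so check the docs for your version.

```javascript
// One {startkey, endkey} range per category, all sent in a single POST body.
function buildQueriesBody(categories, fromMs, toMs) {
  return {
    queries: categories.map((category) => ({
      startkey: [category, fromMs],
      endkey: [category, toMs],
    })),
  };
}

// Merge the per-query results back into a single list of rows.
function flattenResults(response) {
  return response.results.flatMap((result) => result.rows);
}

const TWO_HOURS = 2 * 60 * 60 * 1000;
const rangeEnd = Date.now();
const queriesBody = buildQueriesBody(["foo", "bar"], rangeEnd - TWO_HOURS, rangeEnd);
console.log(JSON.stringify(queriesBody, null, 2));
```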

I know it's an old question but I initially found it when I was looking for exactly this!

Lorna Mitchell answered Oct 27 '22 10:10
