I would like to use CouchDB to store some data and then use RESTful API calls to get the data that I need. My database is called "test", and my documents all have a similar structure and look something like this (where hello_world is the document ID):
"hello_world" : {"id":123, "tags":["hello", "world"], "text":"Hello World"}
"foo_bar" :{"id":124, "tags":["foo", "bar"], "text":"Foo Bar"}
What I'd like to be able to do is have my users send a query such as "Give me all the documents that contain the words 'hello world'". I've been playing around with views, but it looks like they will only allow me to move one or more of those values into the "key" portion of the map function. That gives me the ability to do something like this:
http://localhost:5984/test/_design/search/_view/search_view?key="hello"
But this doesn't let my users specify their own query string. For example, what if they searched for "hello world"? I'd have to do two queries, one for "hello" and one for "world", and then write a bunch of JavaScript to combine the results, remove duplicates, etc. (YUCK!). What I really want is to be able to do something like this:
http://localhost:5984/test/_design/search/_view/search_view?term="hello world"
Then use the parameter "hello world" in the view's map/reduce functions to find all the documents that contain both "hello" and "world" in the tags array. Is this sort of thing even possible with CouchDB? Is there another way to accomplish this inside a view that I'm not thinking of?
CouchDB views do not support faceted search, full-text search, or result intersection. The couchdb-lucene plugin lets you do all of these things.
http://github.com/rnewson/couchdb-lucene/tree/master
Technically this is possible if, for each document, you emit every subset in the power set of the document's tags as a key. The tags within each key must be sorted, and your query would have to supply its tags in the same sorted order.
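function(doc) {
  // Returns every subset of the given array (implementation elided).
  function powerset(array) { ... }
  // Sort a copy of the tags so each emitted key has a predictable order.
  var powerset_of_tags = powerset(doc.tags.slice().sort());
  for (var i = 0; i < powerset_of_tags.length; i++) {
    emit(powerset_of_tags[i], doc);
  }
}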
For the document "hello_world" : {"id":123, "tags":["hello", "world"], "text":"Hello World"}
this would emit:
{ key: [], doc: ... }
{ key: ['hello'], doc: ... }
{ key: ['world'], doc: ... }
{ key: ['hello', 'world'], doc: ... }
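To make the idea concrete, here is a minimal sketch of the elided powerset helper together with the matching query side. It is only one way to do it, not necessarily what the answer had in mind. The names mapTags and buildViewUrl are made up for illustration, emit is passed in as a parameter so the sketch runs outside CouchDB, and the value emitted is null rather than the whole doc (an assumption made to keep the index smaller). The URL reuses the database, design document and view names from the question.

```javascript
// Return every subset of `items`, keeping the relative order of `items`.
function powerset(items) {
  var subsets = [[]];
  for (var i = 0; i < items.length; i++) {
    var count = subsets.length;
    // Extend each subset found so far with the current item.
    for (var j = 0; j < count; j++) {
      subsets.push(subsets[j].concat([items[i]]));
    }
  }
  return subsets;
}

// Hypothetical stand-in for the view's map function. Inside CouchDB,
// `emit` is a global; here it is a parameter so the sketch is testable.
function mapTags(doc, emit) {
  // Sort a copy of the tags so every key is in a predictable order.
  var sortedTags = (doc.tags || []).slice().sort();
  var subsets = powerset(sortedTags);
  for (var i = 0; i < subsets.length; i++) {
    emit(subsets[i], null);
  }
}

// Hypothetical client-side helper: split the user's search string into
// words, sort them the same way the map function does, and pass the
// resulting array as a JSON-encoded `key`.
function buildViewUrl(terms) {
  var key = terms.split(/\s+/).filter(Boolean).sort();
  return "http://localhost:5984/test/_design/search/_view/search_view?key=" +
    encodeURIComponent(JSON.stringify(key));
}

// Usage: collect what the map function would emit for one document.
var rows = [];
mapTags({ tags: ["world", "hello"] }, function (key, value) {
  rows.push(key);
});
console.log(JSON.stringify(rows));
// prints [[],["hello"],["world"],["hello","world"]]
console.log(buildViewUrl("world hello"));
```

Because both sides sort with the same default Array sort, a search for "world hello" and a search for "hello world" produce the same key.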
Although this is possible, I would consider it a rather awkward solution. I don't want to imagine the disk usage of the view for a larger number of tags. I expect the number of emitted keys to grow like 2^n, where n is the number of tags on a document.
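To get a feel for that growth, the following small sketch counts how many keys a single document would emit for a given number of distinct tags. The function name emittedKeyCount is made up for illustration; the count follows from the fact that a set of n elements has 2^n subsets.

```javascript
// Number of keys one document emits when every subset of its
// n distinct tags becomes a key (the empty subset included).
function emittedKeyCount(tagCount) {
  return Math.pow(2, tagCount);
}

[2, 10, 20].forEach(function (n) {
  console.log(n + " tags -> " + emittedKeyCount(n) + " keys");
});
// prints:
// 2 tags -> 4 keys
// 10 tags -> 1024 keys
// 20 tags -> 1048576 keys
```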