
MongoDB + Node.js + AJAX solution for doing autocomplete search

I'm looking to implement a typeahead/autocomplete search for fun. I have a few attributes in my MongoDB schema, but I want to be able to search only by category, title, preview, or date.

This is my MongoDB schema for a single article (I'm using Mongoose as the ODM):

{
    title: { type: String, required: true}
    , preview: { type: String, required: true}
    , body: { type: String, required: true}
    , category: {type: String}
    , created_at: { type: Date, default: Date.now }
}

Each time I create, update, or destroy an article, I have to re-index so the search stays up to date. The search will autocomplete: for example, if I have two articles titled "Welcome to stackoverflow" and "How to avoid stackoverflow" and the user types 't', I'd display both articles via AJAX, since both have the character 't' in their titles. I'd also like to highlight every single 't' (the 't' in 'to' and the 't' in s't'ackoverflow) to show where the query matched. (I expect it to look similar to searching for particular 'tags' here at stackoverflow.com.)
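
As a rough sketch of the highlighting described above: the function names and the choice of `<strong>` as the wrapper tag are assumptions for illustration, not part of any library. It wraps every case-insensitive occurrence of the typed text and escapes the rest so titles can be inserted into a page safely.

```javascript
// Escape characters that have a special meaning in HTML.
function escapeHtml(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every case-insensitive occurrence of `query` in <strong> tags.
function highlight(text, query) {
    if (!query) {
        return escapeHtml(text);
    }
    var pattern = new RegExp('(' + escapeRegExp(query) + ')', 'gi');
    // split() with a capturing group keeps the matches at the odd indexes.
    return text.split(pattern).map(function (part, i) {
        var safe = escapeHtml(part);
        return i % 2 === 1 ? '<strong>' + safe + '</strong>' : safe;
    }).join('');
}

console.log(highlight('Welcome to stackoverflow', 't'));
// prints: Welcome <strong>t</strong>o s<strong>t</strong>ackoverflow
```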

The question now is: should I use a different schema for indexing, or just stick with my existing schema? It seems that I won't be using the 'body' attribute, which contains the full article and has thousands of words in it, since I'm not looking to do full-text search right now.

  • Titles probably have only ~45 characters and 3 or 4 words on average.
  • Categories are mostly a single word of 9-15 characters on average.
  • Previews would be the largest field, with ~150 characters and 20 words on average.

I'd probably like to implement this using a trie data structure. Off the top of my head, one way of doing this is to make an AJAX request on every keystroke, routed to a node.js handler, which then queries MongoDB for every entry containing a word with a letter that matches what the user typed and returns the results as JSON. I would then parse that JSON and display each entry.

The question then is: how would I fit the trie algorithm into my plan? The other thing is that I need to rebuild the index each time I do a CRUD operation.

I would appreciate any suggestions or pointers in the right direction, or any articles that would help me do this. (I'm looking for the best-practice, most performant way.) Thanks. Let me know if the question needs to be clarified.

Benny Tjia asked Jan 28 '12 00:01



1 Answer

I don't think a trie will work. Tries usually operate from the beginning of a string, so if you used a trie to index your headlines, a user typing 't' would only be able to find headlines that start with 't'. I think the best bet with MongoDB, unless you've got huge amounts of text, is simply to use regular expressions in conjunction with the $or operator.
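
To make the prefix limitation concrete, here is a minimal trie sketch. The `Trie` constructor and its `insert` and `startingWith` methods are hypothetical names made up for this illustration, and storing the full title list on every node is a simplification rather than how you'd build a production index.

```javascript
// A minimal trie keyed on the lowercased characters of each title.
function Trie() {
    this.root = { children: {}, titles: [] };
}

// Walk down one node per character, recording the title at every step.
Trie.prototype.insert = function (title) {
    var node = this.root;
    var key = title.toLowerCase();
    for (var i = 0; i < key.length; i++) {
        var ch = key.charAt(i);
        if (!node.children[ch]) {
            node.children[ch] = { children: {}, titles: [] };
        }
        node = node.children[ch];
        node.titles.push(title);
    }
};

// Return the titles that begin with `prefix` (an empty list if none do).
Trie.prototype.startingWith = function (prefix) {
    var node = this.root;
    var key = prefix.toLowerCase();
    for (var i = 0; i < key.length; i++) {
        node = node.children[key.charAt(i)];
        if (!node) {
            return [];
        }
    }
    return node.titles;
};

var trie = new Trie();
trie.insert('Welcome to stackoverflow');
trie.insert('How to avoid stackoverflow');

console.log(trie.startingWith('t')); // [] because neither title starts with 't'
console.log(trie.startingWith('w')); // [ 'Welcome to stackoverflow' ]
```

Indexing each word separately would let 't' find "to", but it still would not match the 't' inside "stackoverflow", which is what the question asks for.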

On the change event of the text input box, you'll want to make an AJAX request, as you said, to your node server, which will issue the query to MongoDB and return the results as a JSON array.
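
A minimal sketch of the server side of that round trip, using only Node's built-in `http` module. The `/search?q=...` URL shape, the function names, and the port are assumptions, and `findArticles` is a hypothetical stand-in that filters an in-memory list where the real code would query MongoDB.

```javascript
var http = require('http');

// Hypothetical stand-in for the database call: replace the in-memory
// filter with a real MongoDB query in an actual application.
function findArticles(term, callback) {
    var articles = [
        { title: 'Welcome to stackoverflow' },
        { title: 'How to avoid stackoverflow' }
    ];
    var needle = term.toLowerCase();
    callback(null, articles.filter(function (article) {
        return article.title.toLowerCase().indexOf(needle) !== -1;
    }));
}

// Write a JSON response with the given status code.
function sendJson(res, status, payload) {
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(payload));
}

// Handle a request such as GET /search?q=t and reply with a JSON array.
function handleSearch(req, res) {
    var term = new URL(req.url, 'http://localhost').searchParams.get('q') || '';
    if (term === '') {
        sendJson(res, 200, []);
        return;
    }
    findArticles(term, function (err, articles) {
        if (err) {
            sendJson(res, 500, { error: 'search failed' });
            return;
        }
        sendJson(res, 200, articles);
    });
}

// Not called here; invoke startServer(3000) to actually listen.
function startServer(port) {
    return http.createServer(handleSearch).listen(port);
}
```

Returning an empty array for an empty query keeps the client code simple: it can always render whatever array comes back.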

Regular expressions in mongo: http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-RegularExpressions

$or operator: http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24or

A demo of how jQuery UI handles auto complete (for reference on the AJAX request and filling in values): http://jqueryui.com/demos/autocomplete/
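
Putting the regular expression and $or pieces together, the query filter could be built roughly like this. The helper names are made up for the sketch. `created_at` is left out because a regular expression only matches string fields, so searching by date would need a separate range condition.

```javascript
// String fields from the question's schema; `body` is deliberately left out.
var SEARCHABLE_FIELDS = ['title', 'preview', 'category'];

// Escape regex metacharacters so user input is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a filter that matches documents where ANY searchable field
// contains the typed text, ignoring case.
function buildSearchFilter(input) {
    var pattern = new RegExp(escapeRegExp(input), 'i');
    return {
        $or: SEARCHABLE_FIELDS.map(function (field) {
            var clause = {};
            clause[field] = pattern;
            return clause;
        })
    };
}

var filter = buildSearchFilter('t');
console.log(filter.$or.length); // 3, one clause per searchable field
```

With Mongoose, the result could be passed to something like `Article.find(filter).limit(10)`, where `Article` is an assumed model name. Keep in mind that an unanchored, case-insensitive regular expression generally can't make good use of an index, which is why this approach suits small collections best.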

btoconnor answered Nov 15 '22 17:11
