JavaScript library for search engine style searching? [closed]

Is there a JavaScript library that can determine if a string matches a search query? It should be efficient and provide advanced query functionality like that of Google or LexisNexis (things like and/or operators, synonyms, and parentheses). Any kind of advanced search features would be great; it doesn't have to be an exact match to any particular search engine.

Motivation: I have an HTML page with a search box followed by a bunch of paragraphs (which have unique ids and are generated from a JavaScript array). When the user types a search query in the box and presses enter, all paragraphs should be hidden (i.e. their display set to none) if they don't match the query.

My current strategy (using jQuery):

  1. Separate the query string into an array of keywords by splitting it over whitespace.
  2. Hide all paragraphs with $('p').hide().
  3. For each keyword, show a paragraph containing it with $('p:contains("'+keyword+'")').show().

This is an extremely limited search feature: it is case-sensitive, treats all keywords as optional, and doesn't provide operators such as and, or, or parentheses. It's also inefficient because it scans every paragraph once per keyword, even if the paragraph has already been matched.
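For reference, the strategy described above can be restated without jQuery as a plain function over an array of paragraph strings. This is only a sketch: the name visibleParagraphs is hypothetical, and String.prototype.includes stands in for the case-sensitive substring match that :contains performs.

```javascript
// Returns the indexes of the paragraphs that would remain visible.
// Hypothetical helper, not part of the page's actual code.
function visibleParagraphs(paragraphs, query) {
  // Step 1: split the query into keywords on whitespace.
  const keywords = query.split(/\s+/).filter(Boolean);
  // Step 2: everything starts hidden.
  const visible = new Set();
  // Step 3: one full pass over all paragraphs per keyword.
  for (const keyword of keywords) {
    paragraphs.forEach((text, index) => {
      if (text.includes(keyword)) visible.add(index); // case-sensitive
    });
  }
  return [...visible].sort((a, b) => a - b);
}
```

Called with ["Red apple", "green pear", "blue sky"] and the query "apple pear", it keeps paragraphs 0 and 1 (either keyword is enough), while the query "red" keeps nothing because of the case mismatch.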

Jordan asked Aug 06 '12 16:08


2 Answers

Here are some libraries that I am evaluating for projects (in July 2013). Any of these should be able to provide the core of the search feature.

  • http://lunrjs.com/
    • stemming, scoring built in
    • 13.8 kb minified
    • updated recently (https://github.com/olivernn/lunr.js/commits/master)
    • 10 contributors
    • no external dependencies
  • http://fusejs.io (formerly at http://kiro.me/projects/fuse.html)
    • fuzzy search
    • 1.58 kb minified
    • updated recently (https://github.com/krisk/Fuse/commits/master)
    • 1 contributor
    • no external dependencies
  • http://reyesr.github.io/fullproof/
    • uses html5 storage with graceful degradation
    • 459 kb minified
    • last updated 2013 (https://github.com/reyesr/fullproof/commits/master)
    • 2 contributors
    • no external dependencies
  • http://eikes.github.io/facetedsearch/
    • pagination, templating built in
    • 5.70 kb minified
    • last updated 2014 (https://github.com/eikes/facetedsearch/commits/master)
    • 1 contributor
    • depends on jquery and underscore

If you feel like building your own, here are implementations of 2 common stemming algorithms to get you started:

  • https://github.com/fortnightlabs/snowball-js
  • http://tartarus.org/martin/PorterStemmer/

As for handling boolean logic search operators, maybe this question about js query parsers will be useful.
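If you end up writing the boolean part yourself, a small recursive-descent parser may be enough. The sketch below is an assumption on my part, not taken from any of the libraries above: the grammar (or binds loosest, adjacent words count as and, parentheses group) and the names tokenize and parseQuery are made up for illustration, and it does plain case-insensitive substring matching with no stemming or synonyms.

```javascript
// Parentheses are their own tokens; everything else splits on whitespace.
function tokenize(query) {
  return query.toLowerCase().match(/[()]|[^\s()]+/g) || [];
}

// Compiles a query string into a function (text) => boolean.
// Throws SyntaxError on an empty or malformed query.
function parseQuery(query) {
  const tokens = tokenize(query);
  let pos = 0;

  // orExpr := andExpr ("or" andExpr)*
  function parseOr() {
    let left = parseAnd();
    while (tokens[pos] === "or") {
      pos++;
      const l = left;
      const right = parseAnd();
      left = (text) => l(text) || right(text);
    }
    return left;
  }

  // andExpr := atom (["and"] atom)*   (a bare space means "and")
  function parseAnd() {
    let left = parseAtom();
    while (pos < tokens.length && tokens[pos] !== "or" && tokens[pos] !== ")") {
      if (tokens[pos] === "and") pos++;
      const l = left;
      const right = parseAtom();
      left = (text) => l(text) && right(text);
    }
    return left;
  }

  // atom := WORD | "(" orExpr ")"
  function parseAtom() {
    const tok = tokens[pos++];
    if (tok === undefined) throw new SyntaxError("unexpected end of query");
    if (tok === "(") {
      const inner = parseOr();
      if (tokens[pos++] !== ")") throw new SyntaxError("missing )");
      return inner;
    }
    if (tok === ")" || tok === "and" || tok === "or") {
      throw new SyntaxError("unexpected " + tok);
    }
    return (text) => text.includes(tok);
  }

  const matcher = parseOr();
  if (pos < tokens.length) throw new SyntaxError("unexpected " + tokens[pos]);
  // Lowercase the text once so matching is case-insensitive.
  return (text) => matcher(text.toLowerCase());
}
```

Because the query is compiled once into a single predicate, each paragraph is tested once per search rather than once per keyword. For example, parseQuery("(cat or dog) and food") returns a function that accepts "Dog food is cheap" and rejects "Cat toys".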

turtlemonvh answered Oct 05 '22 03:10


The best (easy and good) way is to use a Vector Search Algorithm.

First, take all the words in each paragraph and store them in a vector object (how to build one is shown below). Then compare each paragraph's vector with the query's vector.

Before counting, run each word through the Porter stemmer so that variants such as kid and kids are grouped together.

    var Vector = function (phar) {
        var self = this;
        self.VectoData = {};

        self.InitVector = function () {
            var wordArray = self.spltwords(phar);
            self.VectorSize = wordArray.length;
            var stemdWordArray = self.runPotterStemmer(wordArray);
            self.VectoData = self.GroupAndCountWords(stemdWordArray);
        };

        self.runPotterStemmer = function (arr) {
            // run the Porter stemmer on each word, as seen in the link;
            // placeholder: returns the words unchanged
            return arr;
        };

        self.spltwords = function (text) {
            // split the text into lowercase words
            return text.toLowerCase().split(/\W+/).filter(Boolean);
        };

        self.GroupAndCountWords = function (arr) {
            // map each word to the number of times it occurs
            var counts = Object.create(null);
            for (var i = 0; i < arr.length; i++) {
                counts[arr[i]] = (counts[arr[i]] || 0) + 1;
            }
            return counts;
        };

        self.compare = function (queryVector) {
            // compare queryVector to the current vector and return a similarity number:
            // occurrences of the query's words in this paragraph, divided by the paragraph length
            var hits = 0;
            for (var word in queryVector.VectoData) {
                hits += self.VectoData[word] || 0;
            }
            return self.VectorSize ? hits / self.VectorSize : 0;
        };

        self.InitVector();
        return self;
    };
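The comparison step can also be done with cosine similarity, which is a common measure in vector-space search. To be clear, that is a swapped-in measure and not the ratio described in the comment above. The sketch below is self-contained, and termCounts, cosineSimilarity and rankParagraphs are hypothetical names. It skips stemming and splits on non-word characters, so it only behaves sensibly for plain ASCII text.

```javascript
// Count how often each word occurs. A fuller version would stem each word first.
function termCounts(text) {
  const counts = new Map();
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) || 0);
  const norm = (v) =>
    Math.sqrt([...v.values()].reduce((sum, c) => sum + c * c, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Score every paragraph against the query, drop non-matches, best match first.
function rankParagraphs(paragraphs, query) {
  const q = termCounts(query);
  return paragraphs
    .map((text, index) => ({ index, score: cosineSimilarity(termCounts(text), q) }))
    .filter((r) => r.score > 0)
    .sort((x, y) => y.score - x.score);
}
```

For the page in the question, the indexes returned by rankParagraphs would be the paragraphs to keep visible, and everything else would be hidden. Unlike the length-divided ratio, cosine similarity also normalises by the query length, so it always stays between 0 and 1.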
yamsalm answered Oct 05 '22 04:10