
Optimize search through a large JS string array?

If I have a large JavaScript string array with over 10,000 elements, how do I search through it quickly?

Right now I have a JavaScript string array that stores job descriptions, and I'm allowing the user to dynamically filter the returned list as they type into an input box.

So say I have a string array like so:
var descArr = ["flipping burgers", "pumping gas", "delivering mail"];

and the user wants to search for: "p"

How would I be able to search a string array that has 10,000+ descriptions in it quickly? Obviously I can't sort the description array since they're descriptions, so binary search is out. And since the user can search by "p" or "pi" or any combination of letters, this partial search means that I can't use associative arrays (i.e. searchDescArray["pumping gas"]) to speed up the search.

Any ideas anyone?

asked Oct 20 '10 by TriFu

2 Answers

As regular expression engines in current browsers are going nuts in terms of speed, how about doing it that way? Instead of an array, pass one gigantic string and separate the entries with an identifier. Example:

  • String "flipping burgers""pumping gas""delivering mail"
  • Regex: "([^"]*ping[^"]*)"

With the switch /g for global you get all the matches. Make sure the user does not search for your string separator.

You can even add an id into the string with something like:

  • String "11 flipping burgers""12 pumping gas""13 delivering mail"
  • Regex: "(\d+) ([^"]*ping[^"]*)"

  • Example: http://jsfiddle.net/RnabN/4/ (30000 strings, limit results to 100)
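A minimal sketch of the giant-string approach described above. The names `haystack`, `escapeRegExp`, and `search` are illustrative, not from the answer; escaping the user's input is added so metacharacters they type can't break the regex:

```javascript
// Join every description in quotes so a quote acts as the separator.
var jobs = ["flipping burgers", "pumping gas", "delivering mail"];
var haystack = '"' + jobs.join('""') + '"';

// Escape regex metacharacters in the user's input first.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function search(term) {
    var reg = new RegExp('"([^"]*' + escapeRegExp(term) + '[^"]*)"', 'g');
    var matches = [];
    var m;
    while ((m = reg.exec(haystack)) !== null) {
        matches.push(m[1]); // the capture group holds the description
    }
    return matches;
}

search("ping"); // → ["flipping burgers", "pumping gas"]
```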

answered Sep 29 '22 by sod

There's no way to speed up an initial array lookup without making some changes. You can speed up consecutive lookups by caching results and mapping them to patterns dynamically.

1.) Adjust your data format. This makes initial lookups somewhat speedier. Basically, you precache.

var data = {
    a : ['Ant farm', 'Ant massage parlor'],
    b : ['Bat farm', 'Bat massage parlor']
    // etc
}
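A hypothetical helper (not part of the answer) for building that data format from a flat description array, bucketing entries by their lowercased first letter:

```javascript
// Bucket a flat array of strings by first letter, lowercased.
function bucketByFirstLetter(arr) {
    var data = {};
    for (var i = 0; i < arr.length; i++) {
        var key = arr[i].charAt(0).toLowerCase();
        (data[key] = data[key] || []).push(arr[i]);
    }
    return data;
}

var data = bucketByFirstLetter(['Ant farm', 'Bat farm', 'Ant massage parlor']);
// data.a → ['Ant farm', 'Ant massage parlor'], data.b → ['Bat farm']
```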

2.) Setup cache mechanics.

var searchFor = function(str, list, caseSensitive, reduce){
    str = str.replace(/(?:^\s*|\s*$)/g, ''); // trim whitespace
    var found = [];
    // no 'g' flag: a global regex keeps state in lastIndex between
    // test() calls and would silently skip matches
    var reg = new RegExp('^\\s?' + str, caseSensitive ? '' : 'i');
    var i = list.length;
    while(i--){
        if(reg.test(list[i])){
            found.push(list[i]);
            // when reducing, remove only the entries that matched
            if(reduce) list.splice(i, 1);
        }
    }
    return found;
}

var lookUp = function(str, caseSensitive){
    str = str.replace(/(?:^\s*|\s*$)/g, ''); // trim whitespace
    if(data[str]) return data[str]; // cache hit
    var firstChar = caseSensitive ? str[0] : str[0].toLowerCase();
    var list = data[firstChar];
    if(!list) return (data[str] = []);
    // we cache on data since it's already a caching object.
    return (data[str] = searchFor(str, list, caseSensitive));
}
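A self-contained sketch of the two pieces working together. The bucket contents are sample data and the function bodies are simplified for illustration; they are not verbatim from the answer:

```javascript
// Sample precached buckets, keyed by lowercased first letter.
var data = {
    p: ['pumping gas', 'pie tasting'],
    f: ['flipping burgers']
};

// Find all entries in a bucket that start with str.
function searchFor(str, list, caseSensitive) {
    var found = [];
    var reg = new RegExp('^\\s?' + str, caseSensitive ? '' : 'i');
    for (var i = 0; i < list.length; i++) {
        if (reg.test(list[i])) found.push(list[i]);
    }
    return found;
}

// Return cached results when available, otherwise scan the bucket
// for the first letter and cache the result under the full pattern.
function lookUp(str, caseSensitive) {
    str = str.replace(/(?:^\s*|\s*$)/g, '');
    if (data[str]) return data[str];           // cache hit
    var list = data[str.charAt(0).toLowerCase()];
    if (!list) return (data[str] = []);
    return (data[str] = searchFor(str, list, caseSensitive));
}

lookUp('pu'); // first call scans the 'p' bucket...
lookUp('pu'); // ...second call is a cache hit
```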

3.) Use the following script to create a precache object. I suggest you run this once and use JSON.stringify to create a static cache object. (or do this on the backend)

// we need the searchFor function from above, this might take a while
var preCache = function(arr){
    var chars = "abcdefghijklmnopqrstuvwxyz".split('');
    var cache = {};
    var i = chars.length;
    while(i--){
        // reduce is true, so we're destroying the original list here.
        cache[chars[i]] = searchFor(chars[i], arr, false, true);
    }
    return cache;
}
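The serialize-once step suggested above can be sketched like this; the object literal stands in for whatever preCache would actually produce:

```javascript
// Sample output shaped like a preCache result.
var cache = {
    a: ['Ant farm', 'Ant massage parlor'],
    b: ['Bat farm', 'Bat massage parlor']
};

// Run once (or on the backend), then store the string...
var serialized = JSON.stringify(cache);

// ...and load it later as a ready-made static cache object.
var restored = JSON.parse(serialized);
```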

Probably a bit more code than you expected, but optimisation and performance don't come for free.

answered Sep 29 '22 by BGerrissen