I have a hidden field on my page that stores space separated list of emails. I can have maximum 500 emails in that field.
What will be the fastest way to search if a given email already exists in that list? I need to search multiple emails in a loop
use RegEx to find a match
use indexOf()
If this is an exact duplicate, please let me know the other question. Thanks
EDIT: Thanks everyone for your valuable comments and answers. Basically my user has a list of emails(0-500) in db. User is presented with his own contact list. User can then choose one\more emails from his contact list to add to the list. I want to ensure at client side that he is not adding duplicate emails. Whole operation is driven by ajax, so jsvascript is required.
A regular expression (also called regex for short) is a fast way to work with strings of text. By formulating a regular expression with a special syntax, you can: search for text in a string.
Using test() The test() method is a RegExp expression method. It searches a string for a pattern, and returns true or false, depending on the result.
JavaScript Code: function search_word(text, word){ var x = 0, y=0; for (i=0;i< text. length;i++) { if(text[i] == word[0]) { for(j=i;j< i+word. length;j++) { if(text[j]==word[j-i]) { y++; } if (y==word. length){ x++; } } y=0; } } return "'"+word+"' was found "+x+" times.
JavaScript String indexOf() Method.
The answer is: It depends.
...but provided you're looking for a lot fewer than you have in total, it's probably String#indexOf
unless you can create the dictionary once and reuse it (not just this one loop of looking for X entries, but every loop looking for X entries, which I tend to doubt is your use-case), in which case that's hands-down faster to build the 500-key dictionary and use that.
I put together a test case on jsperf comparing the results of looking for five strings buried in a string containing 500 space-delimited, unique entries. Note that that jsperf page compares some apples and oranges (cases where we can ignore setup and what kind of setup we're ignoring), but jsperf was being a pain about splitting it and I decided to leave that as an exercise for the reader.
In my tests of what I actually think you're doing, Chrome, Firefox, IE6, IE7 and IE9 did String#indexOf
fastest. Opera did RegExp alternation
fastest. (Note that IE6 and IE7 don't have Array#indexOf
; the others do.) If you can ignore dictionary setup time, then using a dictionary is the hands-down winner.
Here's the prep code:
// ==== Main Setup
var toFind = ["aaaaa100@zzzzz", "aaaaa200@zzzzz", "aaaaa300@zzzzz", "aaaaa400@zzzzz", "aaaaa500@zzzzz"];
var theString = (function() {
var m, n;
m = [];
for (n = 1; n <= 500; ++n) {
m.push("aaaaa" + n + "@zzzzz");
}
return m.join(" ");
})();
// ==== String#indexOf (and RegExp) setup for when we can ignore setup
var preppedString = " " + theString + " ";
// ==== RegExp setup for test case ignoring RegExp setup time
var theRegExp = new RegExp(" (?:" + toFind.join("|") + ") ", "g");
// ==== Dictionary setup for test case ignoring Dictionary setup time
var theDictionary = (function() {
var dict = {};
var index;
var values = theString.split(" ");
for (index = 0; index < values.length; ++index) {
dict[values[index]] = true;
}
return dict;
})();
// ==== Array setup time for test cases where we ignore array setup time
var theArray = theString.split(" ");
The String#indexOf
test:
var index;
for (index = 0; index < toFind.length; ++index) {
if (theString.indexOf(toFind[index]) < 0) {
throw "Error";
}
}
The String#indexOf
(ignore setup) test, in which we ignore the (small) overhead of putting spaces at either end of the big string:
var index;
for (index = 0; index < toFind.length; ++index) {
if (preppedString.indexOf(toFind[index]) < 0) {
throw "Error";
}
}
The RegExp
alternation test:
// Note: In real life, you'd have to escape the values from toFind
// to make sure they didn't have special regexp chars in them
var regexp = new RegExp(" (?:" + toFind.join("|") + ") ", "g");
var match, counter = 0;
var str = " " + theString + " ";
for (match = regexp.exec(str); match; match = regexp.exec(str)) {
++counter;
}
if (counter != 5) {
throw "Error";
}
The RegExp
alternation (ignore setup) test, where we ignore the time it takes to set up the RegExp object and putting spaces at either end of the big string (I don't think this applies to your situation, the addresses you're looking for would be static):
var match, counter = 0;
for (match = theRegExp.exec(preppedString); match; match = theRegExp.exec(preppedString)) {
++counter;
}
if (counter != 5) {
throw "Error";
}
The Dictionary test:
var dict = {};
var index;
var values = theString.split(" ");
for (index = 0; index < values.length; ++index) {
dict[values[index]] = true;
}
for (index = 0; index < toFind.length; ++index) {
if (!(toFind[index] in dict)) {
throw "Error";
}
}
The Dictionary (ignore setup) test, where we don't worry about the setup time for the dictionary; note that this is different than the RegExp
alternation (ignore setup) test because it assumes the overall list is invariant:
var index;
for (index = 0; index < toFind.length; ++index) {
if (!(toFind[index] in theDictionary)) {
throw "Error";
}
}
The Array#indexOf
test (note that some very old implementations of JavaScript may not have Array#indexOf
):
var values = theString.split(" ");
var index;
for (index = 0; index < toFind.length; ++index) {
if (values.indexOf(toFind[index]) < 0) {
throw "Error";
}
}
The Array#indexOf
(ignore setup) test, which like Dictionary (ignore setup) assumes the overall list is invariant:
var index;
for (index = 0; index < toFind.length; ++index) {
if (theArray.indexOf(toFind[index]) < 0) {
throw "Error";
}
}
Instead of looking for the fastest solution, you first need to make sure that you’re actually having a correct solution. Because there are four cases an e-mail address can appear and a naive search can fail:
[email protected]
[email protected] ...
... [email protected]
... [email protected] ...
Now let’s analyze each variant:
To allow arbitrary input, you will need to escape the input properly. You can use the following method to do so:
RegExp.quote = function(str) {
return str.toString().replace(/(?=[.?*+^$[\]\\(){}-])/g, "\\");
};
To match all four cases, you can use the following pattern:
/(?:^|\ )user@example\.com(?![^\ ])/
Thus:
var inList = new RegExp("(?:^| )" + RegExp.quote(needle) + "(?![^ ])").test(haystack);
Using indexOf
is a little more complex as you need to check the boundaries manually:
var pos = haystack.indexOf(needle);
if (pos != -1 && (pos != 0 && haystack.charAt(pos-1) !== " " || haystack.length < (pos+needle.length) && haystack.charAt(pos+needle.length) !== " ")) {
pos = -1;
}
var inList = pos != -1;
This one is rather quite simple:
var dict = {};
haystack.match(/[^\ ]+/g).map(function(match) { dict[match] = true; });
var inList = dict.hasOwnProperty(haystack);
Now to test what variant is the fastest, you can do that at jsPerf.
indexOf()
is most probably the fastest just keep in mind you need to search for two possible cases:
var existingEmails = "email1, email2, ...";
var newEmail = "[email protected]";
var exists = (existingEmails.indexOf(newEmail + " ") >= 0) || (existingEmails.indexOf(" " + newEmail ) > 0);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With