I have two strings and I need to know whether they are equal.
I have previously done this: str1 === str2, but I wonder whether there is a faster way to compare two strings.
The strings are fairly short, 15-25 characters long. My problem is that I am iterating through a lot of strings and it is taking quite a long time.
I have a lot of comparisons in a structure like this:
if (str === str1)
{
    // do something
}
else if (str === str2)
{
    // do something
}
else if (str === str3)
{
    // do something
}
The strings do not have any common structure or grouping.
String equality is slightly faster at 13.15ms vs BigInt at 16.57ms for 100k comparisons.
The right way to compare strings in Java is to use the equals(), equalsIgnoreCase(), or compareTo() method. Use equals() to check whether two strings contain exactly the same characters in the same order. It returns true if the two strings are equal and false otherwise.
Explanation of the example: when str1 and str2 are compared after calling toUpperCase() on both, the expression "JAVASCRIPT" == "JAVASCRIPT" returns true, because the two strings are identical once both are in upper case. Here, the equality operator (==) is used to check whether the strings are the same.
Comparing strings with a === b is the fastest way to compare string natives.
However, if you could create String objects like new String("test"), re-use those and use those in the comparisons, that would be even faster, because the JS engine would only need to do a pointer comparison, which is (a small amount) faster than a string comparison.
See http://jsperf.com/string-vs-object-comparisons
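One caveat worth spelling out for the String-object idea: === on String objects compares identity, not contents, so it only gives the right answer if every comparison goes through the same re-used objects. Here is a minimal sketch of that; the intern helper and the pool map are hypothetical names of mine, not part of the answer above, and whether this ends up faster in your engine is something you would have to measure.

```javascript
// Hypothetical helper: hand out one shared String object per distinct value.
var pool = new Map();
function intern(value) {
  if (!pool.has(value)) {
    pool.set(value, new String(value));
  }
  return pool.get(value);
}

var a = intern("test");
var b = intern("test");
var c = new String("test"); // a separate object with the same characters

console.log(a === b);                     // true: same object
console.log(a === c);                     // false: different objects
console.log(a.valueOf() === c.valueOf()); // true: primitive comparison
```

Note that intern itself has to look the string up, so this can only pay off if each string is interned once and then compared many times.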
If your "do somethings" share a similar form with different values, you can put the values into a map and use the string as a key. For example's sake, imagine you have to process many numbers with different units of length and you want to convert them all to meters:
var conversionToMeters = {
"inch": 0.0254,
"inches": 0.0254,
"foot": 0.3048,
"feet": 0.3048,
"cubit": 0.4572,
"cubits": 0.4572,
"yard": 0.9144,
"yards": 0.9144,
"kilometer": 1000,
"kilometers": 1000,
"mile": 1609.344,
"miles": 1609.344,
"lightyear": 9.46e15,
"lightyears": 9.46e15,
"parsec": 3.09e16,
"parsecs": 3.09e16,
}
(Abbreviations (like "km") and international spellings (like "kilometres") omitted for brevity.) You can prepare that map ahead of time to avoid creation overhead. Now, given a variable length such as length = "80 miles", you can do:
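var magnitude = parseFloat(length); // leading number, e.g. 80 (also handles "1.5 miles")
var unit = length.replace(/[\d\s.]/g, ""); // strip digits, whitespace and decimal points, e.g. "miles"
var lengthInMeters = magnitude * conversionToMeters[unit];
alert(lengthInMeters + " meters"); // Ta-da!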
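If the input is not fully under your control, the lookup probably wants a guard for unknown units. The following is only a sketch under my own assumptions: toMeters is a hypothetical helper, the table is a trimmed copy of the one above so the snippet runs on its own, and the regular expression accepts just a plain number followed by a unit word.

```javascript
// Trimmed copy of the conversion table, so this sketch is self-contained.
var conversionToMeters = {
  "foot": 0.3048, "feet": 0.3048,
  "mile": 1609.344, "miles": 1609.344
};

// Hypothetical helper: returns meters, or null if the input is not understood.
function toMeters(length) {
  var match = /^\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s*$/i.exec(length);
  if (!match) return null;
  var unit = match[2].toLowerCase();
  // hasOwnProperty keeps inherited keys such as "toString" from matching.
  if (!Object.prototype.hasOwnProperty.call(conversionToMeters, unit)) return null;
  return parseFloat(match[1]) * conversionToMeters[unit];
}

console.log(toMeters("80 miles"));
console.log(toMeters("3 furlongs")); // null: unit not in the table
console.log(toMeters("2 toString")); // null: inherited key is rejected
```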
If your "do somethings" do not share common code you can still use a map, but it will be a map of functions:
var actions = {
"eat": function() {
if (spareFood > 0) {
spareFood--;
energy += 10;
health++;
alert("Yum!");
}
},
"walk": function() {
if (energy > 0) energy--;
// ...
},
"attack": function() {
if (energy > 0) {
if (Math.random() < 0.25) {
health--;
alert("Ouch!");
}
energy--;
}
},
// ...
};
This is a bit of a silly example but I hope it explains the basic idea. The actions could equally be XML tags, or names of CPU instructions in a virtual machine, or names of products that have special shipping requirements, or whatever. Once you've got your action variable, executing it is as simple as:
actions[action]();
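One thing the one-liner glosses over is what happens when the string is not in the map: calling a missing entry throws a TypeError. A rough sketch of a guarded dispatch follows; the state object, the perform helper and the "rest" action are made up for illustration.

```javascript
// Hypothetical state and actions, loosely based on the example above.
var state = { energy: 2, log: [] };
var actions = {
  "walk": function () {
    if (state.energy > 0) state.energy--;
    state.log.push("walk");
  },
  "rest": function () {
    state.energy++;
    state.log.push("rest");
  }
};

// Hypothetical helper: run the action if it exists, otherwise record the miss.
function perform(action) {
  if (Object.prototype.hasOwnProperty.call(actions, action)) {
    actions[action]();
    return true;
  }
  state.log.push("unknown: " + action);
  return false;
}

var walked = perform("walk"); // known action: runs and returns true
var flew = perform("fly");    // unknown action: returns false instead of throwing
console.log(walked, flew, state.energy, state.log.join(", "));
```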
A map isn't the only way to do this kind of thing. Your original if/else example can be optimized easily by nesting the ifs inside additional ifs that are designed to quickly eliminate most of the candidate strings.
The criteria you branch on will depend on the exact strings you're working with. It could be the length of the string, or the first letter, or a couple of the most distinguishing letters:
if (str.length === 3) {
// test all length 3 strings here
if (str === strA) doSomething();
else if (str === strB) doSomething();
} else if (str.length === 4) {
// test all length 4 strings here
if (str === strC) doSomething();
else if (str === strD) doSomething();
}
Or:
var first = str[0]; // first character
if (first >= "0" && first <= "9") {
// test all strings that start with digits here
} else if (first >= "a" && first <= "l") {
// test all strings that start with letters
// in the first half of the alphabet here
} else if (first >= "m" && first <= "z") {
// test all strings that start with letters
// in the latter half of the alphabet here
}
You can nest these kinds of tests inside one another to whatever degree is appropriate to sift through the particular strings you're working with. This is a sort of unrolled binary search, although the criteria you branch on do not have to divide the candidate strings into exactly two groups.
Also, when you use an if/elseif like this, it's often worth arranging the strings in descending order of frequency. I.e., test the ones that happen most, first. If there are just a couple of strings that make up the majority of the data, pull them to the top, and even put them outside of any pre-tests based on length or first letter.
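If you don't already know which strings are the common ones, you can measure it on a sample of real input first. A small sketch, assuming a made-up sample array and candidate list:

```javascript
// Hypothetical sample of real input and the candidate strings to test for.
var sample = ["beta", "alpha", "beta", "gamma", "beta", "alpha"];
var candidates = ["alpha", "beta", "gamma"];

// Count how often each string occurs in the sample.
var counts = new Map();
sample.forEach(function (s) {
  counts.set(s, (counts.get(s) || 0) + 1);
});

// Most frequent first: this is the order to write the if/else if tests in.
var ordered = candidates.slice().sort(function (x, y) {
  return (counts.get(y) || 0) - (counts.get(x) || 0);
});

console.log(ordered.join(", ")); // beta, alpha, gamma
```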
You'll have to decide whether it's worth doing these things: if you take these techniques to the extreme, you might be able to squeeze tiny additional performance benefits, but it will sacrifice readability and maintainability.
P.S. I don't know JavaScript well enough to know exactly how these techniques will perform but I've done similar things in Java. In Java the map approach is unbeatable when the "do somethings" require different values but can use the same code. In a different program, I needed to switch on an integer value performing about 400 dissimilar actions (it was awful). The HotSpot Client VM has a lousy inefficient implementation of the switch statement that is simply a lot of elseifs, and it was too slow. An array of functions (which technically were objects with overridden virtual methods) was faster, but the function call overhead was too great compared to the simplicity of each action. In this case I found a mixed binary-quaternary search to be effective. What that means is: the outer tests were if/elses that divided the input values evenly into two groups. These were nested until there were only four possible values left in the inner groups. Then I used an if/elseif/elseif/else to distinguish among the remaining four values. Since this was so long, I wrote some code to write it for me, but it was still worth the effort for this particular application.
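To make the shape of that binary-quaternary search concrete, here is a tiny JavaScript sketch for eight values instead of 400. It is an illustration only, not the original Java code: dispatch and the "actionN" strings are placeholders, and it assumes n is an integer from 0 to 7.

```javascript
// Assumption: n is an integer from 0 to 7; each return stands in for an action.
function dispatch(n) {
  if (n < 4) {
    // Binary split: lower half, then a four-way if/else if chain.
    if (n === 0) return "action0";
    else if (n === 1) return "action1";
    else if (n === 2) return "action2";
    else return "action3";
  } else {
    // Upper half, same four-way chain.
    if (n === 4) return "action4";
    else if (n === 5) return "action5";
    else if (n === 6) return "action6";
    else return "action7";
  }
}

console.log(dispatch(6)); // action6
```

With more values you would add further levels of binary splits on the outside. Here the worst case is four comparisons, versus seven for a flat chain over the same eight values.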
P.P.S. There's an approach I skipped above but I'll include it for completeness: if your strings will rarely need changing, you can use a perfect hash function. There are utility programs that design these functions for you: just supply them with a list of all your strings. A perfect hash function will calculate an integer hashcode from a string, and guarantee that no two strings from your set have the same hashcode. Then you can use the integer hashcode for lookup of the action in an array. It's helpful for things like parsing keywords of programming languages. It can be faster in a language that is closer to the metal, but in JavaScript I suspect it will not be worth it. I'm mentioning it just in case.
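For the curious, the idea can be sketched in a few lines of JavaScript. This is a toy under my own assumptions, not what the real generator utilities produce: the keyword list, the table size of 7, the FNV-1a-style hash and the brute-force findSeed search are all made up for illustration.

```javascript
// Hypothetical keyword set; it must be fixed and known ahead of time.
var keywords = ["if", "else", "while", "return"];
var TABLE_SIZE = 7; // assumption: a bit larger than the set, so a seed is easy to find

// FNV-1a-style string hash, parameterized by a seed.
function hash(str, seed) {
  var h = seed >>> 0;
  for (var i = 0; i < str.length; i++) {
    h = Math.imul(h ^ str.charCodeAt(i), 16777619) >>> 0;
  }
  return h % TABLE_SIZE;
}

// Brute-force search for a seed that gives every keyword its own slot.
function findSeed(words) {
  for (var seed = 1; seed < 100000; seed++) {
    var used = new Set(words.map(function (w) { return hash(w, seed); }));
    if (used.size === words.length) return seed;
  }
  throw new Error("no collision-free seed found");
}

var seed = findSeed(keywords);

// Slot table: the index is the hash, the value is the keyword stored there.
var slots = new Array(TABLE_SIZE).fill(null);
keywords.forEach(function (w) { slots[hash(w, seed)] = w; });

// One hash plus one string comparison decides membership.
function isKeyword(str) {
  return slots[hash(str, seed)] === str;
}

console.log(isKeyword("while"), isKeyword("whale"));
```

The final string comparison is still needed because strings outside the set can land in an occupied slot. In practice the slot index would select an action from an array rather than just answer yes or no.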