Is using a Regular Expression faster than IndexOf?

I have an app running which looks at items in a queue, then based upon certain keywords a category is applied - then it is inserted into a database.

I'm using IndexOf to determine if a certain keyword is present.

Is this the ideal way, or would a Regex be faster?

About 10 items per second are being processed.

Jack Marchetti asked Feb 21 '12 15:02



4 Answers

It seems correct that regex can be faster on longer strings. My example: the content of a 364 kB file is searched for the string "<product ". The starting point is moved forward to find each subsequent occurrence. In this test, however, the searched string does not occur anywhere in the value, so every search scans to the end.

I used three test commands:

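         ' 1) IndexOf with the default (culture-sensitive) comparison
         i = value.IndexOf("<" & tag & " ", xstart)

         ' 2) IndexOf with ordinal comparison
         i = value.IndexOf("<" & tag & " ", xstart, StringComparison.Ordinal)

         ' 3) Regex on the remaining substring; IsMatch returns a Boolean, not a position
         found = Regex.IsMatch(value.Substring(xstart), "<" & tag & " ", RegexOptions.Singleline)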

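Command one (IndexOf with the default, culture-sensitive comparison) needs ~7500 ms to search the string.
Command two (IndexOf with ordinal comparison) needs ~300 ms!
Command three (Regex) needs ~650 ms (~1000 ms with the IgnoreCase option).

So in this test the regex beats only the default IndexOf; IndexOf with StringComparison.Ordinal is the fastest of the three.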
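A minimal sketch of how a timing comparison like this might be set up. The test data, the iteration count, and the names `TimeIt` and `RegexIndexOf` are assumptions made for illustration, not the code that produced the numbers above, and the timings will differ by machine and .NET version.

```vb
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module SearchTiming
    ' Runs one search strategy a fixed number of times and returns the elapsed milliseconds.
    Function TimeIt(iterations As Integer, search As Func(Of Integer)) As Long
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For n As Integer = 1 To iterations
            search.Invoke()
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    ' Regex equivalent of IndexOf: position of the first match, or -1 if there is none.
    Function RegexIndexOf(rx As Regex, value As String) As Integer
        Dim m As Match = rx.Match(value)
        Return If(m.Success, m.Index, -1)
    End Function

    Sub Main()
        ' Hypothetical test data: a large string that never contains the pattern.
        Dim value As String = New String("x"c, 364 * 1024)
        Dim tag As String = "product"
        Dim needle As String = "<" & tag & " "

        ' Escape the literal and build the regex once, outside the timed loop.
        Dim rx As New Regex(Regex.Escape(needle), RegexOptions.Compiled)

        Console.WriteLine("IndexOf (culture): {0} ms",
                          TimeIt(100, Function() value.IndexOf(needle)))
        Console.WriteLine("IndexOf (ordinal): {0} ms",
                          TimeIt(100, Function() value.IndexOf(needle, StringComparison.Ordinal)))
        Console.WriteLine("Regex.Match:       {0} ms",
                          TimeIt(100, Function() RegexIndexOf(rx, value)))
    End Sub
End Module
```

Unlike command three above, this sketch passes the whole string to the regex instead of calling Substring, so no copy of the remaining text is allocated on each search.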

Herbert answered Oct 10 '22 02:10


For just finding a keyword the IndexOf method is faster than using a regular expression. Regular expressions are powerful, but their power lies in flexibility, not raw speed. They don't beat string methods at simple string operations.

Anyway, if the strings are not huge, it shouldn't really matter, as you are not doing it that often.
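As a rough illustration of the IndexOf approach for the keyword-to-category case in the question, here is a small sketch. The keyword table and the names `Categories` and `Categorize` are hypothetical, and the case-insensitive ordinal comparison is an assumption about what the matching should do.

```vb
Imports System
Imports System.Collections.Generic

Module KeywordCategorizer
    ' Hypothetical keyword-to-category table; replace with real data.
    Private ReadOnly Categories As New Dictionary(Of String, String) From {
        {"invoice", "Billing"},
        {"refund", "Billing"},
        {"password", "Account"}
    }

    ' Returns the category of a keyword that occurs in the text, or Nothing if none does.
    ' Dictionary enumeration order is not guaranteed, so do not rely on keyword priority.
    Function Categorize(text As String) As String
        For Each pair As KeyValuePair(Of String, String) In Categories
            ' Ordinal, case-insensitive substring check; no culture rules involved.
            If text.IndexOf(pair.Key, StringComparison.OrdinalIgnoreCase) >= 0 Then
                Return pair.Value
            End If
        Next
        Return Nothing
    End Function

    Sub Main()
        Console.WriteLine(Categorize("Please reset my PASSWORD"))
    End Sub
End Module
```

Passing an explicit StringComparison avoids the culture-sensitive default that IndexOf(String) uses.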

Guffa answered Oct 11 '22 15:10


http://ayende.com/blog/2930/regex-vs-string-indexof

It seems that the length of the string may affect which approach is more efficient.

David Welker answered Oct 11 '22 14:10


The only way to know for sure is to write a test for your specific scenario. As an educated guess, the result depends on the number of keywords you are testing, the length of the text, etc. IndexOf would probably win.

Peter answered Oct 11 '22 16:10