I have an app that looks at items in a queue, applies a category based on certain keywords, and then inserts each item into a database.
I'm using IndexOf to determine if a certain keyword is present.
Is this the ideal way, or would a Regex be faster?
About 10 items per second are being processed.
IndexOf is only useful for checking the existence of an exact substring, but Regex is much more powerful and allows you to do so much more.
The reason the regex is so slow is that the "*" quantifier is greedy by default, so the first ".*" tries to match the whole string and only then begins to backtrack character by character. The runtime is exponential in the count of numbers on a line.
.NET 4.0: IndexOf(String) does not use an ordinal comparison by default (it is culture-sensitive), whereas Contains does, so Contains can be faster.
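To make the difference concrete, here is a minimal VB.NET sketch of the three calls being compared. The sample message and keyword are made up for illustration. For plain ASCII input like this, all three calls agree on the result; what differs is the comparison rule each one applies, and therefore its cost.

```vb
Imports System

Module ComparisonModes
    Sub Main()
        ' Hypothetical sample data, not taken from the original post.
        Dim message As String = "urgent: invoice overdue"
        Dim keyword As String = "invoice"

        ' IndexOf(String) compares using the current culture by default.
        Dim cultureIndex As Integer = message.IndexOf(keyword)

        ' The same search with an ordinal (character-by-character) comparison.
        Dim ordinalIndex As Integer = message.IndexOf(keyword, StringComparison.Ordinal)

        ' Contains(String) performs an ordinal comparison.
        Dim found As Boolean = message.Contains(keyword)

        Console.WriteLine("{0} {1} {2}", cultureIndex, ordinalIndex, found)
    End Sub
End Module
```

If you only need a yes/no answer, Contains or IndexOf with StringComparison.Ordinal are the candidates to measure first.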
Regex is faster on large strings than an if (perhaps inside a for loop) that checks whether anything matches your requirement. If you are matching a very small text against a small pattern, don't use a regex, because the matcher function ...
It seems correct that regex is faster than the default IndexOf in longer strings. My example: the content of a 364 kB file is searched for the string "<product ". The starting point is moved forward to find the next occurrence, and so on. However, the searched string does not occur anywhere in the value.
I used three test commands:
i = value.IndexOf("<" & tag & " ", xstart)
i = value.IndexOf("<" & tag & " ", xstart, StringComparison.Ordinal)
i = Regex.IsMatch(value.Substring(xstart), "<" & tag & " ", RegexOptions.Singleline)
Command one (IndexOf, default comparison) needs ~7500 ms to search the string.
Command two (IndexOf with ordinal) needs ~300 ms!
Command three (Regex) needs ~650 ms (~1000 ms with the IgnoreCase option).
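For anyone who wants to repeat the comparison, here is a rough timing sketch using Stopwatch. It is not the original test: the input is a generated stand-in for the 364 kB file, the module and helper names (SearchTiming, Measure, RegexIndex) and the repetition count are my own, and the regex case uses Regex.Match with a start position instead of IsMatch on a Substring, so that it returns an index and avoids copying the string. Expect different absolute numbers on your machine.

```vb
Imports System
Imports System.Diagnostics
Imports System.Text
Imports System.Text.RegularExpressions

Module SearchTiming
    ' Runs one search repeatedly and prints the last result and the elapsed time.
    Sub Measure(label As String, search As Func(Of Integer), repetitions As Integer)
        Dim result As Integer = 0
        Dim watch As Stopwatch = Stopwatch.StartNew()
        For n As Integer = 1 To repetitions
            result = search()
        Next
        watch.Stop()
        Console.WriteLine("{0}: index {1}, {2} ms", label, result, watch.ElapsedMilliseconds)
    End Sub

    ' Returns the match position, or -1, so the regex result is comparable to IndexOf.
    Function RegexIndex(rx As Regex, value As String, xstart As Integer) As Integer
        Dim m As Match = rx.Match(value, xstart)
        Return If(m.Success, m.Index, -1)
    End Function

    Sub Main()
        ' Hypothetical stand-in for the file content: markup that never contains "<product ".
        Dim builder As New StringBuilder()
        For n As Integer = 1 To 10000
            builder.Append("<item id=""").Append(n).Append(""">sample text</item>")
        Next
        Dim value As String = builder.ToString()
        Dim tag As String = "product"
        Dim needle As String = "<" & tag & " "
        Dim rx As New Regex(Regex.Escape(needle), RegexOptions.Singleline)

        Measure("IndexOf (default)", Function() value.IndexOf(needle, 0), 100)
        Measure("IndexOf (ordinal)", Function() value.IndexOf(needle, 0, StringComparison.Ordinal), 100)
        Measure("Regex.Match", Function() RegexIndex(rx, value, 0), 100)
    End Sub
End Module
```

Because the needle is never found, each call scans the whole string, which matches the worst case described in the post.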
For just finding a keyword, the IndexOf method is faster than using a regular expression. Regular expressions are powerful, but their power lies in flexibility, not raw speed. They don't beat string methods at simple string operations.
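As a sketch of what keyword-based categorization with IndexOf could look like, assuming each item is a plain string and the first matching keyword decides the category. The keyword table, the category names, and the Categorize function are hypothetical; the question does not show the actual rules.

```vb
Imports System
Imports System.Collections.Generic

Module KeywordCategorizer
    ' Hypothetical keyword-to-category table; checked in order, first hit wins.
    Private ReadOnly Rules As KeyValuePair(Of String, String)() = {
        New KeyValuePair(Of String, String)("invoice", "Billing"),
        New KeyValuePair(Of String, String)("password", "Account"),
        New KeyValuePair(Of String, String)("crash", "Bug")
    }

    Function Categorize(item As String) As String
        For Each rule As KeyValuePair(Of String, String) In Rules
            ' OrdinalIgnoreCase skips culture-sensitive comparison and ignores letter case.
            If item.IndexOf(rule.Key, StringComparison.OrdinalIgnoreCase) >= 0 Then
                Return rule.Value
            End If
        Next
        Return "Uncategorized"
    End Function

    Sub Main()
        Console.WriteLine(Categorize("Please resend the INVOICE for March")) ' prints Billing
        Console.WriteLine(Categorize("Weekly newsletter"))                   ' prints Uncategorized
    End Sub
End Module
```

At roughly 10 items per second, a linear pass over a small keyword table like this is unlikely to be the bottleneck; the database insert probably costs more.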
Anyway, if the strings are not huge, it shouldn't really matter as you are not doing it so often.
http://ayende.com/blog/2930/regex-vs-string-indexof
It seems that the efficiency may depend on the length of the string.
The only way to know for sure is to test it. As an educated guess, it depends on the number of keywords you are testing, the length of the text, etc. IndexOf would probably win.
The only way to know for sure is to write a test for your specific scenario.