When using Regex.IsMatch
(C#, .Net 4.5) on a specific string, the CPU reaches 100%.
String:
https://www.facebook.com/CashKingPirates/photos/a.197028616990372.62904.196982426994991/1186500984709792/?type=1&permPage=1
Pattern:
^http(s)?://([\w-]+.)+[\w-]+(/[\w- ./?%&=])?$
Full code:
Regex.IsMatch("https://www.facebook.com/CashKingPirates/photos/a.197028616990372.62904.196982426994991/1186500984709792/?type=1&permPage=1",
@"^http(s)?://([\w-]+.)+[\w-]+(/[\w- ./?%&=])?$");
I found that redacting the URL prevents this problem. Redacted URL:
https://www.facebook.com/CashKingPirates/photos/a.197028616990372.62904.196982426994991/1186500984709792
But still very interested in understanding what causes this.
IsMatch(String, String, RegexOptions, TimeSpan) Indicates whether the specified regular expression finds a match in the specified input string, using the specified matching options and time-out interval.
Contains will almost certainly be faster and use less memory. And in of course, there's no reason to use regex here. Save this answer.
IndexOf is probably faster than regex in most cases. Especially if you don't use a precompiled regex. It's performance might also depend on the chosen string comparison/culture.
Contains() Method. In C#, String. Contains() is a string method. This method is used to check whether the substring occurs within a given string or not.
As nu11p01n73R pointed out, you have a lot backtracking with your regular expression. That’s because parts of your expression can all match the same thing, which gives the engine many choices it has to try before finding a result.
You can avoid this by changing the regular expression to make individual sections more specific. In your case, the cause is that you wanted to match a real dot but used the match-all character .
instead. You should escape that to \.
.
This should already reduce the backtracking need a lot and make it fast:
^http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=])?$
And if you want to actually match the original string, you need to add a quantifier to the character class at the end:
^http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]+)?$
↑
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With