Is there a theorical expression size limit for "or" operator on Regex.Replace such as Regex.Replace("abc","(a|c|d|e...continue say 500000 elements here)","zzz") ?
Any stackoverflowException on .NET's implementation ?
Thanks
How to use RegEx with . replace in JavaScript. To use RegEx, the first argument of replace will be replaced with regex syntax, for example /regex/ . This syntax serves as a pattern where any parts of the string that match it will be replaced with the new substring.
Alternation is the term in regular expression that is actually a simple “OR”. In a regular expression it is denoted with a vertical line character | . For instance, we need to find programming languages: HTML, PHP, Java or JavaScript.
By combining the interval quantifier with the surrounding start- and end-of-string anchors, the regex will fail to match if the subject text's length falls outside the desired range.
No. There are many dialects of regular expressions, though the basic expressions tend to be quite similar.
There is no theoretical limit, though each regular expression engine will have its own implementation limits. In this case, since you are using .NET the limit is due to the amount of memory the .NET runtime can use.
A regular expression with one million alernations works fine for me:
string input = "a<142>c";
var options = Enumerable.Range(0, 1000000).Select(x => "<" + x + ">");
string pattern = string.Join("|", options);
string result = Regex.Replace(input, pattern, "zzz");
Result:
azzzc
It's very slow though. Increasing the number of options to 10 million gives me an OutOfMemoryException
.
You probably would benefit from looking at another approach.
The way regular expressions work mean that the memory requirements and performance for a simple a|b|c.....|x|y|z
expression as described are not too bad, even for a very large number of variants.
However, if your expression is even slightly more complex than that, you could cause the expression to lose performance exponentially, as well as massively growing its memory footprint, as an large number of or
options like this can cause it to have to do massive amounts of backtracking if other parts of the expression don't match immediately.
You may therefore want to excersise caution either doing this sort of thing. Even if it works now, it would only take a small and relatively innocent change to make the whole thing come to a grinding halt.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With