While answering the question "C# Regex Replace and *", the point was raised as to why the problem exists. While experimenting, I produced the following code:
string s = Regex.Replace(".A.", @"\w*", "B");
Console.Write(s);
This has the output: B.BB.B
I get that the zero-length string is matched before and after each . character, but why is A replaced by two Bs?
I could understand B.BBB.B (replacing the zero-length strings on either side of A) or B.B.B.
But the actual result confuses me - any help appreciated.
Or as AakashM has put it:
Why is Regex.Matches("A", @"\w*").Count equal to 2, not 1 or 3?
There is a star after \w. It means "zero or more", so the expression \w{0,} has the same effect.
If you want to avoid this, use a plus, which means "at least one": \w+
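To make the difference concrete, here is a minimal sketch comparing the two quantifiers on the string from the question. It assumes a C# 9+ project with top-level statements; the variable names are only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// \w* may match zero characters, so it also matches at positions
// where no word character starts.
string withStar = Regex.Replace(".A.", @"\w*", "B");

// \w+ needs at least one word character, so only "A" is matched.
string withPlus = Regex.Replace(".A.", @"\w+", "B");

Console.WriteLine(withStar); // B.BB.B
Console.WriteLine(withPlus); // .B.
```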
That's the same behaviour as:
Regex.Replace("", @"\w*", "B") results in B
Regex.Replace("A", @"\w*", "B") results in BB
See it here on Regexr
For the string ".A.", \w* matches the empty string before the first dot, then the "A", then the empty string after the "A", and finally the empty string after the last dot.
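If you want to see those four matches yourself, a small sketch like the following lists the index and length of each one. It assumes C# 9+ top-level statements, and the `found` list is just a helper introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var found = new List<string>();

// Record where each match starts and how long it is.
foreach (Match m in Regex.Matches(".A.", @"\w*"))
{
    found.Add($"index={m.Index} length={m.Length}");
}

foreach (string line in found)
{
    Console.WriteLine(line);
}
// index=0 length=0
// index=1 length=1
// index=2 length=0
// index=3 length=0
```

Each of the four matches is replaced by one B, and the two dots are copied through unchanged, which gives B.BB.B.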
Explanation
You can think of the pattern as eating characters. \w* has eaten the "A"; the next character is a dot, so this match is complete and gets replaced. But the start position for the next match attempt is still between the A and the dot. The dot cannot be matched, so the pattern matches the empty string before the dot. That position is then done, and the next start position is after the dot.
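The "eating" model above can be sketched as a hand-written scan loop. This is only a simplified illustration of the idea, not how the .NET engine is actually implemented; it relies on the fact that \w* can always match (possibly the empty string) at the given start position. The names `steps` and `pos` are made up for this example, and C# 9+ top-level statements are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = ".A.";
var pattern = new Regex(@"\w*");
var steps = new List<(int Index, int Length)>();

int pos = 0;
while (pos <= input.Length)
{
    // \w* always succeeds (possibly with an empty match), so the match
    // found here starts exactly at pos.
    Match m = pattern.Match(input, pos);
    steps.Add((m.Index, m.Length));

    // After a non-empty match, continue right behind it. After an empty
    // match, move one character forward so the scan cannot loop forever.
    pos = m.Length > 0 ? m.Index + m.Length : pos + 1;
}

foreach (var (index, length) in steps)
{
    Console.WriteLine($"match at {index}, length {length}");
}
```

After "A" is consumed at index 1, the loop continues at index 2 (between the A and the dot), where only the empty string can match. That is the second B.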