
.NET Regex Replace Single Line Matching Unknown Character

This has me extremely baffled. Why am I getting the replacement string twice in the following code?

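using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        String input = "test";
        String pattern = ".*";
        String replacement = "replace";
        Console.WriteLine(Regex.Replace(input, pattern, replacement));
        Console.Read(); // keep the console window open
    }
}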

This outputs to the console:

replacereplace

I understand that regex can behave oddly when matching end-of-line characters, but there should be none here. I also understand that the pattern can match nothing, but clearly the input is not empty. This happens in .NET 3.5 and 4.0, and I get the same result with RegexOptions.Singleline and RegexOptions.Multiline.

I know there are several alternatives that will do what I'm expecting, but I'm more interested in what other match .* thinks it's finding.

Joshua Belden asked Oct 06 '11 23:10

2 Answers

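You get two replacements because .* produces two matches: "test" and "". The first match consumes the whole input starting at index 0. The engine then resumes at index 4, the end of the input, where .* is allowed to match zero characters, so it succeeds again with an empty match. Each match is replaced, giving "replacereplace".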
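To see the two matches directly, you can list them with Regex.Matches. The following is a minimal sketch; the MatchDump class and its Describe helper are hypothetical names introduced only for this illustration, and the code sticks to APIs available in .NET 3.5.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MatchDump
{
    // Hypothetical helper: formats every match as index:length:'value'.
    public static string Describe(string input, string pattern)
    {
        List<string> parts = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            parts.Add(string.Format("{0}:{1}:'{2}'", m.Index, m.Length, m.Value));
        }
        return string.Join(", ", parts.ToArray());
    }

    static void Main()
    {
        // Prints: 0:4:'test', 4:0:''
        Console.WriteLine(Describe("test", ".*"));
    }
}
```

The second entry is the zero-length match at index 4, which is the extra match that Regex.Replace also replaces.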

If you change .* to .+ it will work the way you expect, because .+ requires at least one character and so cannot match the empty string at the end:

String pattern = ".+";

Another option is to add the start-of-string anchor, which prevents the second match because the end of the input is not the start of the string:

String pattern = "^.*"; // I know this looks like a smiley
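As a quick check, the three patterns can be run side by side against the same input. This is only a sketch; ReplaceDemo and Run are hypothetical names, and the input and replacement strings are the ones from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class ReplaceDemo
{
    // Hypothetical helper: applies a pattern to the question's input.
    public static string Run(string pattern)
    {
        return Regex.Replace("test", pattern, "replace");
    }

    static void Main()
    {
        Console.WriteLine(Run(".*"));  // replacereplace (matches "test" and "")
        Console.WriteLine(Run(".+"));  // replace (no empty match possible)
        Console.WriteLine(Run("^.*")); // replace (anchor fails at index 4)
    }
}
```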
NullUserException answered Nov 05 '22 09:11


It matches everything and then it matches nothing (the empty string at the end of the input), so you have two matches and two replacements.

FailedDev answered Nov 05 '22 10:11