Under Python:
ttsiod@elrond:~$ python
>>> import re
>>> a='This is a test'
>>> re.sub(r'(.*)', 'George', a)
'George'
Under Perl:
ttsiod@elrond:~$ perl
$a="This is a test";
$a=~s/(.*)/George/;
print $a;
(Ctrl-D)
George
Under C#:
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using System.Text.RegularExpressions;

namespace IsThisACsharpBug
{
    class Program
    {
        static void Main(string[] args)
        {
            var matchPattern = "(.*)";
            var replacePattern = "George";
            var newValue = Regex.Replace("This is nice", matchPattern, replacePattern);
            Console.WriteLine(newValue);
        }
    }
}
Unfortunately, C# prints:
$ csc regexp.cs
Microsoft (R) Visual C# 2008 Compiler version 3.5.30729.5420
for Microsoft (R) .NET Framework version 3.5
Copyright (C) Microsoft Corporation. All rights reserved.
$ ./regexp.exe
GeorgeGeorge
Is this a bug in the regular expression library of C#? Why does it print "George" twice, when Perl and Python print it only once?
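In your example the difference seems to be in the semantics of the 'replace' function rather than in the regular expression processing itself.
.NET is doing a "global" replace, i.e. it replaces all matches rather than just the first one. The pattern (.*) matches twice here: first the whole string, and then the empty string that remains at the end of the input, so "George" is inserted twice.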
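To see where the second replacement comes from, it can help to list every match of the pattern instead of replacing anything. The following is only a small Python sketch (the variable names are mine, not from the question); it collects the start offset, end offset and text of each match of (.*) in the test string.

```python
import re

text = "This is a test"

# Collect (start, end, matched text) for every match of (.*) in the string.
spans = [(m.start(), m.end(), m.group(0)) for m in re.finditer(r"(.*)", text)]

for start, end, matched in spans:
    # The second line printed is an empty match sitting at the end of the input.
    print(start, end, repr(matched))
```

The second match is zero characters long and sits at offset 14, right after the last character. A replace that processes all matches will therefore substitute at both places.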
Global Replace in Perl
(note the 'g' modifier at the end of the s/// line)
$a="This is a test";
$a=~s/(.*)/George/g;
print $a;
which produces
GeorgeGeorge
Single Replace in .NET
var re = new Regex("(.*)");
var replacePattern = "George";
var newValue = re.Replace("This is nice", replacePattern, 1);
Console.WriteLine(newValue);
which produces
George
since it stops after the first replacement.
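For comparison, here is a hedged sketch of the same two remedies in Python: limiting the number of replacements, or changing the pattern so that it cannot match an empty string. Note that, as far as I know, Python's re.sub also replaces all matches by default; versions before 3.7 skipped an empty match directly after a previous match, which would explain the single "George" in the question, while 3.7 and later replace it as well. The two calls below are written so that they do not depend on that difference.

```python
import re

text = "This is a test"

# count=1 stops after the first replacement, much like the third
# argument to Regex.Replace in the .NET snippet.
first_only = re.sub(r"(.*)", "George", text, count=1)

# (.+) needs at least one character, so it cannot match the empty
# string at the end of the input; a global replace then fires once.
non_empty = re.sub(r"(.+)", "George", text)

print(first_only)
print(non_empty)
```

Which of the two to prefer depends on intent: a count expresses "replace only the first match", while (.+) expresses "never replace nothing".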