What is the C# Regex equivalent to Java's appendReplacement and appendTail?

Tags: java, c#, regex

UPDATE

Here is what I came up with. I haven't tested it yet because it is part of a much larger piece of code that still needs to be ported.

Can you see anything that looks out of place?

private const string tempUserBlock = "%%%COMPRESS~USER{0}~{1}%%%";
string html = "some html";
int p = 0;
var userBlock = new ArrayList();

MatchCollection matcher = preservePatterns[p].Matches(html);
int index = 0;
StringBuilder sb = new StringBuilder();
int lastValue = 0;

foreach(Match match in matcher){
    string matchValue = match.Groups[0].Value;

    if(matchValue.Trim().Length > 0) {
        userBlock.Add(matchValue);

        // match.Index is already an absolute offset into html, and
        // Substring takes (startIndex, length), not (start, end).
        sb.Append(html.Substring(lastValue, match.Index - lastValue));
        sb.AppendFormat(tempUserBlock, p, index++);

        lastValue = match.Index + match.Length;
    }
}

sb.Append(html.Substring(lastValue));
html = sb.ToString();

ORIGINAL POST BELOW:

Here is the original Java:

private static final String tempUserBlock = "%%%COMPRESS~USER{0}~{1}%%%";
String html = "some html";
int p = 0;
List<String> userBlock = new ArrayList<String>();

Matcher matcher = patternToMatch.matcher(html);
int index = 0;
StringBuffer sb = new StringBuffer();
while (matcher.find())
{
    if (matcher.group(0).trim().length() > 0)
    {
        userBlock.add(matcher.group(0));
        matcher.appendReplacement(sb, MessageFormat.format(tempUserBlock, p, index++));
    }
}
matcher.appendTail(sb);
html = sb.toString();

And my C# conversion so far:

private const string tempUserBlock = "%%%COMPRESS~USER{0}~{1}%%%";
string html = "some html";
int p = 0;
var userBlock = new ArrayList();

MatchCollection matcher = preservePattern.Matches(html);
int index = 0;
StringBuilder sb = new StringBuilder();

for(var i = 0; i < matcher.Count; ++i){
    string match = matcher[i].Groups[0].Value;
    if(match.Trim().Length > 0) {
        userBlock.Add(match);
        // WHAT DO I DO HERE?
        sb.Append( string.Format(tempUserBlock, p, index++) );            
    }
}
// WHAT DO I DO HERE?
matcher.appendTail(sb);
html = sb.toString();

See the two comments above marked "WHAT DO I DO HERE?"

Clarification
The Java code above performs string replacement on some HTML. It saves each piece of text that it replaces, because that text must be re-inserted later, after whitespace compression has finished.
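
To make the round trip concrete, here is a minimal Java sketch of the preserve-then-restore idea described above. Only the placeholder format and the appendReplacement/appendTail loop come from the original code; the class name PlaceholderDemo, the restore helper, the <pre> pattern, the sample input, and the one-line whitespace "compression" are assumptions made for illustration.

```java
import java.text.MessageFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderDemo {
    private static final String TEMP_USER_BLOCK = "%%%COMPRESS~USER{0}~{1}%%%";

    // Swap every non-blank match for a numbered placeholder and remember the original text.
    static String preserve(String html, Pattern pattern, int p, List<String> userBlock) {
        Matcher matcher = pattern.matcher(html);
        int index = 0;
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            if (matcher.group(0).trim().length() > 0) {
                userBlock.add(matcher.group(0));
                // Appends the text since the previous append position, then the placeholder.
                matcher.appendReplacement(sb, MessageFormat.format(TEMP_USER_BLOCK, p, index++));
            }
        }
        // Appends whatever follows the last replaced match.
        matcher.appendTail(sb);
        return sb.toString();
    }

    // Hypothetical helper: put the saved blocks back where their placeholders are.
    static String restore(String html, int p, List<String> userBlock) {
        String result = html;
        for (int i = 0; i < userBlock.size(); i++) {
            result = result.replace(MessageFormat.format(TEMP_USER_BLOCK, p, i), userBlock.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample: protect <pre> blocks while collapsing whitespace elsewhere.
        Pattern pre = Pattern.compile("(?s)<pre>.*?</pre>");
        List<String> userBlock = new ArrayList<>();
        String html = "<p>a   b</p><pre>  keep   this </pre><p>c</p>";

        String held = preserve(html, pre, 0, userBlock);
        String compressed = held.replaceAll("\\s+", " ");
        System.out.println(restore(compressed, 0, userBlock));
    }
}
```

Note that a match skipped by the if test is not lost: the next appendReplacement (or appendTail) copies it through unchanged along with the surrounding text. A C# port has to keep that behaviour by advancing its "last position" only when a replacement is actually made.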

David Murdoch asked Sep 24 '10 17:09


2 Answers

There's no need to reproduce Java's appendReplacement/appendTail functionality; .NET has something better: MatchEvaluator. Check it out:

string holder = "Element {0} = {1}";
string s0 = "111 222 XYZ";
ArrayList arr = new ArrayList();

string s1 = Regex.Replace(s0, @"\d+",
  m => string.Format(holder, arr.Add(m.Value), m.Value)
);

Console.WriteLine(s1);
foreach (string s in arr)
{
  Console.WriteLine(s);
}

output:

Element 0 = 111 Element 1 = 222 XYZ
111
222

There are several ways to implement the MatchEvaluator, all thoroughly discussed in this article. This one (lambda expressions) is by far the coolest.
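
For readers comparing the two platforms, the sketch below shows the same callback style on the Java side. It assumes Java 9 or later, where Matcher.replaceAll accepts a function; the class name ReplaceCallbackDemo and the method replaceDigits are made up for illustration, and the sample strings mirror the C# snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceCallbackDemo {
    // Replace each run of digits with "Element <n> = <digits>" and collect the matched values.
    static String replaceDigits(String input, List<String> found) {
        return Pattern.compile("\\d+").matcher(input).replaceAll(m -> {
            found.add(m.group());
            String text = "Element " + (found.size() - 1) + " = " + m.group();
            // The returned string is treated as a replacement pattern, so escape '$' and '\'.
            return Matcher.quoteReplacement(text);
        });
    }

    public static void main(String[] args) {
        List<String> arr = new ArrayList<>();
        System.out.println(replaceDigits("111 222 XYZ", arr)); // Element 0 = 111 Element 1 = 222 XYZ
        for (String s : arr) {
            System.out.println(s);
        }
    }
}
```

As with MatchEvaluator, the callback is invoked once per match and the library takes care of copying the unmatched text, so no manual appendReplacement/appendTail bookkeeping is needed.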

Alan Moore answered Sep 19 '22 23:09


I'm not familiar with the Java regex classes, but this is my C# interpretation of what I think your code does:

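private const string tempUserBlock = "%%%COMPRESS~USER{0}~{1}%%%";
string html = "some html";
int p = 0;
var userBlock = new List<string>();

MatchCollection matcher = preservePattern.Matches(html);
StringBuilder sb = new StringBuilder();
int index = 0; // placeholder counter (was used below but never declared)
int last = 0;  // end of the last match that was actually replaced
foreach (Match m in matcher)
{
    string match = m.Groups[0].Value;
    if (match.Trim().Length > 0) {
        userBlock.Add(match);
        // Copy the text between the previous replacement and this match.
        sb.Append(html.Substring(last, m.Index - last));
        sb.Append(m.Result(string.Format(tempUserBlock, p, index++)));
        // Advance only on replacement, so skipped (blank) matches are
        // copied through unchanged, as Java's appendReplacement does.
        last = m.Index + m.Length;
    }
}
sb.Append(html.Substring(last)); // equivalent of appendTail
html = sb.ToString();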

Gabe answered Sep 21 '22 23:09