I have a string like this:
string s = "<p>Hello world, hello world</p>";
string[] terms = new string[] {"hello", "world"};
I want to match each term case-insensitively and wrap every match in a span tag whose id carries an incrementing index, like so:
<p>
<span id="m_1">Hello</span>
<span id="m_2">world</span>,
<span id="m_3">hello</span>
<span id="m_4">world</span>
</p>
I tried doing it like this:
int match = 1;
Regex.Replace(s,
    String.Join("|", terms.OrderByDescending(t => t.Length)
                          .Select(Regex.Escape)),
    String.Format("<span id=\"m_{0}\">$&</span>", match++),
    RegexOptions.IgnoreCase);
The output is something like this:
<p>
<span id="m_1">Hello</span>
<span id="m_1">world</span>,
<span id="m_1">hello</span>
<span id="m_1">world</span>
</p>
All the ids are the same (m_1) because match++ is not evaluated for each match: String.Format runs only once, before Regex.Replace is even called. How do I get around this?
All you need to do is change the replacement argument from a string pattern to a match evaluator (m => String.Format("<span id=\"m_{0}\">{1}</span>", match++, m.Value)), which is called once per match. Inside the evaluator, m.Value takes the place of $&:
using System;
using System.Linq;
using System.Text.RegularExpressions;

string s1 = "<p>Hello world, hello world</p>";
string[] terms = new string[] {"hello", "world"};
var match = 1;
// Longest terms first, each escaped, joined into one alternation: hello|world
var pattern = String.Join("|", terms.OrderByDescending(s => s.Length)
                                    .Select(Regex.Escape));
// The lambda runs once per match, so match++ advances on every replacement
s1 = Regex.Replace(s1, pattern,
    m => String.Format("<span id=\"m_{0}\">{1}</span>", match++, m.Value),
    RegexOptions.IgnoreCase);
Console.Write(s1);
// => <p><span id="m_1">Hello</span> <span id="m_2">world</span>, <span id="m_3">hello</span> <span id="m_4">world</span></p>
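If you need this in more than one place, the same idea can be wrapped in a small method. The following is only a sketch: the name WrapTerms and its signature are hypothetical and not part of the original answer. Because the counter is a local inside the method, each call starts numbering at 1 again.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: wraps every case-insensitive match of any term
// in a span with an incrementing id (m_1, m_2, ...).
static string WrapTerms(string input, string[] terms)
{
    // Longest terms first, so a longer term is tried before a shorter
    // one that happens to be its prefix; Regex.Escape keeps terms literal.
    string pattern = String.Join("|", terms.OrderByDescending(t => t.Length)
                                           .Select(Regex.Escape));

    int match = 1; // local counter captured by the lambda below

    // The match evaluator is invoked once per match, so match++ is
    // evaluated for every replacement rather than once up front.
    return Regex.Replace(input, pattern,
        m => String.Format("<span id=\"m_{0}\">{1}</span>", match++, m.Value),
        RegexOptions.IgnoreCase);
}

string result = WrapTerms("<p>Hello world, hello world</p>",
                          new[] { "hello", "world" });
Console.WriteLine(result);
// => <p><span id="m_1">Hello</span> <span id="m_2">world</span>, <span id="m_3">hello</span> <span id="m_4">world</span></p>
```

Note that, like the code above, this matches terms anywhere in the input, including inside longer words or inside tag names and attributes, so it assumes the terms do not collide with the markup.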