My MatchCollection contains the same match more than once. Like this:
string text = @"match match match";
Regex R = new Regex("match");
MatchCollection M = R.Matches(text);
How can I remove the duplicate matches, and what is the fastest way to do it?
Assume "duplicate" here means that the match contains the exact same string.
If you are using .NET 3.5 or later (for example, 4.7), LINQ can be used to remove the duplicate matches.
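using System;
using System.Linq;
using System.Text.RegularExpressions;

string data = "abc match match abc";
Console.WriteLine(string.Join(", ",
    Regex.Matches(data, @"([^\s]+)")
         .OfType<Match>()        // MatchCollection is non-generic on .NET Framework
         .Select(m => m.Value)   // same as m.Groups[0].Value
         .Distinct()
         .ToArray()));           // .NET 3.5 only has string.Join(string, string[])
// Outputs abc, match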
Place the matches into a hashtable, then extract the keys:
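using System;
using System.Collections; // Hashtable, DictionaryEntry
using System.Text.RegularExpressions;

string data = "abc match match abc";
MatchCollection mc = Regex.Matches(data, @"[^\s]+");
Hashtable hash = new Hashtable();
foreach (Match mt in mc)
{
    string foundMatch = mt.ToString();
    // Only the first occurrence of each string is stored as a key.
    if (!hash.Contains(foundMatch))
        hash.Add(foundMatch, string.Empty);
}
// Outputs abc and match (Hashtable does not guarantee the order).
foreach (DictionaryEntry element in hash)
    Console.WriteLine(element.Key);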
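On .NET 3.5 or later, the same idea can also be written with the generic HashSet<string>, which needs no dummy values. What follows is a minimal sketch, assuming the distinct strings should come back in first-seen order. The names MatchDeduper and DistinctMatches are made up for this example. Whether it is the fastest option depends on the input, so measure before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MatchDeduper
{
    // Hypothetical helper: returns each distinct match value once, in first-seen order.
    public static List<string> DistinctMatches(string input, string pattern)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            // HashSet<T>.Add returns false when the value is already present.
            if (seen.Add(m.Value))
                result.Add(m.Value);
        }
        return result;
    }

    static void Main()
    {
        List<string> unique = DistinctMatches("abc match match abc", @"[^\s]+");
        Console.WriteLine(string.Join(", ", unique.ToArray())); // abc, match
    }
}
```

Each match is looked up once in the set, and the separate list keeps the original order, which Hashtable does not guarantee.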
Try a pattern with a backreference to the captured word:
Regex rx = new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b", RegexOptions.Compiled);
string text = @"match match match";
MatchCollection matches = rx.Matches(text);
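It is worth checking what this pattern returns on the sample text. It finds a word that is immediately followed by the same word, so it reports adjacent repeats rather than producing a de-duplicated list. Here is a small sketch to confirm that. The names RepeatDemo and FindAdjacentRepeats are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RepeatDemo
{
    // Same pattern as above: a word, whitespace, then the same word again.
    static readonly Regex Rx =
        new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b", RegexOptions.Compiled);

    // Illustrative helper: returns the repeated word for every adjacent repeat found.
    public static List<string> FindAdjacentRepeats(string text)
    {
        var words = new List<string>();
        foreach (Match m in Rx.Matches(text))
            words.Add(m.Groups["word"].Value);
        return words;
    }

    static void Main()
    {
        // "match match" is consumed as one match; the third "match" has no partner left.
        Console.WriteLine(FindAdjacentRepeats("match match match").Count);  // 1
        // The two "abc" words are not adjacent, so only "match" is reported.
        Console.WriteLine(FindAdjacentRepeats("abc match match abc")[0]);   // match
    }
}
```

If the goal is a list of unique strings, the set-based approaches above are a closer fit. This pattern is better suited to spotting doubled words.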