
Fastest way to implement Duplicate Character Removal in String (C#)

Tags: string, c#, .net

In C#, what is the fastest way to detect duplicate characters in a String and remove them (removal including 1st instance of the duplicated character)?

Example Input: nbHHkRvrXbvkn

Example Output: RrX

Asked by Alex on Aug 27 '09 at 21:08



2 Answers

Fastest as in fewest-lines-of-code:

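// requires: using System.Linq;
var s = "nbHHkRvrXbvkn";
// Characters that occur more than once. s.Count rescans the whole string for each character, so this is O(n^2).
var duplicates = s.Where(ch => s.Count(c => c == ch) > 1);
// Set difference: the distinct characters of s that are not duplicated.
var result = new string(s.Except(duplicates).ToArray()); // = "RrX"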

Fastest as in fastest-performance would probably be something like this (does not preserve order):

var h1 = new HashSet<char>();
var h2 = new HashSet<char>();

foreach (var ch in "nbHHkRvrXbvkn")
{
    if (!h1.Add(ch))
    {
        h2.Add(ch);
    }
}

h1.ExceptWith(h2); // remove every character that was seen more than once

var chars = new char[h1.Count];
h1.CopyTo(chars);
var result = new string(chars); // contains R, r and X; HashSet<char> does not guarantee their order
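
If order matters, the duplicate set built by the loop above can also be used to filter the original string in a second pass. The following is only a sketch of that idea, not part of the original answer and not benchmarked here; the class and method names (UniqueChars, RemoveDuplicatedOrdered) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class UniqueChars
{
    // Hypothetical helper name; not part of the original answer.
    public static string RemoveDuplicatedOrdered(string s)
    {
        var seen = new HashSet<char>();
        var duplicated = new HashSet<char>();

        // First pass: Add returns false when the character was already seen.
        foreach (var ch in s)
        {
            if (!seen.Add(ch))
            {
                duplicated.Add(ch);
            }
        }

        // Second pass: keep characters that never repeated, in input order.
        var sb = new StringBuilder(s.Length);
        foreach (var ch in s)
        {
            if (!duplicated.Contains(ch))
            {
                sb.Append(ch);
            }
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(RemoveDuplicatedOrdered("nbHHkRvrXbvkn")); // prints "RrX"
    }
}
```

The second pass costs one extra scan of the input, so this trades a little speed for a deterministic result.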

Performance test

When in doubt -- test it :)

Yuriy Faktorovich's answer        00:00:00.2360900
Luke's answer                     00:00:00.2225683
My 'few lines' answer             00:00:00.5318395
My 'fast' answer                  00:00:00.1842144
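
The answer does not show how these timings were taken. A minimal harness along the following lines, using System.Diagnostics.Stopwatch, is one way to run a comparison like this. The helper name Bench.Time, the warm-up call, and the iteration count of 100000 are assumptions for illustration, not the setup behind the numbers above.

```csharp
using System;
using System.Diagnostics;

static class Bench
{
    // Hypothetical helper: times `iterations` calls of `method` on `input`.
    public static TimeSpan Time(Func<string, string> method, string input, int iterations)
    {
        method(input); // warm-up call so JIT compilation is not measured

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            method(input);
        }
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        // Placeholder method under test; swap in any implementation from this thread.
        Func<string, string> identity = s => s;
        // The iteration count is an arbitrary assumption.
        Console.WriteLine(Time(identity, "nbHHkRvrXbvkn", 100000));
    }
}
```

Results from a harness like this depend heavily on input length, runtime version, and build configuration, so measure with inputs that resemble your own data.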
Answered by dtb on Oct 16 '22 at 15:10


Here is a pretty fast one that preserves order, although I'd be a little worried about the overhead of LINQ's GroupBy and Where:

string s = "nbHHkRvrXbvkn";
Console.WriteLine( 
    s.ToCharArray()
        .GroupBy(c => c)
        .Where(g => g.Count() == 1)
        .Aggregate(new StringBuilder(), (b, g) => b.Append(g.Key)));

Edit: This one beats Luke's in some cases. It is still slower than dtb's, but it preserves the order:

private static string MyMethod(string s)
{
    StringBuilder sb = new StringBuilder(s.Length);
    foreach (var g in s.ToCharArray().GroupBy(c => c))
        if (g.Count() == 1) sb.Append(g.Key);

    return sb.ToString();
}
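
If the cost of GroupBy is a concern, the same order-preserving result can be obtained without LINQ by counting occurrences in a flat array indexed by character code. This is a sketch of an alternative technique, not the method from this answer and not measured against the timings in this thread; the names CountingVariant and KeepSingles are hypothetical.

```csharp
using System;
using System.Text;

static class CountingVariant
{
    // Hypothetical helper name; not part of the original answer.
    public static string KeepSingles(string s)
    {
        // One counter per possible char value (65536 entries).
        var counts = new int[char.MaxValue + 1];

        // First pass: count how often each character occurs.
        foreach (var ch in s)
        {
            counts[ch]++;
        }

        // Second pass: append characters that occurred exactly once, in input order.
        var sb = new StringBuilder(s.Length);
        foreach (var ch in s)
        {
            if (counts[ch] == 1)
            {
                sb.Append(ch);
            }
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(KeepSingles("nbHHkRvrXbvkn")); // prints "RrX"
    }
}
```

The counter array is allocated on every call, which may outweigh the savings for very short strings; reusing a buffer or using a Dictionary<char, int> are possible variations.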
Answered by Yuriy Faktorovich on Oct 16 '22 at 14:10