In C#, what is the fastest way to detect duplicate characters in a string and remove them (removal includes the first instance of each duplicated character)?

Example input: nbHHkRvrXbvkn
Example output: RrX


Fastest way to implement Duplicate Character Removal in String (C#)

2 Answers

Fastest as in fewest-lines-of-code:

var s = "nbHHkRvrXbvkn";
var duplicates = s.Where(ch => s.Count(c => c == ch) > 1);
var result = new string(s.Except(duplicates).ToArray()); // = "RrX"

Fastest as in fastest-performance would probably be something like this (does not preserve order):

var h1 = new HashSet<char>();
var h2 = new HashSet<char>();

foreach (var ch in "nbHHkRvrXbvkn")
{
    if (!h1.Add(ch))
    {
        h2.Add(ch);
    }
}

h1.ExceptWith(h2); // remove duplicates

var chars = new char[h1.Count];
h1.CopyTo(chars);
var result = new string(chars); // e.g. "RrX" (character order is not guaranteed)

Performance test

When in doubt -- test it :)

Yuriy Faktorovich's answer        00:00:00.2360900
Luke's answer                     00:00:00.2225683
My 'few lines' answer             00:00:00.5318395
My 'fast' answer                  00:00:00.1842144
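
The timings above were posted without the harness that produced them. As a rough sketch only, a comparison like this could be set up with Stopwatch. The class name, the method names, the warm-up call and the iteration count are all assumptions for illustration, and this is not the code that produced the numbers above:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class DuplicateRemovalBenchmark
{
    // The 'few lines' LINQ version from this answer, wrapped in a method.
    public static string FewLines(string s)
    {
        var duplicates = s.Where(ch => s.Count(c => c == ch) > 1);
        return new string(s.Except(duplicates).ToArray());
    }

    // The two-HashSet version from this answer, wrapped in a method.
    public static string Fast(string s)
    {
        var seen = new HashSet<char>();
        var repeated = new HashSet<char>();
        foreach (var ch in s)
        {
            // Add returns false when the character was already seen.
            if (!seen.Add(ch)) repeated.Add(ch);
        }
        seen.ExceptWith(repeated);
        var chars = new char[seen.Count];
        seen.CopyTo(chars);
        return new string(chars);
    }

    // Hypothetical helper: runs 'method' on 'input' the given number of
    // times and returns the total elapsed time.
    public static TimeSpan Measure(Func<string, string> method, string input, int iterations)
    {
        method(input); // warm-up call so JIT compilation is not timed
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            method(input);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Calling DuplicateRemovalBenchmark.Measure(DuplicateRemovalBenchmark.Fast, "nbHHkRvrXbvkn", 100000) and printing the result gives one row of such a table. Absolute numbers depend on the machine, the runtime and the build configuration, so only the relative ordering is worth comparing.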


answered Oct 16 '22 15:10

dtb

Here is a pretty fast one preserving order. But I'd be a little worried about how LINQ does Group and Where:

string s = "nbHHkRvrXbvkn";
Console.WriteLine( 
    s.ToCharArray()
        .GroupBy(c => c)
        .Where(g => g.Count() == 1)
        .Aggregate(new StringBuilder(), (b, g) => b.Append(g.Key)));

Edit: This one beats Luke's in some cases. It is still slower than dtb's, but it preserves the order:

private static string MyMethod(string s)
{
    StringBuilder sb = new StringBuilder(s.Length);
    foreach (var g in s.ToCharArray().GroupBy(c => c))
        if (g.Count() == 1) sb.Append(g.Key);

    return sb.ToString();
}
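
For comparison, the same order-preserving idea can be written without LINQ as two passes over the string: count every character first, then keep the ones counted exactly once. This is only a sketch. The class and method names are made up for illustration, it needs C# 7 or later for the inline out variable, and it was not part of the timings quoted on this page:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class UniqueChars
{
    // Hypothetical helper: keeps only the characters that occur exactly
    // once in 's', in their original order.
    public static string RemoveAllDuplicated(string s)
    {
        // First pass: count the occurrences of each character.
        var counts = new Dictionary<char, int>();
        foreach (var ch in s)
        {
            counts.TryGetValue(ch, out int n); // n is 0 when ch is not yet present
            counts[ch] = n + 1;
        }

        // Second pass: append characters seen exactly once, preserving order.
        var sb = new StringBuilder(s.Length);
        foreach (var ch in s)
        {
            if (counts[ch] == 1) sb.Append(ch);
        }
        return sb.ToString();
    }
}
```

UniqueChars.RemoveAllDuplicated("nbHHkRvrXbvkn") returns "RrX". Whether it is faster than the GroupBy version would have to be measured.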

answered Oct 16 '22 14:10

Yuriy Faktorovich


Tags: string, c#, .net

Alex
