
Fastest way to remove chars from string


I have a string from which I need to remove the following characters: '\r', '\n', and '\t'. I tried three different ways of removing them and benchmarked each one to find the fastest.

Here are the methods and their execution times when I ran each of them 1000000 times:

String.Replace: this should be the fastest solution when there are only 1 or 2 characters to remove, but it takes longer as I add more characters.

str = str.Replace("\r", string.Empty).Replace("\n", string.Empty).Replace("\t", string.Empty); 

Execution time = 1695

String.Split with Aggregate: for 1 or 2 characters this was slower than String.Replace, but for 3 characters it performed better.

string[] split = str.Split(new char[] { '\t', '\r', '\n' }, StringSplitOptions.None);
str = split.Aggregate<string>((str1, str2) => str1 + str2);

Execution time = 1030

Regex.Replace: the slowest of all, even with 1 character. Maybe my regular expression is not the best.

str = Regex.Replace(str, "[\r\n\t]", string.Empty, RegexOptions.Compiled); 

Execution time = 3500

These are the three solutions I came up with. Does anyone here know a better and faster solution, or any improvements I can make to this code?

String that I used for benchmarking:

StringBuilder builder = new StringBuilder();
builder.AppendFormat("{0}\r\n{1}\t\t\t\r\n{2}\t\r\n{3}\r\n{4}\t\t\r\n{5}\r\n{6}\r\n{7}\r\n{8}\r\n{9}",
    "SELECT ",
    "[Extent1].[CustomerID] AS [CustomerID], ",
    "[Extent1].[NameStyle] AS [NameStyle], ",
    "[Extent1].[Title] AS [Title], ",
    "[Extent1].[FirstName] AS [FirstName], ",
    "[Extent1].[MiddleName] AS [MiddleName], ",
    "[Extent1].[LastName] AS [LastName], ",
    "[Extent1].[Suffix] AS [Suffix], ",
    "[Extent1].[CompanyName] AS [CompanyName], ",
    "[Extent1].[SalesPerson] AS [SalesPerson], ");
string str = builder.ToString();
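The timing loop itself is not shown above. As a rough sketch only, a measurement of this kind could be set up with `System.Diagnostics.Stopwatch` as below. The class name `StripBenchmark`, the helper `TimeMs`, and the three wrapper methods are hypothetical names introduced for illustration; the wrappers just restate the three approaches from the question.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

public static class StripBenchmark
{
    // Hypothetical helper: runs `strip` on `input` `iterations` times
    // and returns the elapsed wall-clock time in milliseconds.
    public static long TimeMs(Func<string, string> strip, string input, int iterations)
    {
        strip(input); // warm-up call so JIT compilation is not part of the measurement

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            strip(input);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Approach 1 from the question: chained String.Replace calls.
    public static string ViaReplace(string s) =>
        s.Replace("\r", string.Empty).Replace("\n", string.Empty).Replace("\t", string.Empty);

    // Approach 2 from the question: split on the characters, then glue the pieces together.
    public static string ViaSplit(string s) =>
        s.Split(new char[] { '\t', '\r', '\n' }, StringSplitOptions.None)
         .Aggregate((str1, str2) => str1 + str2);

    // Approach 3 from the question: a regular expression with a character class.
    public static string ViaRegex(string s) =>
        Regex.Replace(s, "[\r\n\t]", string.Empty, RegexOptions.Compiled);
}
```

A call such as `StripBenchmark.TimeMs(StripBenchmark.ViaReplace, str, 1000000)` would then time one approach. Absolute numbers depend on the machine, the runtime, and the build configuration, so only the relative ordering is worth comparing.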
ata asked Feb 02 '10 07:02



2 Answers

Here's the uber-fast unsafe version, version 2.

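    // Requires compiling with unsafe code enabled.
    public static unsafe string StripTabsAndNewlines(string s)
    {
        int len = s.Length;
        // The buffer lives on the stack, so a very long input string
        // can overflow the stack.
        char* newChars = stackalloc char[len];
        char* currentChar = newChars;

        for (int i = 0; i < len; ++i)
        {
            char c = s[i];
            switch (c)
            {
                case '\r':
                case '\n':
                case '\t':
                    continue; // skip the unwanted character
                default:
                    *currentChar++ = c; // keep it and advance the write pointer
                    break;
            }
        }
        // Build the result from only the characters that were kept.
        return new string(newChars, 0, (int)(currentChar - newChars));
    }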

And here are the benchmarks (time to strip 1000000 strings in ms)

    cornerback84's String.Replace:    9433
    Andy West's String.Concat:        4756
    AviJ's char array:                1374
    Matt Howells' char pointers:      1163
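Because the pointer version puts the whole buffer on the stack, a very long input can overflow it. One possible safe variant, assuming C# 7.2 or later and a runtime that provides `Span<char>` and the `string(ReadOnlySpan<char>)` constructor (.NET Core 2.1 or later), is sketched below. The class name `SpanStripper` and the 256-character threshold are assumptions made for illustration, and this variant was not part of the benchmarks above.

```csharp
using System;

public static class SpanStripper
{
    // Assumed threshold (in chars) below which the buffer is stack-allocated.
    private const int MaxStackChars = 256;

    public static string StripTabsAndNewlines(string s)
    {
        // Small inputs use the stack; larger ones fall back to a heap array.
        Span<char> buffer = s.Length <= MaxStackChars
            ? stackalloc char[s.Length]
            : new char[s.Length];

        int written = 0;
        foreach (char c in s)
        {
            if (c != '\r' && c != '\n' && c != '\t')
            {
                buffer[written++] = c; // keep the character
            }
        }

        // Build the result from only the characters that were kept.
        ReadOnlySpan<char> kept = buffer.Slice(0, written);
        return new string(kept);
    }
}
```

This needs no `unsafe` keyword and no special compiler switch, at the cost of a heap allocation for inputs above the threshold.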
Matt Howells answered Oct 14 '22 13:10


I believe you'll get the best possible performance by composing the new string as a char array and only converting it to a string when you're done, like so:

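// Copies every character except '\r', '\n', and '\t' into a buffer,
// then builds the result string once at the end.
static string StripTabsAndNewlines(string s)
{
    int len = s.Length;
    char[] s2 = new char[len]; // worst case: nothing is removed
    int i2 = 0;                // number of characters kept so far
    for (int i = 0; i < len; i++)
    {
        char c = s[i];
        if (c != '\r' && c != '\n' && c != '\t')
            s2[i2++] = c;
    }
    return new String(s2, 0, i2);
}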

EDIT: using String(s2, 0, i2) instead of Trim(), per suggestion
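The question mentions that the cost grows as more characters are added to the removal list. The same char-array idea could be generalized to a caller-supplied set of characters, roughly as sketched below. The names `CharStripper` and `RemoveChars` are hypothetical, and the linear `Array.IndexOf` lookup is only a reasonable choice when the set is small.

```csharp
using System;

public static class CharStripper
{
    // Hypothetical generalization: removes every character listed in `toRemove`.
    public static string RemoveChars(string s, params char[] toRemove)
    {
        char[] buffer = new char[s.Length]; // worst case: nothing is removed
        int written = 0;

        foreach (char c in s)
        {
            // Keep the character only if it is not in the removal set.
            if (Array.IndexOf(toRemove, c) < 0)
            {
                buffer[written++] = c;
            }
        }

        return new string(buffer, 0, written);
    }
}
```

For example, `CharStripper.RemoveChars("a\tb\r\nc", '\r', '\n', '\t')` returns `"abc"`.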

AviJ answered Oct 14 '22 12:10