What is the fastest way to iterate through individual characters in a string in C#?


Tags: string, c#

The title is the question. Below is my attempt to answer it through research. But I don't trust my uninformed research so I still pose the question (What is the fastest way to iterate through individual characters in a string in C#?).

Occasionally I want to cycle through the characters of a string one-by-one, such as when parsing for nested tokens -- something which cannot be done with regular expressions. I am wondering what the fastest way is to iterate through the individual characters in a string, particularly very large strings.

I did a bunch of testing myself and my results are below. However, there are many readers with much more in-depth knowledge of the .NET CLR and C# compiler, so I don't know if I'm missing something obvious or if I made a mistake in my test code. So I solicit your collective response. If anyone has insight into how the string indexer actually works, that would be very helpful. (Is it a C# language feature compiled into something else behind the scenes? Or something built into the CLR?)

The first method using a stream was taken directly from the accepted answer from the thread: how to generate a stream from a string?

Tests

longString is a 99.1 million character string consisting of 89 copies of the plain-text version of the C# language specification. Results shown are for 20 iterations. Where there is a 'startup' time (such as for the first iteration of the implicitly created array in method #3), I tested that separately, such as by breaking from the loop after the first iteration.
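The timing harness itself is not shown in the question. As a rough illustration only, per-iteration timings and their standard deviation could be collected along these lines, assuming a `Stopwatch`-based approach. The names `TimingSketch`, `TimeRuns` and `StdDev` are hypothetical and not part of the original test code.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper; the original measurement code is not shown in the question.
public static class TimingSketch
{
    // Runs `body` `iterations` times and returns the elapsed milliseconds of each run.
    public static double[] TimeRuns(Action body, int iterations)
    {
        var samples = new double[iterations];
        var stopwatch = new Stopwatch();
        for (int run = 0; run < iterations; run++)
        {
            stopwatch.Restart();
            body();
            stopwatch.Stop();
            samples[run] = stopwatch.Elapsed.TotalMilliseconds;
        }
        return samples;
    }

    // Population standard deviation of the samples, in the same unit as the samples.
    public static double StdDev(double[] samples)
    {
        double mean = samples.Average();
        return Math.Sqrt(samples.Sum(s => (s - mean) * (s - mean)) / samples.Length);
    }
}
```

A one-off 'startup' cost such as `ToCharArray()` could be timed separately by passing only that call as `body` with `iterations` set to 1.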

Results

From my tests, caching the string in a char array using the ToCharArray() method is the fastest for iterating over the entire string. The ToCharArray() call is an upfront expense, and subsequent access to individual characters is slightly faster than the built-in index accessor.

                                               milliseconds
                                     ---------------------------------
Method                               Startup  Iteration  Total  StdDev
-----------------------------------  -------  ---------  -----  ------
1 index accessor                           0        602    602       3
2 explicit convert ToCharArray           165        410    582       3
3 foreach (c in string.ToCharArray)      168        455    623       3
4 StringReader                             0       1150   1150      25
5 StreamWriter => Stream                 405       1940   2345      20
6 GetBytes() => StreamReader             385       2065   2450      35
7 GetBytes() => BinaryReader             385       5465   5850      80
8 foreach (c in string)                    0        960    960       4

Update: Per @Eric's comment, here are results for 100 iterations over a more normal 1.1 M char string (one copy of the C# spec). Indexer and char arrays are still fastest, followed by foreach(char in string), followed by stream methods.

                                               milliseconds
                                     ---------------------------------
Method                               Startup  Iteration  Total  StdDev
-----------------------------------  -------  ---------  -----  ------
1 index accessor                           0        6.6    6.6    0.11
2 explicit convert ToCharArray           2.4        5.0    7.4    0.30
3 for(c in string.ToCharArray)           2.4        4.7    7.1    0.33
4 StringReader                             0       14.0   14.0    1.21
5 StreamWriter => Stream                 5.3       21.8   27.1    0.46
6 GetBytes() => StreamReader             4.4       23.6   28.0    0.65
7 GetBytes() => BinaryReader             5.0       61.8   66.8    0.79
8 foreach (c in string)                    0       10.3   10.3    0.11

Code Used (tested separately; shown together for brevity)

    //1 index accessor
    int strLength = longString.Length;
    for (int i = 0; i < strLength; i++) { c = longString[i]; }

    //2 explicit convert ToCharArray
    int strLength = longString.Length;
    char[] charArray = longString.ToCharArray();
    for (int i = 0; i < strLength; i++) { c = charArray[i]; }

    //3 for(c in string.ToCharArray)
    foreach (char c in longString.ToCharArray()) { }

    //4 use StringReader
    int strLength = longString.Length;
    StringReader sr = new StringReader(longString);
    for (int i = 0; i < strLength; i++) { c = Convert.ToChar(sr.Read()); }

    // Note for methods 5-7: stream.Position is a byte offset into the
    // MemoryStream, not a count of characters read.

    //5 StreamWriter => StreamReader
    int strLength = longString.Length;
    MemoryStream stream = new MemoryStream();
    StreamWriter writer = new StreamWriter(stream);
    writer.Write(longString);
    writer.Flush();
    stream.Position = 0;
    StreamReader str = new StreamReader(stream);
    while (stream.Position < strLength) { c = Convert.ToChar(str.Read()); }

    //6 GetBytes() => StreamReader
    int strLength = longString.Length;
    MemoryStream stream = new MemoryStream(Encoding.Unicode.GetBytes(longString));
    StreamReader str = new StreamReader(stream);
    while (stream.Position < strLength) { c = Convert.ToChar(str.Read()); }

    //7 GetBytes() => BinaryReader
    int strLength = longString.Length;
    MemoryStream stream = new MemoryStream(Encoding.Unicode.GetBytes(longString));
    BinaryReader br = new BinaryReader(stream, Encoding.Unicode);
    while (stream.Position < strLength) { c = br.ReadChar(); }

    //8 foreach (c in string)
    foreach (char c in longString) { }
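For comparison, the three in-memory variants (methods 1, 2 and 8) can be written as self-contained functions. This is an illustrative sketch, not the code that produced the timings above: each loop sums the character codes so that it has an observable result that can be checked against the other variants. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical wrappers around methods 1, 2 and 8; not the original benchmark code.
public static class IterationSketch
{
    // Method 1: string index accessor.
    public static long SumByIndexer(string text)
    {
        long sum = 0;
        int length = text.Length;
        for (int i = 0; i < length; i++) { sum += text[i]; }
        return sum;
    }

    // Method 2: copy to a char[] first, then index the array.
    public static long SumByCharArray(string text)
    {
        long sum = 0;
        char[] chars = text.ToCharArray();
        for (int i = 0; i < chars.Length; i++) { sum += chars[i]; }
        return sum;
    }

    // Method 8: foreach directly over the string.
    public static long SumByForeach(string text)
    {
        long sum = 0;
        foreach (char c in text) { sum += c; }
        return sum;
    }
}
```

All three functions visit the same UTF-16 code units in the same order, so they should return identical sums for any input string.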

Accepted answer:

I interpreted @CodeInChaos and Ben's notes as follows:

    fixed (char* pString = longString) {
        char* pChar = pString;
        for (int i = 0; i < strLength; i++) {
            c = *pChar;
            pChar++;
        }
    }

Execution for 100 iterations over the short string was 4.4 ms, with a standard deviation below 0.1 ms.


asked Jan 09 '12 19:01

Joshua Honig

1 Answer

Any reason not to include foreach?

    foreach (char c in text)
    {
        ...
    }

Is this really going to be your performance bottleneck, by the way? What proportion of your total running time does the iteration itself take?

answered Sep 18 '22 15:09

Jon Skeet
