String vs byte array, Performance

(This post is about high-frequency-style programming.)

I recently saw on a forum (I think they were discussing Java) that if you have to parse a lot of string data, it's better to use a byte array than a string with split(). The exact post was:

One performance trick that works in any language (C++, Java, C#) is to avoid object creation. It's not the cost of allocation or GC; it's the cost of accessing large memory arrays that don't fit in the CPU cache.

Modern CPUs are much faster than their memory. They stall for many, many cycles on each cache miss. Most of the CPU transistor budget is allocated to reducing this with large caches and lots of tricks.

GPUs solve the problem differently: they keep lots of threads ready to execute to hide memory access latency, have little or no cache, and spend the transistors on more cores.

So, for example, rather than using Strings and split() to parse a message, use byte arrays that can be updated in place. You really want to avoid random memory access over large data structures, at least in the inner loops.

Is he just saying "don't use strings because they're objects and creating objects is costly"? Or is he saying something else?

Does using a byte array ensure the data remains in the cache for as long as possible? When you use a string, is it too large to be held in the CPU cache? Generally, is using primitive data types the best way to write faster code?

asked Oct 24 '11 13:10

user997112

1 Answer

He's saying that if you break a chunk of text up into separate string objects, those string objects have worse locality than the large array of text. Each string, and the array of characters it contains, is going to be somewhere else in memory; they can be spread all over the place. It is likely that the memory cache will have to thrash in and out to access the various strings as you process the data. In contrast, the one large array has the best possible locality, as all the data is in one area of memory, and cache-thrashing will be kept to a minimum.
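
To make the contrast concrete, here is a minimal Java sketch. The class name FieldSum, both method names, and the message format (comma-separated non-negative integers in ASCII) are assumptions made for illustration; none of them come from the quoted post. The first method allocates a String per field through split(); the second walks the one byte array sequentially and creates no objects.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example class. Assumed message format: comma-separated,
// non-negative decimal integers encoded as ASCII, with no validation.
final class FieldSum {

    // Allocating version: split() creates a String[] plus one String per
    // field, and each of those objects may live anywhere on the heap.
    static long sumWithSplit(String message) {
        long total = 0;
        for (String field : message.split(",")) {
            total += Long.parseLong(field);
        }
        return total;
    }

    // In-place version: one sequential pass over a single buffer,
    // with no objects created inside the loop.
    static long sumInPlace(byte[] buf, int length) {
        long total = 0;
        long current = 0;
        for (int i = 0; i < length; i++) {
            byte b = buf[i];
            if (b == ',') {
                total += current;   // field finished, fold it into the sum
                current = 0;
            } else {
                current = current * 10 + (b - '0');
            }
        }
        return total + current;     // add the last field
    }

    public static void main(String[] args) {
        String message = "100,250,7,43";
        byte[] buf = message.getBytes(StandardCharsets.US_ASCII);
        System.out.println(sumWithSplit(message));
        System.out.println(sumInPlace(buf, buf.length));
    }
}
```

Both methods compute the same sum. Whether the in-place version is actually faster depends on message size, the JVM, and the hardware, so it is worth measuring rather than assuming.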

There are limits to this, of course: if the text is very, very large, and you only need to parse out part of it, then those few small strings might fit better in the cache than the large chunk of text.
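
As a rough illustration of that limit, the sketch below scans a large buffer once and copies out only the one field that is needed, so later processing touches a small String rather than the whole chunk of text. The class FieldExtract, the method extractField, and the comma-delimited sample data are all hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example class. Assumed format: comma-delimited ASCII fields.
final class FieldExtract {

    // Returns the field at fieldIndex (0-based) as a small String,
    // or null if the message has fewer fields than that.
    static String extractField(byte[] buf, int length, int fieldIndex) {
        int start = 0;      // offset where the current field begins
        int current = 0;    // index of the current field
        for (int i = 0; i <= length; i++) {
            // A field ends at a comma or at the end of the data.
            if (i == length || buf[i] == ',') {
                if (current == fieldIndex) {
                    return new String(buf, start, i - start, StandardCharsets.US_ASCII);
                }
                current++;
                start = i + 1;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up sample message, for illustration only.
        byte[] buf = "AAPL,101.25,300".getBytes(StandardCharsets.US_ASCII);
        System.out.println(extractField(buf, buf.length, 1));
    }
}
```

Only the requested field is allocated; the scan stops as soon as it is found.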

answered Sep 18 '22 14:09

Ernest Friedman-Hill

Tags: java, c++, c#, oop
