Does allocating objects of the same size improve GC or "new" performance?

Suppose we have to create many small byte arrays. The size varies but is always below 1024 bytes: say 780, 256, 953, and so on.

Will it improve operator new or GC efficiency over time if we always allocate byte[1024] and use only the space needed?

UPD: These are short-lived objects, created for parsing binary protocol messages.

UPD: The number of objects is the same in both cases; it is just the size of each allocation that changes (random vs. always 1024).

In C++ it would matter because of fragmentation and the cost of operator new. But in C#...

Boris asked Dec 29 '11 12:12



1 Answer

Will it improve operator new or GC efficiency over time if we always allocate byte[1024] and use only the space needed?

Maybe. You're going to have to profile it and see.

The way we allocate syntax tree nodes inside the Roslyn compiler is quite interesting, and I'm eventually going to do a blog post about it. Until then, the relevant bit to your question is this interesting bit of trivia. Our allocation pattern typically involves allocating an "underlying" immutable node (which we call the "green" node) and a "facade" mutable node that wraps it (which we call the "red" node). As you might imagine, it is frequently the case that we end up allocating these in pairs: green, red, green, red, green, red.

The green nodes are persistent and therefore long-lived; the facades are short-lived, because they are discarded on every edit. Therefore it is frequently the case that, once the facades die, the garbage collector sees the heap as green / hole / green / hole / green / hole, and then the green nodes move up a generation.

Our assumption had always been that making data structures smaller will always improve GC performance. Smaller structures equals less memory allocated, equals less collection pressure, equals fewer collections, equals more performance, right? But we discovered through profiling that making the red nodes smaller in this scenario actually decreases GC performance. Something about the particular size of the holes affects the GC in some odd way; not being an expert on the internals of the garbage collector, it is opaque to me why that should be.

So is it possible that changing the size of your allocations can affect the GC in some unforeseen way? Yes, it is possible. But, first off, it is unlikely, and second, it is impossible to know whether you are in that situation until you actually try it in real-world scenarios and carefully measure GC performance.
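As a starting point for that kind of measurement, here is a minimal sketch that compares the two allocation patterns from the question. The class name AllocationProbe, the helper Measure, the iteration count, and the random seed are all made up for illustration. It is also a synthetic loop, not a substitute for profiling the real parser under a real workload, so treat whatever numbers it prints as a hint at best.

```csharp
using System;
using System.Diagnostics;

static class AllocationProbe
{
    // Hypothetical helper: allocates `count` short-lived byte arrays whose sizes
    // come from `sizeFor`, and reports the elapsed time together with the number
    // of gen-0 collections that happened while the loop ran.
    public static (TimeSpan Elapsed, int Gen0Collections) Measure(int count, Func<int, int> sizeFor)
    {
        // Start each run from a reasonably clean heap.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        int gen0Before = GC.CollectionCount(0);
        long checksum = 0;
        var stopwatch = Stopwatch.StartNew();

        for (int i = 0; i < count; i++)
        {
            byte[] buffer = new byte[sizeFor(i)];
            buffer[0] = (byte)i;      // touch the buffer so the allocation is really used
            checksum += buffer[0];
        }

        stopwatch.Stop();
        GC.KeepAlive(checksum);
        return (stopwatch.Elapsed, GC.CollectionCount(0) - gen0Before);
    }

    public static void Main()
    {
        const int count = 5_000_000;        // arbitrary; tune to your message rate
        var random = new Random(12345);     // fixed seed so runs are comparable

        // Sizes between 1 and 1023 bytes, as in the question.
        var varied = Measure(count, _ => random.Next(1, 1024));
        // Always 1024 bytes, using only part of each buffer.
        var fixedSize = Measure(count, _ => 1024);

        Console.WriteLine($"varied sizes: {varied.Elapsed.TotalMilliseconds:F0} ms, gen-0 collections: {varied.Gen0Collections}");
        Console.WriteLine($"fixed 1024:   {fixedSize.Elapsed.TotalMilliseconds:F0} ms, gen-0 collections: {fixedSize.Gen0Collections}");
    }
}
```

Run it in a release build, several times, and on the same runtime and GC mode (workstation or server) that production uses; the results can differ noticeably between configurations.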

And of course, you might not be gated on GC performance. Roslyn does so many small allocations that it is crucial that we tune our GC-impacting behaviour, but we do an insane number of small allocations. The vast majority of .NET programs do not stress the GC the way we do. If you are in the minority of programs that stress the GC in interesting ways then there is no way around it; you're going to have to profile and gather empirical data, just like we do on the Roslyn team.

If you are not in that minority, then don't worry about GC performance; you probably have a bigger problem somewhere else that you should be dealing with first.

Eric Lippert answered Nov 06 '22 05:11
