Extensive use of LOH causes significant performance issue

We have a web service using Web API 2 and .NET 4.5 on Server 2012. We were seeing occasional latency spikes of 10-30 ms with no apparent cause. We were able to track the problematic piece of code down to the LOH and GC.

There is some text which we convert to its UTF-8 byte representation (actually, the serialization library we use does that). As long as the text is shorter than 85,000 bytes, latency is stable and short: ~0.2 ms on average and at the 99th percentile. As soon as the 85,000-byte boundary is crossed, average latency increases to ~1 ms while the 99th percentile jumps to 16-20 ms. The profiler shows that most of the time is spent in GC. To confirm, if I put a GC.Collect between iterations, the measured latency goes back to 0.2 ms.

I have two questions:

  1. Where does the latency come from? As far as I understand, the LOH isn't compacted. The SOH is compacted, but it doesn't show this latency.
  2. Is there a practical way to work around this? Note that I can't control the size of the data or make it smaller.

Here is the test I used to measure the latency:

public void PerfTestMeasureGetBytes()
{
    var text = File.ReadAllText(@"C:\Temp\ContactsModelsInferences.txt");
    var smallText = text.Substring(0, 85000 + 100);
    int count = 1000;
    List<double> latencies = new List<double>(count);
    for (int i = 0; i < count; i++)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        var bytes = Encoding.UTF8.GetBytes(smallText);
        sw.Stop();
        latencies.Add(sw.Elapsed.TotalMilliseconds);

        //GC.Collect(2, GCCollectionMode.Default, true);
    }

    latencies.Sort();
    Console.WriteLine("Average: {0}", latencies.Average());
    Console.WriteLine("99%: {0}", latencies[(int)(latencies.Count * 0.99)]);
}
asked Dec 09 '14 by Alon Catz


2 Answers

The performance problems usually come from two areas: allocation and fragmentation.

Allocation

The runtime guarantees zeroed memory, so it spends cycles clearing it. When you allocate a large object, that's a lot of memory to clear, and it starts to add milliseconds to a single allocation (when, let's be honest, simple allocation in .NET is actually very fast, so we usually never care about it).

Fragmentation

Fragmentation occurs when LOH objects are allocated and then reclaimed. Until recently, the GC could not reorganise the memory to remove these old-object "gaps", and thus could only fit the next object into a gap if it was the same size or smaller. Recently, the GC has been given the ability to compact the LOH, which removes this issue, but it costs time during compaction.
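Since .NET 4.5.1 you can request that compaction yourself through GCSettings; here is a minimal sketch (the one-shot setting is standard API, but whether and when you trigger it is an application-level decision):

// using System.Runtime;
// One-shot request (.NET 4.5.1+): compact the LOH during the next
// blocking generation-2 collection; the setting then reverts to Default.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(); // forces the blocking gen-2 collection that performs the compaction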

My guess is that in your case you are suffering from both issues and triggering GC runs, but it depends on how often your code attempts to allocate items on the LOH. If you are doing lots of allocations, try the object-pooling route. If you cannot control a pool effectively (lumpy object lifetimes or disparate usage patterns), try chunking the data you are working with so you avoid the LOH completely.


Your Options

I've encountered two approaches to the LOH:

  • Avoid it.
  • Use it, but realise you are using it and manage it explicitly.

Avoid it

This involves chunking your large object (usually an array of some sort) into, well, chunks that each fall under the LOH threshold. We do this when serialising large object streams. It works well, but an implementation would be specific to your environment, so I'm hesitant to provide a coded example.
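That said, purely to give a flavour of the idea, here is a minimal sketch (the chunk size, method name, and use of a Stream are illustrative, not from any particular serialiser): stream the string through small reusable buffers so no single allocation ever reaches the 85,000-byte threshold.

// using System; using System.IO; using System.Text;
static void WriteUtf8Chunked(string text, Stream output)
{
    const int CharChunk = 16 * 1024; // small enough that neither buffer hits the LOH
    var charBuffer = new char[CharChunk];
    var byteBuffer = new byte[Encoding.UTF8.GetMaxByteCount(CharChunk)]; // ~48 KB, still under 85,000
    var encoder = Encoding.UTF8.GetEncoder(); // stateful: handles surrogate pairs split across chunks

    for (int offset = 0; offset < text.Length; offset += CharChunk)
    {
        int charCount = Math.Min(CharChunk, text.Length - offset);
        text.CopyTo(offset, charBuffer, 0, charCount);
        bool last = offset + charCount >= text.Length;
        int byteCount = encoder.GetBytes(charBuffer, 0, charCount, byteBuffer, 0, last);
        output.Write(byteBuffer, 0, byteCount);
    }
}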

Use it

A simple way to tackle both allocation and fragmentation is long-lived objects. Explicitly allocate an empty array (or arrays) large enough to accommodate your largest object, and don't get rid of it. Leave it around and re-use it like an object pool. You pay for the allocation up-front, either on first use or during application idle time, but you pay nothing for re-allocation (because you aren't re-allocating) and you lessen fragmentation because you aren't constantly allocating and reclaiming items (which is what causes the gaps in the first place).

That said, a halfway house may be in order: reserve a section of memory up-front for an object pool. Done early, these allocations should be contiguous in memory, so you won't get any gaps, and the tail end of the available memory is left for uncontrolled items. Do beware though that this obviously has an impact on the working set of your application: an object pool takes space whether it is used or not.
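As a minimal sketch of that halfway house (class and member names are illustrative, not from the answer): allocate a fixed set of large buffers once, early in the process lifetime, then rent and return them instead of re-allocating.

// using System.Collections.Concurrent;
public sealed class LargeBufferPool
{
    private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;

    public LargeBufferPool(int bufferCount, int bufferSize)
    {
        _bufferSize = bufferSize;
        // Done at startup, these allocations land next to each other, so no gaps.
        for (int i = 0; i < bufferCount; i++)
            _buffers.Add(new byte[bufferSize]);
    }

    public byte[] Rent()
    {
        byte[] buffer;
        // If the pool is exhausted, fall back to a fresh allocation;
        // it simply joins the pool when it is returned.
        return _buffers.TryTake(out buffer) ? buffer : new byte[_bufferSize];
    }

    public void Return(byte[] buffer)
    {
        _buffers.Add(buffer);
    }
}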


Resources

The LOH is covered a lot out on the web, but pay attention to the date of the resource. The latest .NET versions have given the LOH some love and it has improved. That said, if you are on an older version, the resources on the net are fairly accurate, as the LOH never received any serious updates in the long stretch between its inception and .NET 4.5 (ish).

For example, there is this article from 2008: http://msdn.microsoft.com/en-us/magazine/cc534993.aspx

And a summary of improvements in .NET 4.5: http://blogs.msdn.com/b/dotnet/archive/2011/10/04/large-object-heap-improvements-in-net-4-5.aspx

answered by Adam Houldsworth


In addition to the suggestions below, make sure that you're using the server garbage collector. That doesn't affect how the LOH is used, but my experience is that it significantly reduces the amount of time spent in GC.
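For a self-hosted process, server GC is enabled with <gcServer enabled="true"/> in the <runtime> section of app.config (IIS-hosted ASP.NET configures it in aspnet.config instead). A quick sanity check, as a sketch, to confirm which mode a process actually got:

// using System;
// using System.Runtime;
// Print the effective GC configuration at startup.
Console.WriteLine("Server GC: {0}", GCSettings.IsServerGC);
Console.WriteLine("Latency mode: {0}", GCSettings.LatencyMode);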

The best workaround I found for avoiding large object heap problems is to create a persistent buffer and re-use it. So rather than allocating a new byte array with every call to Encoding.GetBytes, pass the byte array to the method.

In this case, use the GetBytes overload that takes a byte array. Allocate an array that's large enough to hold the bytes for your longest expected string, and keep it around. For example:

// allocate the buffer once, at class scope
private byte[] _theBuffer = new byte[1024*1024];

public void PerfTestMeasureGetBytes()
{
    var text = File.ReadAllText(@"C:\Temp\ContactsModelsInferences.txt");
    var smallText = text.Substring(0, 85000 + 100);
    int count = 1000;
    List<double> latencies = new List<double>(count);
    for (int i = 0; i < count; i++)
    {
        var sw = Stopwatch.StartNew();
        // encode into the persistent buffer instead of allocating a new array
        var numberOfBytes = Encoding.UTF8.GetBytes(smallText, 0, smallText.Length, _theBuffer, 0);
        sw.Stop();
        latencies.Add(sw.Elapsed.TotalMilliseconds);
    }

    latencies.Sort();
    Console.WriteLine("Average: {0}", latencies.Average());
    Console.WriteLine("99%: {0}", latencies[(int)(latencies.Count * 0.99)]);
}
The only problem here is that you have to make sure your buffer is large enough to hold the largest string. What I've done in the past is allocate the buffer at the largest size I expect, but check that it's large enough whenever I go to use it, and re-allocate if it's not. How you do that depends on how rigorous you want to be. When working with primarily Western European text, I'd just double the string length, since UTF-8 needs at most two bytes per character in that range. For example:

string textToConvert = ...
if (_theBuffer.Length < 2*textToConvert.Length)
{
    // reallocate the buffer
    _theBuffer = new byte[2*textToConvert.Length];
}

Another way to do it is to just try the GetBytes and reallocate on failure, then retry. For example:

bool good = false;
int numberOfBytes = 0;
while (!good)
{
    try
    {
        numberOfBytes = Encoding.UTF8.GetBytes(theString, 0, theString.Length, _theBuffer, 0);
        good = true;
    }
    catch (ArgumentException)
    {
        // buffer isn't big enough; find out how much is really needed
        var bytesNeeded = Encoding.UTF8.GetByteCount(theString);
        // and reallocate the buffer
        _theBuffer = new byte[bytesNeeded];
    }
}

If you make the buffer's initial size large enough to accommodate the largest string you expect, you probably won't hit that exception very often, which means the number of times you have to reallocate the buffer will be very small. You could, of course, add some padding to bytesNeeded so that you allocate more, in case you have some other outliers.
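If you'd rather avoid the exception-driven path entirely, here is a variant of the same idea (a sketch, not from the answer; note that GetByteCount has to walk the whole string, so it isn't free either):

// Measure first, grow with some padding only when needed, then encode
// into the persistent buffer. The 25% padding is an arbitrary choice.
int bytesNeeded = Encoding.UTF8.GetByteCount(textToConvert);
if (_theBuffer.Length < bytesNeeded)
{
    _theBuffer = new byte[bytesNeeded + bytesNeeded / 4];
}
int numberOfBytes = Encoding.UTF8.GetBytes(textToConvert, 0, textToConvert.Length, _theBuffer, 0);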

answered by Jim Mischel