
Get total number of allocations in C#

Is there a way to get the total number of allocations (note - number of allocations, not bytes allocated)? It can be either for the current thread, or globally, whichever is easier.

I want to check how many objects a particular function allocates, and while I know about the Debug -> Performance Profiler (Alt+F2), I would like to be able to do it programmatically from inside my program.

// pseudocode
int GetTotalAllocations() {
    ...;
}    
class Foo {
    string bar;
    string baz;
}
public static void Main() {
    int allocationsBefore = GetTotalAllocations();
    PauseGarbageCollector(); // do I need this? I don't want the GC to run during the function and skew the number of allocations
    // Some code that makes allocations.
    var foo = new Foo() { bar = "bar", baz = "baz" };
    ResumeGarbageCollector();
    int allocationsAfter = GetTotalAllocations();
    Console.WriteLine(allocationsAfter - allocationsBefore); // Should print 3 allocations - one for Foo, and 2 for its fields.
}

Also, do I need to pause garbage collection to get accurate data, and can I do that?

Do I need to use the CLR Profiling API to achieve that?

sashoalm asked Apr 07 '20 13:04


2 Answers

You can record every allocation, but the plan to do so from inside your own process is flawed. .NET Core supports in-process ETW data collection, which also makes it possible to record allocation events. See:

  • https://docs.microsoft.com/en-us/dotnet/core/whats-new/dotnet-core-2-2
  • https://devblogs.microsoft.com/dotnet/a-portable-way-to-get-gc-events-in-process-and-no-admin-privilege-with-10-lines-of-code-and-ability-to-dynamically-enable-disable-events/

Starting with .NET Core 2.2, CoreCLR events can now be consumed using the System.Diagnostics.Tracing.EventListener class. These events describe the behavior of such runtime services as GC, JIT, ThreadPool, and interop. These are the same events that are exposed as part of the CoreCLR ETW provider. This allows for applications to consume these events or use a transport mechanism to send them to a telemetry aggregation service. You can see how to subscribe to events in the following code sample:

using System;
using System.Diagnostics.Tracing;

internal sealed class SimpleEventListener : EventListener
{
    // Called whenever an EventSource is created.
    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        // Watch for the .NET runtime EventSource and enable all of its events.
        if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
        {
            EnableEvents(eventSource, EventLevel.Verbose, (EventKeywords)(-1));
        }
    }

    // Called whenever an event is written.
    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        // Write the contents of the event to the console.
        Console.WriteLine($"ThreadID = {eventData.OSThreadId} ID = {eventData.EventId} Name = {eventData.EventName}");
        for (int i = 0; i < eventData.Payload.Count; i++)
        {
            string payloadString = eventData.Payload[i]?.ToString() ?? string.Empty;
            Console.WriteLine($"\tName = \"{eventData.PayloadNames[i]}\" Value = \"{payloadString}\"");
        }
        Console.WriteLine("\n");
    }
}

If you enable only the GC events (keyword 0x1) instead of -1, that should give you all the GC pause times and GC events you need to diagnose the GC yourself in-process.
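As a rough sketch of the GC-only variant, the listener below enables just keyword 0x1 and counts GC starts instead of printing every event. The class name GcOnlyEventListener and the assumption that GC start events have names beginning with "GCStart" are mine, not from the documentation quoted above, so check the event names your runtime actually emits.

```csharp
using System;
using System.Diagnostics.Tracing;
using System.Threading;

// Hypothetical listener: enables only the GC keyword (0x1) and counts GC starts.
internal sealed class GcOnlyEventListener : EventListener
{
    private const EventKeywords GcKeyword = (EventKeywords)0x1;
    private int _gcStarts;

    // Number of GC start events observed so far.
    public int GcStarts => Volatile.Read(ref _gcStarts);

    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
        {
            EnableEvents(eventSource, EventLevel.Informational, GcKeyword);
        }
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        // Assumption: GC start events are named "GCStart..." (for example "GCStart_V2").
        if (eventData.EventName != null && eventData.EventName.StartsWith("GCStart"))
        {
            Interlocked.Increment(ref _gcStarts);
        }
    }
}
```

Create one instance early in the program and read GcStarts later. Events may be delivered on another thread, so the counter can lag slightly behind the GC itself.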

An allocation sampling mechanism has been built into .NET Core and .NET Framework for ages. It samples object allocations at a rate of up to 5 allocation events/s (GC_Alloc_Low) or up to 100 allocation events/s (GC_Alloc_High). At first glance there seems to be no way to get all allocation events, but if you read the .NET Core code

BOOL ETW::TypeSystemLog::IsHeapAllocEventEnabled()
{
    LIMITED_METHOD_CONTRACT;

    return
        // Only fire the event if it was enabled at startup (and thus the slow-JIT new
        // helper is used in all cases)
        s_fHeapAllocEventEnabledOnStartup &&

        // AND a keyword is still enabled.  (Thus people can turn off the event
        // whenever they want; but they cannot turn it on unless it was also on at startup.)
        (s_fHeapAllocHighEventEnabledNow || s_fHeapAllocLowEventEnabledNow);
}

you find that you can get all allocation events via ETW when

  1. ETW allocation profiling is enabled when the process is started (enabling it later will NOT work)
  2. The GC_Alloc_High AND GC_Alloc_Low keywords are enabled

You can record all allocations inside a .NET Core 2.1+ process if an ETW session which records allocation profiling data is present.

Sample:

C>perfview collect  c:\temp\perfViewOnly.etl -Merge:true -Wpr -OnlyProviders:"Microsoft-Windows-DotNETRuntime":0x03280095::@StacksEnabled=true
C>AllocTracker.exe
    Microsoft-Windows-DotNETRuntime
    System.Threading.Tasks.TplEventSource
    System.Runtime
    Hello World!
    Did allocate 24 bytes
    Did allocate 24 bytes
    Did allocate 24 bytes
    Did allocate 76 bytes
    Did allocate 76 bytes
    Did allocate 32 bytes
    Did allocate 64 bytes
    Did allocate 24 bytes
    ... endless loop!

    using System;
    using System.Diagnostics.Tracing;

    namespace AllocTracker
    {
        enum ClrRuntimeEventKeywords
        {
            GC                  = 0x1,
            GCHandle            = 0x2,
            Fusion              = 0x4,
            Loader              = 0x8,
            Jit                 = 0x10,
            Contention          = 0x4000,
            Exceptions          = 0x8000,
            Clr_Type            = 0x80000,
            GC_AllocHigh        = 0x200000,
            GC_HeapAndTypeNames = 0x1000000,
            GC_AllocLow         = 0x2000000,
        }

        class SimpleEventListener : EventListener
        {
            public ulong countTotalEvents = 0;
            public static int keyword;

            EventSource eventSourceDotNet;

            public SimpleEventListener() { }

            // Called whenever an EventSource is created.
            protected override void OnEventSourceCreated(EventSource eventSource)
            {
                Console.WriteLine(eventSource.Name);
                if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
                {
                    EnableEvents(eventSource, EventLevel.Informational, (EventKeywords) (ClrRuntimeEventKeywords.GC_AllocHigh | ClrRuntimeEventKeywords.GC_AllocLow) );
                    eventSourceDotNet = eventSource;
                }
            }
            // Called whenever an event is written.
            protected override void OnEventWritten(EventWrittenEventArgs eventData)
            {
                if( eventData.EventName == "GCSampledObjectAllocationHigh")
                {
                    Console.WriteLine($"Did allocate {eventData.Payload[3]} bytes");
                }
                // Payload layout as seen in the debugger:
                //
                // eventData.EventName "BulkType"
                //   eventData.PayloadNames (Count = 2): [0] "Count", [1] "ClrInstanceID"
                //   eventData.Payload      (Count = 2): [0] 1,       [1] 11
                //
                // eventData.EventName "GCSampledObjectAllocationHigh"
                //   eventData.PayloadNames (Count = 5):
                //     [0] "Address"
                //     [1] "TypeID"
                //     [2] "ObjectCountForTypeSample"
                //     [3] "TotalSizeForTypeSample"
                //     [4] "ClrInstanceID"
            }
        }

        class Program
        {
            static void Main(string[] args)
            {
                SimpleEventListener.keyword = (int)ClrRuntimeEventKeywords.GC;
                var listener = new SimpleEventListener();

                Console.WriteLine("Hello World!");

                Allocate10();
                Allocate5K();
                GC.Collect();
                Console.ReadLine();
            }
            static void Allocate10()
            {
                for (int i = 0; i < 10; i++)
                {
                    int[] x = new int[100];
                }
            }

            static void Allocate5K()
            {
                for (int i = 0; i < 5000; i++)
                {
                    int[] x = new int[100];
                }
            }
        }

    }

Now you can find all allocation events in the recorded ETL file: one method performing 10 array allocations and another one performing 5000.

[Image: PerfView Allocation Recording]

The reason I told you that your logic is flawed is that even a simple operation like printing the allocation events to the console allocates objects itself. You see where this ends up? Every reported allocation triggers further allocations, which is the endless loop in the output above. To make this work, the complete code path would have to be allocation-free, which I guess is not possible, because at the very least the ETW event listener needs to allocate your event data. You would have reached the goal but crashed your application. I would therefore rely on ETW and record the data from outside the process, or use a profiler, which for the same reason needs to be unmanaged.

With ETW you get all allocation stacks and type information, which is all you need not only to report allocations but also to find the offending code snippet. There is more to say about method inlining, but this is already enough for an SO post, I guess.
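To illustrate the feedback problem, here is a minimal sketch of a listener that only adds to a counter instead of writing to the console. The class name AllocationCountingListener is hypothetical, and the keyword values and payload index are taken from the sample above. It reduces the extra allocations in the handler but does not remove them, since the runtime still allocates the event data it passes in, so the count is at best an approximation.

```csharp
using System;
using System.Diagnostics.Tracing;
using System.Threading;

// Hypothetical listener: sums sampled object counts without printing per event.
internal sealed class AllocationCountingListener : EventListener
{
    // GC_AllocHigh (0x200000) | GC_AllocLow (0x2000000), as in the sample above.
    private const EventKeywords AllocKeywords = (EventKeywords)(0x200000 | 0x2000000);
    private long _sampledObjects;

    // Sum of "ObjectCountForTypeSample" over all events seen so far.
    public long SampledObjects => Interlocked.Read(ref _sampledObjects);

    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
        {
            EnableEvents(eventSource, EventLevel.Informational, AllocKeywords);
        }
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        if (eventData.EventName != "GCSampledObjectAllocationHigh"
            || eventData.Payload == null
            || eventData.Payload.Count < 3)
        {
            return;
        }

        // Payload[2] is "ObjectCountForTypeSample" according to the payload dump above.
        Interlocked.Add(ref _sampledObjects, Convert.ToInt64(eventData.Payload[2]));
    }
}
```

Read SampledObjects before and after the code under test and print the difference once, outside the handler. Allocations made by the event pipeline itself may still show up in the number.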

Alois Kraus answered Sep 22 '22 06:09


First up, you can pause the GC by calling System.GC.TryStartNoGCRegion and unpause it with System.GC.EndNoGCRegion.
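A minimal sketch of how these two calls could be wrapped follows. The helper name NoGcScope and the budgetBytes parameter are made up for illustration. The runtime only stays in the no-GC region while allocations fit into the requested budget, so the helper checks GCSettings.LatencyMode before ending the region.

```csharp
using System;
using System.Runtime;

// Hypothetical helper around GC.TryStartNoGCRegion / GC.EndNoGCRegion.
internal static class NoGcScope
{
    // Runs the action, inside a no-GC region if the runtime grants one.
    // Returns true only if the action finished while still inside the region.
    public static bool TryRun(Action action, long budgetBytes)
    {
        if (!GC.TryStartNoGCRegion(budgetBytes))
        {
            // Region not granted: still run the action, but report that a GC may have happened.
            action();
            return false;
        }

        try
        {
            action();
            // If the budget was exceeded, the runtime has already left the region.
            return GCSettings.LatencyMode == GCLatencyMode.NoGCRegion;
        }
        finally
        {
            if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
            {
                GC.EndNoGCRegion();
            }
        }
    }
}
```

For example, NoGcScope.TryRun(() => DoWork(), 1024 * 1024) requests a 1 MB budget, where DoWork stands for the code under test. A return value of false means the measurement may have been interrupted by a collection.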

If you only need to know how many bytes were allocated, there is System.GC.GetAllocatedBytesForCurrentThread, which returns the total number of bytes allocated by the current thread. Call it before and after the code you want to measure; the difference is the allocation size.
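The before/after pattern could be wrapped like this. AllocationMeter and MeasureBytes are names chosen for this sketch, not part of the framework. Only allocations made on the calling thread are included.

```csharp
using System;

// Hypothetical helper: bytes allocated on the current thread while running an action.
internal static class AllocationMeter
{
    public static long MeasureBytes(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        long after = GC.GetAllocatedBytesForCurrentThread();
        return after - before;
    }
}
```

For example, AllocationMeter.MeasureBytes(() => GC.KeepAlive(new int[100])) should report at least the 400 bytes of array payload plus the object header. This gives bytes, not the number of allocations, so it only partly answers the question.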

Counting the number of allocations is a little bit tricky. There are possibly quite a few ways to do it which are all sub-optimal in some way today. I can think of one idea:

Modifying the default GC

Starting with .NET Core 2.1 it is possible to use a custom GC, a so-called local GC. The development experience, documentation and usefulness are said not to be the best, but depending on the details of your problem it may be helpful for you.

Every time an object is allocated the runtime calls Object* IGCHeap::Alloc(gc_alloc_context * acontext, size_t size, uint32_t flags). IGCHeap is defined here with the default GC implementation here (GCHeap::Alloc implemented in line 37292).

The guy to talk to here would be Konrad Kokosa with two presentations on that topic: #1, #2, slides.

We can take the default GC implementation as is and modify the Alloc method to increment a counter on each call.

Exposing the counter in managed code

Next, to make use of the new counter, we need a way to consume it from managed code. For that we need to modify the runtime. Here I'll describe how to do that by expanding the GC interface (exposed by System.GC).

Note: I do not have practical experience in doing this, and there are probably some problems to encounter when going this route. I just want to be precise about my idea.

By taking a look at ulong GC.GetGenerationSize(int) we are able to find out how to add a method which results in an internal CLR call.

Open \runtime\src\coreclr\src\System.Private.CoreLib\src\System\GC.cs#112 and declare a new method:

[MethodImpl(MethodImplOptions.InternalCall)]
internal static extern ulong GetAllocationCount();

Next, we need to define that method on the native GCInterface. For that, go to runtime\src\coreclr\src\vm\comutilnative.h#112 and add:

static FCDECL0(UINT64, GetAllocationCount);

To link these two methods, we need to list them in runtime\src\coreclr\src\vm\ecalllist.h#745:

FCFuncElement("GetAllocationCount", GCInterface::GetAllocationCount)

And lastly, implement the method at runtime\src\coreclr\src\vm\comutilnative.cpp#938:

FCIMPL0(UINT64, GCInterface::GetAllocationCount)
{
    FCALL_CONTRACT;

    return (UINT64)(GCHeapUtilities::GetGCHeap()->GetAllocationCount());
}
FCIMPLEND

That gets a pointer to the GCHeap where our allocation counter lives. The GetAllocationCount method that exposes the counter on it does not exist yet, so let's create it:

runtime\src\coreclr\src\gc\gcimpl.h#313

size_t GetAllocationCount();

runtime\src\coreclr\src\gc\gcinterface.h#680

virtual size_t GetAllocationCount() = 0;

runtime\src\coreclr\src\gc\gcee.cpp#239

size_t GCHeap::GetAllocationCount()
{
    return m_ourAllocationCounter;
}

For our new method System.GC.GetAllocationCount() to be usable in managed code we need to compile against a custom BCL. Maybe a custom NuGet package will work here too (which defines System.GC.GetAllocationCount() as an internal call as seen above).

Closing

Admittedly, this would be quite a bit of work if not done before and a custom GC + CLR might be a bit overkill here, but I thought I should throw it out there as a possibility.

Also, I have not tested this. You should take it as a concept.

Bruno Zell answered Sep 18 '22 06:09