Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance Metrics/Diagnostics of .NET Tasks

Is there a way to get data out from .NET about the number of (C# 5 or newer, so post-async/await) Tasks pending execution, and similar metrics, for use in diagnosing issues happening to production servers?

The case I'm discussing is an async-all-the-way-down system (for example a massive parallel socket server where each request is run async from the start) where the initial Task either spawns multiple Tasks that each take time to process (or each start more Tasks), or spawns Tasks some of which block (like third party code) and some of which work async correctly. There are two situations I've seen which are difficult to diagnose usefully:

  • Under normal load, everything works fine, but if there are enough requests then the CPU very quickly jumps up to 100% and all requests finish slower and slower. When load lessens, CPU will stay at 100% until the majority of the pending tasks are finished gradually, and then the CPU drops down to normal levels.
  • Under normal load, everything works fine, but if there are enough requests then certain requests (all of which are properly async) no longer finish at all or very slowly. When the load lessens, CPU will stay at 100% while they're all processed, but there will be speed-bumps in the task completion rate where it slows down considerably for a short time.

I've tried writing up a simple test for this, but there's no apparent way to limit the number of executors and the number of tasks I need to create to test it makes it very hard to parse the information. It's also quite hard to not interfere with the test itself by attempting to log out debug information. I will continue attempting to create a better test case and amend my question if needed.

From my understanding of the problem and the async task system, both of these are effectively contention on the executors that actually run the tasks.

The first case happens because more Tasks are being created than are actually finishing, in which case a counter of pending tasks would be useful to diagnose this even before the load is high enough to lock up the service.

The second case happens because a certain set of Tasks goes long enough without yielding that over time (with enough load) all the executors end up running these tasks at the same time. And as soon as one finishes, it'll process some Tasks, only to be replaced by another long-running task soon enough. In this case, a pending tasks counter would be useful, as well as some other metrics.

Is anything of the sort available, or is there some undocumented/hacky way to graft a bit of code to the start/end of every task started in an application, to make it log out/measure these things and throw warnings when the task number is exploding?

like image 996
Mike Roibu Avatar asked May 17 '18 18:05

Mike Roibu


People also ask

What is performance counter in C#?

Performance counters enable us to publish, capture, and analyze the performance data of running code. A performance graph is a two-dimensional plot with one axis indicating time elapsed and the other reporting relevant relative or actual performance statistics.

Which tools can be used to measure the CPU and memory use of an ASP.NET application?

Visual Studio Diagnostic Tools The tooling allows analysis of CPU usage, memory usage, and performance events in ASP.NET Core apps.


Video Answer


1 Answers

You can inherit a class from EventListener to handle events that Task Parallel Library produces. Probably, you can count queued and running tasks and store profiling information associated with a task in a ConcurrentDictionary this way. However, there are complications such as non-uniqueness of task ids or performance implications of this profiling.

Example implementation:

public class TplEventListener : EventListener
{
    static readonly Guid _tplSourceGuid = new Guid("2e5dba47-a3d2-4d16-8ee0-6671ffdcd7b5");
    readonly EventLevel _handledEventsLevel;

    public TplEventListener(EventLevel handledEventsLevel)
    {
        _handledEventsLevel = handledEventsLevel;
    }

    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        if (eventSource.Guid == _tplSourceGuid)
            EnableEvents(eventSource, _handledEventsLevel);
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        if (eventData.EventSource.Guid != _tplSourceGuid)
            return;

        switch (eventData.EventId)
        {
            // TODO: Add case for each relevant EventId (such as TASKSCHEDULED_ID and TASKWAITBEGIN_ID)
            // and explore relevant data (such as task Id) in eventData.Payload. Payload is described by 
            // eventData.PayloadNames.
            // For event ids and payload meaning explore TplEtwProvider source code 
            // (https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/TPLETWProvider.cs,183).
            default:
                var message = new StringBuilder();
                message.Append(eventData.EventName);
                message.Append("(");
                message.Append(eventData.EventId);
                message.Append(") { ");
                if (!string.IsNullOrEmpty(eventData.Message))
                {
                    message.Append("Message = \"");
                    message.AppendFormat(eventData.Message, eventData.Payload.ToArray());
                    message.Append("\", ");
                }
                for (var i = 0; i < eventData.Payload.Count; ++i)
                {
                    message.Append(eventData.PayloadNames[i]);
                    message.Append(" = ");
                    message.Append(eventData.Payload[i]);
                    message.Append(", ");
                }
                message[message.Length - 2] = ' ';
                message[message.Length - 1] = '}';
                Console.WriteLine(message);
                break;
        }
    }
}

Initialize and store new TplEventListener(EventLevel.LogAlways) in each AppDomain and you will get log similar to:

NewID(26) { TaskID = 1 }
TaskScheduled(7) { Message = "Task 1 scheduled to TaskScheduler 1.", OriginatingTaskSchedulerID = 1, OriginatingTaskID = 0, TaskID = 1, CreatingTaskID = 0, TaskCreationOptions = 8192 }
NewID(26) { TaskID = 2 }
TraceOperationBegin(14) { TaskID = 2, OperationName = Task.ContinueWith: < SendAsync > b__0, RelatedContext = 0 }
TaskStarted(8) { Message = "Task 1 executing.", OriginatingTaskSchedulerID = 1, OriginatingTaskID = 0, TaskID = 1 }
AwaitTaskContinuationScheduled(12) { OriginatingTaskSchedulerID = 1, OriginatingTaskID = 0, ContinuwWithTaskId = 2 }
NewID(26) { TaskID = 3 }
TraceOperationBegin(14) { TaskID = 3, OperationName = Async: < Main > d__3, RelatedContext = 0 }
NewID(26) { TaskID = 4 }
TaskWaitBegin(10) { Message = "Beginning wait (2) on Task 4.", OriginatingTaskSchedulerID = 1, OriginatingTaskID = 0, TaskID = 4, Behavior = 2, ContinueWithTaskID = 3 }
TaskWaitBegin(10) { Message = "Beginning wait (1) on Task 3.", OriginatingTaskSchedulerID = 1, OriginatingTaskID = 0, TaskID = 3, Behavior = 1, ContinueWithTaskID = 0 }
TraceSynchronousWorkBegin(17) { TaskID = 1, Work = 2 }
TraceSynchronousWorkEnd(18) { Work = 2 }
TraceOperationEnd(15) { TaskID = 1, Status = 1 }
RunningContinuation(20) { TaskID = 1, Object = 0 }
TaskCompleted(9) { Message = "Task 1 completed.", OriginatingTaskSchedulerID = 1, OriginatingTaskID = 0, TaskID = 1, IsExceptional = False }

For more information check:

  • Async Causality Chain Tracking article by Andrew Stasyuk
  • A Few Words on Task.Id (and TaskScheduler.Id) article by Stephen Cleary.
  • How do I listen to TPL TaskStarted/TaskCompleted ETW events discussion on StackOverflow
  • System.Threading.Tasks.TplEtwProvider source code
like image 55
Leonid Vasilev Avatar answered Nov 04 '22 17:11

Leonid Vasilev