Our web application has several hundred users and, as is so often the case, consists mainly of legacy code.
After our move to Azure we can see and measure more metrics than before. We are experiencing performance issues, and I can see that our garbage collection is going through the roof (as measured on the diagnostics tab of the web app, under the performance counters section). In a single minute we saw these numbers:
And that is for a mere 18,580 HTTP requests, so on average we have:
These numbers keep rising, even though the number of requests stays about the same (see graph).
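For anyone who wants to reproduce per-minute counts like these from inside the process, here is a minimal sketch using the standard GC.CollectionCount API (the logging loop itself is illustrative, not from our app):

```csharp
// Minimal sketch: log per-minute GC counts so they can be correlated with
// the request rate. GC.CollectionCount returns cumulative counts since
// process start, so deltas between samples give the per-minute numbers.
using System;
using System.Threading;

class GcCounterLogger
{
    static void Main()
    {
        int prev0 = GC.CollectionCount(0), prev1 = GC.CollectionCount(1), prev2 = GC.CollectionCount(2);
        while (true)
        {
            Thread.Sleep(TimeSpan.FromMinutes(1));
            int cur0 = GC.CollectionCount(0), cur1 = GC.CollectionCount(1), cur2 = GC.CollectionCount(2);
            Console.WriteLine($"GCs/min - Gen0: {cur0 - prev0}, Gen1: {cur1 - prev1}, Gen2: {cur2 - prev2}");
            prev0 = cur0; prev1 = cur1; prev2 = cur2;
        }
    }
}
```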
My questions / remarks are:
Thank you very much in advance, John
Update 1: 30/06/2018 @ 8:16 UTC+2
After updating Application Insights to monitor garbage collection more closely, I found a real eye-opener regarding the performance hit. First of all, this is the average percentage of time spent in GC:
It averages around 4.5 percent of the time (though there was an inactive period during the night), and the peaks average around 10%. Then I visualized the maximum amount of time the application spent in GC mode, and I almost fell off my chair:
This may paint a distorted picture, but it illustrates that our code has to wait for the GC a lot! We really have to fix this.
These numbers are suspiciously high for a typical web app. I'd say they are 10-1000x what they normally are (75,110 Gen 2 collections in one minute sounds more like a micro-benchmark for the GC :) ).
Is someone calling GC.Collect()? Grep the source code for that.
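To illustrate what you are grepping for (a made-up example, not your code): a recurring "cleanup" job that forces a blocking full collection on every tick would produce exactly this kind of flood of induced Gen 2 GCs.

```csharp
// Hypothetical anti-pattern (names invented): a "cleanup" timer that forces
// a full, blocking Gen 2 collection on every tick.
using System;
using System.Threading;

class CacheCleanup
{
    static void Main()
    {
        // Every 50 ms: evict, then force a full collection and wait for finalizers.
        using (var timer = new Timer(_ =>
        {
            EvictExpiredEntries();
            GC.Collect();                   // induced full (Gen 2) collection
            GC.WaitForPendingFinalizers();  // blocks until finalizers have run
        }, null, TimeSpan.Zero, TimeSpan.FromMilliseconds(50)))
        {
            Thread.Sleep(Timeout.Infinite); // stand-in for the app's lifetime
        }
    }

    static void EvictExpiredEntries() { /* placeholder for real cleanup work */ }
}
```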
Still, you need to find out whether the GCs are actually causing your perf problem. If they are not, there is no need to fix this. Look at the time-spent-in-GC counter. You can use PerfView to easily measure the GC pauses that are taken; that gets you an idea of the pause latency your customers are facing.
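If you want to track that counter from code rather than from the Azure portal, a sketch using the standard .NET Framework performance counter (category ".NET CLR Memory", counter "% Time in GC") could look like this. Note that the counter is only updated at the end of each GC, and the instance name may carry a #N suffix if several processes share a name:

```csharp
// Sketch: periodically sample "% Time in GC" for the current process.
using System;
using System.Diagnostics;
using System.Threading;

class GcTimeMonitor
{
    static void Main()
    {
        string instance = Process.GetCurrentProcess().ProcessName;
        using (var timeInGc = new PerformanceCounter(".NET CLR Memory", "% Time in GC", instance))
        {
            while (true)
            {
                // Value reflects the fraction of time spent in GC since the last GC ended.
                Console.WriteLine($"% Time in GC: {timeInGc.NextValue():F1}");
                Thread.Sleep(5000);
            }
        }
    }
}
```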
Are these numbers the number of objects that the GC cleans up, or the number of times the GC had to run?
These are GCs, not objects.
Under a load like this, how many GCs would be considered "normal", knowing very well that none is the perfect answer, but practically...
"None" is certainly not the right answer. There is no point in saving GCs if it does not improve performance meaningfully. You spend dev time for nothing if you go after that goal. You can certainly take a "normal" number of GCs.
There is no way to give a normal number. It's more about time spend in GC (which is the overhead you have to pay for) and the G2 pause times that customers will take waiting for the page to load.
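If you want a rough feel for those Gen 2 pauses from inside the process, the full-GC notification API can be used, sketched below. Caveat: GC.RegisterForFullGCNotification requires concurrent GC to be disabled, so this is a diagnostics-build technique; PerfView gives you the same data without code changes. The approach/complete window is an upper bound on the pause, not an exact measurement.

```csharp
// Sketch: time full (Gen 2) collections via GC notifications.
// Requires <gcConcurrent enabled="false"/>; otherwise RegisterForFullGCNotification throws.
using System;
using System.Diagnostics;
using System.Threading;

class FullGcWatcher
{
    static void Main()
    {
        GC.RegisterForFullGCNotification(10, 10); // thresholds in the 1-99 range

        new Thread(() =>
        {
            while (true)
            {
                if (GC.WaitForFullGCApproach() == GCNotificationStatus.Succeeded)
                {
                    var sw = Stopwatch.StartNew();
                    if (GC.WaitForFullGCComplete() == GCNotificationStatus.Succeeded)
                        Console.WriteLine($"Full GC took at most ~{sw.ElapsedMilliseconds} ms");
                }
            }
        }) { IsBackground = true }.Start();

        Thread.Sleep(Timeout.Infinite); // stand-in for the real app's work
    }
}
```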
How is it possible that, even though the number of requests stays the same, the number of GC collections keeps rising like this?
Something horrible sits in your code, I would say :) Maybe threads are continuously spinning up and calling GC.Collect()?! A nightmare come true. Grep your code and report back; I will extend this answer to help your investigation.
Using PerfView or another profiler (I use JetBrains), you should be able to see where the GCs are triggered in your code.