
What could be the reason for this kind of Azure Web Site hang?

I have a fairly high-load deployment on Azure: 4 Large instances serving about 300-600 requests per second. Under normal conditions the "Average Response Time" is 70 to 150 ms. It sometimes grows to 200-300 ms, which is still perfectly acceptable.

However, once or twice a day (not during rush hours) I see a picture like this on the Web Site Monitoring tab:

Azure Web Site Monitoring

The number of requests per minute drops significantly, the average response time grows to as much as 3 minutes, and after a while everything returns to normal.

During this "blackout" only about 0.1% of requests are dropped (HTTP server errors caused by timeouts). The other requests just wait in the queue and are processed normally after a few minutes. Not all clients are willing to wait that long, though :-(

Memory usage stays under 30% the whole time, and CPU usage peaks at only 40-50%.
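To line up these blackouts against traces and other logs, it can help to pull the affected time ranges out of per-minute metrics automatically. Below is a minimal sketch of that idea in Python. The `MinuteSample` type, the `find_blackouts` function, and both thresholds are hypothetical names chosen for illustration; they are not part of any Azure API, and the per-minute samples are assumed to have been exported from the monitoring data by some other means.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MinuteSample:
    """One row of per-minute metrics (hypothetical export format)."""
    minute: int             # minutes since the start of the export
    requests: int           # requests completed during that minute
    avg_response_ms: float  # average response time during that minute


def find_blackouts(
    samples: List[MinuteSample],
    min_requests: int,
    max_avg_ms: float,
) -> List[Tuple[int, int]]:
    """Return inclusive (first_minute, last_minute) ranges in which
    throughput is below min_requests AND latency is above max_avg_ms."""
    windows: List[Tuple[int, int]] = []
    start: Optional[int] = None  # first minute of the current degraded run
    prev: Optional[int] = None   # minute of the previously seen sample
    for s in samples:
        degraded = s.requests < min_requests and s.avg_response_ms > max_avg_ms
        if degraded and start is None:
            start = s.minute                 # a degraded run begins
        elif not degraded and start is not None:
            windows.append((start, prev))    # the run ended on the previous sample
            start = None
        prev = s.minute
    if start is not None:                    # export ended while still degraded
        windows.append((start, prev))
    return windows
```

Both conditions are required on purpose: a drop in throughput alone may just be a quiet period, while a drop in throughput together with a latency spike matches the pattern described above. The thresholds would have to be tuned against the normal traffic level.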

What I've already checked:

  1. Traces for timed-out requests: they timed out at random locations.
  2. Throttling for Azure Storage and the other components used: no throttling at all.
  3. Routing all traffic through CloudFlare: I saw the same problems.

What could be causing these problems? What should I check next?
Thank you all in advance!

Update 1: BenV suggested a good thing to try, but unfortunately it revealed nothing :-(
I configured process recycling every 500k requests and also added worker nodes, so CPU utilization is now below 40% all day long, but the blackouts still occur.

Update 2: The project uses ASP.NET MVC 4.

Alexander Shvetsov asked Jul 31 '15 11:07



2 Answers

I had exactly the same problem. In my case, I saw a lot of WinCache errors in my logs.

Whenever the site failed, the log was full of WinCache errors. WinCache is a Microsoft-built PHP extension that IIS uses to speed up PHP processing through caching, and it is enabled by default in IIS and on all Azure sites. WinCache would get hung up and, instead of recycling and continuing, it would consume all the memory and file handles on an instance, essentially locking it up.

I added a new App setting in the Azure Portal that makes PHP scan a folder for additional php.ini settings:
d:\home\site\ini

Then I added a file, d:\home\site\ini\settings.ini, containing the following:

wincache.fcenabled=1
session.save_handler = files
memory_limit = 256M
wincache.chkinterval=5
wincache.ucachesize=200
wincache.scachesize=64
wincache.enablecli=1
wincache.ocenabled=0 


This does a few things:

wincache.fcenabled=1

Enables file caching using WinCache (I think that's the default anyway).

session.save_handler = files

Changes the session handler from WinCache (the Azure default) to the standard file-based handler, to reduce the stress on the cache engine.

memory_limit = 256M
wincache.chkinterval=5
wincache.ucachesize=200
wincache.scachesize=64
wincache.enablecli=1

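Sets the PHP memory limit to 256 megabytes per script and limits the overall cache sizes (the user cache and the session cache, both given in megabytes). wincache.chkinterval=5 makes WinCache check cached files for changes every 5 seconds, and wincache.enablecli=1 turns caching on for command-line PHP as well. Together this forces WinCache to clear out old data and recycle the cache more often.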

wincache.ocenabled=0

This is the big one: it DISABLES WinCache opcode caching, that is, WinCache caching the compiled PHP scripts in memory. Files are still cached (wincache.fcenabled=1, the first line of the file), but PHP is interpreted as normal and not cached into large binary files.
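Since a typo in settings.ini silently leaves the old behaviour in place, a quick sanity check of the file before deploying may be worthwhile. The following is only a sketch in Python using the standard `configparser` module. The function names `parse_php_ini` and `check_settings` and the `EXPECTED` table are made up for this example, and it assumes a flat file of un-indented `key=value` lines like the one shown above, which is not a full implementation of PHP's ini syntax.

```python
import configparser
from typing import Dict, List, Optional, Tuple

# The three settings the answer relies on most (illustrative selection).
EXPECTED: Dict[str, str] = {
    "wincache.ocenabled": "0",        # opcode cache off
    "wincache.fcenabled": "1",        # file cache on
    "session.save_handler": "files",  # sessions stored in files, not WinCache
}


def parse_php_ini(text: str) -> Dict[str, str]:
    """Parse flat key=value ini text into a dict of stripped strings."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key names exactly as written
    # configparser requires a section header, so add a dummy one.
    parser.read_string("[php]\n" + text)
    return dict(parser["php"])


def check_settings(
    settings: Dict[str, str],
    expected: Dict[str, str] = EXPECTED,
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (key, wanted, actual) for every expected setting that is
    missing or different; an empty list means everything matches."""
    return [
        (key, wanted, settings.get(key))
        for key, wanted in expected.items()
        if settings.get(key) != wanted
    ]
```

Passing the contents of settings.ini to `parse_php_ini` and the result to `check_settings` gives a list of mismatches. This only validates the file itself; whether PHP actually picked the values up is better confirmed on the running site, for example from `phpinfo()` output.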

I went from having my Azure Website crash about once every 3 days, with logs that look like yours, to 120 days straight so far without any issues.

Good luck!

greg_diesel answered Sep 25 '22 15:09


There are some nice tools available for Web Apps in the preview portal.

Azure Web Apps tools menu

The Application Insights extension especially can be useful for monitoring and troubleshooting app performance.

BenV answered Sep 22 '22 15:09