
Azure web app has slow, cold loads (30s+ load)

I have a very big problem with an Azure Webapp and would like to hear suggestions.

What we experience:

When I go to our website, it's quite fast. The average load time is around 1 second and the site responds as expected.

However, once every 10-20 minutes we get a very cold load, where it takes 30-60+ seconds.

This would make sense if the website had no visitors and had gone cold, but we have 10+ visitors online during all business hours, with at least 3-5 requests per minute.

This extra load time is of course completely unacceptable.

Any ideas?

Our setup:

We have two Azure web apps: one for production and one for development.

The production app is a "STANDARD SMALL" instance, with autoscale triggered when CPU hits 65-85%.

Our database is an S2 with 10 GB.

It's a fairly simple, standard ASP.NET MVC site with some text, forms and a few remote connections.

The only "non-standard" part is 3 million indexed pages that look up data in a database (page load is around 1s). These receive a lot of visitors from Google. We also get crawled by Google, as we have a sitemap with 3 million+ pages.

Data from monitor:

EDIT: data from new monitor.

enter image description here

Web-app:

enter image description here

Database:

enter image description here

The configurations:

Production web-app:

enter image description here

Database:

enter image description here

Our attempts:

1: Always on.

We have tried Always On multiple times, but at some point within the first 30 minutes to 6 hours, the site goes down and doesn't come back**. This is of course a huge problem and not a solution.

2: Running on a VM.

We had a fairly stable setup on an Azure VM (4 GB RAM) which worked OK. Responses were quite slow, but it worked decently. However, we would like to use the web app to "outsource" the scaling and platform to Azure - we just cannot accept this speed :)

** It stays unresponsive until the request times out. I have seen two scenarios: one where stopping and starting the web app fixed it, and a second where I had to redeploy.

Lars Holdgaard asked Feb 15 '16 10:02


2 Answers

To help further isolate the bottleneck, could you please use the new Ibiza portal at http://portal.azure.com?

The older portal, shown in the screenshots above, displays 5-minute averages. With 5-minute averages and the DTUs at ~80%, there are likely to be periods where all of the DTUs are consumed, and that could be the bottleneck.

Using the new portal, these DTU graphs are 15-second averages and this finer granularity could point to the bottleneck. Can you change to the new portal and paste some more graphs?
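To illustrate why the averaging window matters, here is a minimal Python sketch. The sample values are invented for illustration (they are not taken from the asker's graphs): the database sits at 75% DTU for four minutes and is pinned at 100% for one minute, which a single 5-minute average reports as a comfortable 80%.

```python
def window_averages(samples, window):
    """Average consecutive, non-overlapping windows of `window` samples."""
    return [
        sum(samples[i:i + window]) / len(samples[i:i + window])
        for i in range(0, len(samples), window)
    ]

# Hypothetical DTU utilisation (%), one sample every 15 seconds for 5 minutes.
# 16 samples at 75% (4 minutes) followed by 4 samples at 100% (1 minute).
dtu_15s = [75] * 16 + [100] * 4

five_minute_view = window_averages(dtu_15s, window=20)    # one 5-minute bucket
fifteen_second_view = window_averages(dtu_15s, window=1)  # raw 15-second points

print(five_minute_view)          # [80.0]
print(max(fifteen_second_view))  # 100.0
```

The coarse view never shows saturation, while the fine-grained view does, which is why the 15-second graphs are more useful for finding the bottleneck.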

Thanks, Guy

guyhay_MSFT answered Oct 12 '22 23:10


I found a solution.

The solution wasn't just in one place, but in multiple places.

Let me try to dive in.

The main challenge was the 3 million pages we have indexed. Google crawls between 50k and 150k pages per day, which we could see in Google Webmaster Tools:

enter image description here

99.9% of these pages were one particular type of address page. I dug into these and found that they took 1.5-2s by default (!). They were slow even when running against the test environment.
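As a rough back-of-the-envelope check on what that crawl rate means for the server, the numbers above can be combined with Little's law (average requests in flight = arrival rate x time per request). This sketch assumes the crawl is spread evenly over the day, which real crawlers do not do, so actual peaks would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def average_in_flight(pages_per_day, seconds_per_page):
    """Little's law: average concurrent requests = arrival rate x service time."""
    arrival_rate = pages_per_day / SECONDS_PER_DAY  # requests per second
    return arrival_rate * seconds_per_page

# Crawl volume and page times quoted in the answer (50k-150k pages/day, 1.5-2s each).
low = average_in_flight(50_000, 1.5)
high = average_in_flight(150_000, 2.0)

print(f"{low:.2f} to {high:.2f} crawler requests in flight on average")
# 0.87 to 3.47 crawler requests in flight on average
```

So on a busy crawl day the crawler alone keeps roughly one to three or more slow database-backed requests running at all times, on top of the human visitors.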

Step #1 was to make a new index and optimize the code. That gave a 5x performance improvement.
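The answer does not say which index was added. Assuming it was a database index on the column used to look up an address page, the effect can be sketched with SQLite standing in for the Azure SQL database. The table name, column names and index name below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the address pages.
conn.execute(
    "CREATE TABLE address_pages (id INTEGER PRIMARY KEY, slug TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO address_pages (slug, body) VALUES (?, ?)",
    ((f"address-{i}", "page body") for i in range(10_000)),
)

def plan(sql, params):
    """Return SQLite's query plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text

query = "SELECT body FROM address_pages WHERE slug = ?"

before = plan(query, ("address-42",))  # no index yet: full table scan
conn.execute("CREATE INDEX idx_address_pages_slug ON address_pages (slug)")
after = plan(query, ("address-42",))   # lookup now goes through the index

print("before:", before)
print("after: ", after)
```

The exact plan wording depends on the SQLite version, but the first plan is a scan of the whole table and the second is a search using the new index. On SQL Server or Azure SQL the equivalent check would be done with the execution plan tools rather than `EXPLAIN QUERY PLAN`.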

Now, that didn't solve the issue by itself. I also upgraded the database to the new S3... That didn't solve the issue totally (but it was still better).

I also upgraded our Azure Web App to the 7 GB version - and THEN things started to perform.

However, we still had a small issue every 30 minutes. I went onto our VM and found an old console job that kept some content in order... I paused that job.
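A slowdown that recurs on a fixed schedule is often easiest to spot by bucketing slow requests by minute of the hour. The sketch below is one way to do that; the log entries and the 10-second threshold are made up for illustration and do not come from the original site.

```python
from collections import Counter
from datetime import datetime

def slow_minutes(requests, threshold_s=10.0):
    """Count slow requests per minute-of-hour (0-59)."""
    counts = Counter()
    for timestamp, duration_s in requests:
        if duration_s >= threshold_s:
            counts[timestamp.minute] += 1
    return counts

# Hypothetical (timestamp, duration in seconds) pairs from a request log.
sample = [
    (datetime(2016, 2, 15, 9, 0, 12), 41.0),
    (datetime(2016, 2, 15, 9, 14, 3), 0.9),
    (datetime(2016, 2, 15, 9, 30, 20), 35.5),
    (datetime(2016, 2, 15, 10, 0, 5), 52.3),
    (datetime(2016, 2, 15, 10, 17, 44), 1.1),
    (datetime(2016, 2, 15, 10, 30, 9), 38.7),
]

for minute, count in sorted(slow_minutes(sample).items()):
    print(f"minute :{minute:02d} -> {count} slow requests")
```

If the slow requests cluster at the same minutes every hour (here :00 and :30), a scheduled job is a likely suspect, as the console job turned out to be in this case.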

None of these fixes would have been enough alone - but now that all of them are in place, we're good again and the website responds acceptably!

Hurray!

Lars Holdgaard answered Oct 12 '22 23:10