
After which type of Exceptions/Crashes does an Azure Cloud instance perform restarts?

As far as I remember, a role instance should automatically restart after a crash or failure. To test that behaviour, I wrote an application that forces an out-of-memory exception, and my application crashed. The role instance didn't restart, because it was still running and OK - the instance just restarted the .NET runtime.

I'm trying to find out how an instance reacts to different errors. In my case, a restart wasn't necessary. What types of errors/exceptions (that I can force) would cause a complete restart of an instance? What types of errors/exceptions would kill an instance forever?

ceran asked Jan 10 '12 08:01



1 Answer

The only thing that causes a role instance to be recycled (restarted) is the Run method of RoleEntryPoint exiting. This typically happens when you:

  1. Have overridden the Run() method, and
  2. Have an unhandled exception in your program code that causes the Run() method to exit

However, your role would not recycle but rather hang if you have enabled IntelliTrace log collection.

The default template for a WebRole does not override the Run() method, thus leaving the default implementation, which is "Thread.Sleep(-1);". There is no (auto) event that would cause automatic role recycling of a WebRole, unless you do something in your RoleEntryPoint that causes the Run method to exit. This automatic recycling only happens with a WorkerRole, which does implement the Run() method.
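To make the difference concrete, here is a minimal sketch of a worker role whose Run() method only exits if an exception escapes it. The class name WorkerRole and the DoWork() helper are hypothetical placeholders, and the 10-second delay is an arbitrary choice; only RoleEntryPoint and its Run() override come from the Azure service runtime.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

// Hypothetical worker role; the class name and DoWork() are illustrative only.
public class WorkerRole : RoleEntryPoint
{
    public override void Run()
    {
        // As long as this loop keeps running, Run() does not return
        // and the role instance is not recycled.
        while (true)
        {
            try
            {
                DoWork();
            }
            catch (Exception ex)
            {
                // Handled here, so Run() keeps going and no recycle happens.
                // Without this catch block the exception would escape Run(),
                // Run() would exit, and the role would be recycled.
                Trace.TraceError("Work item failed: {0}", ex);
            }

            // Arbitrary pause between iterations (an assumption for this sketch).
            Thread.Sleep(TimeSpan.FromSeconds(10));
        }
    }

    // Placeholder for the application's real work (an assumption for this sketch).
    private void DoWork()
    {
        throw new InvalidOperationException("Simulated failure");
    }
}
```

Removing the try/catch is probably the simplest way to observe a recycle in a test deployment: the simulated exception then leaves Run() unhandled.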

UPDATE 1 (according to comment 1)

"Run method of a RoleEntryPoint faces an error"

Not just any error, but the kind of error (i.e. an unhandled exception) that causes the Run() method to exit.

Moreover, you can't just override Run() in your WebRole, because your RoleEntryPoint descendant lives in a different app domain (even a different process) than your web application, so it will have no idea about your application's exceptions. Read more about Full IIS hosting and process here.

So, for a web role you just have a web application in a fully featured IIS 7.0 / 7.5, which has no idea that this IIS is part of an Azure deployment. Global.asax is the place to manage unhandled web application errors in ASP.NET. Check out this question, the answer to which provides a good example of an Application_Error() handler.

You could use the static RequestRecycle method of the RoleEnvironment type to manually request role recycling in your Application_Error() method. However, I don't recommend doing so. I don't see it as good practice to restart the web server because of an application error. You should implement a good exception handling and error logging strategy, regularly examine your error logs, and take action to avoid critical errors that would require a server restart.
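As a rough sketch of that approach, an Application_Error() handler in the Global.asax code-behind might log the exception and leave the role running. The class name Global and the use of Trace for logging are assumptions for illustration; substitute whatever logging your application already uses.

```csharp
using System;
using System.Diagnostics;
using System.Web;

// Code-behind for Global.asax in the web application (not the RoleEntryPoint class).
// The class name "Global" and Trace-based logging are assumptions for this sketch.
public class Global : HttpApplication
{
    protected void Application_Error(object sender, EventArgs e)
    {
        // The unhandled exception that triggered this event, if any.
        Exception ex = Server.GetLastError();
        if (ex == null)
        {
            return;
        }

        // Log the error so it can be examined later.
        Trace.TraceError("Unhandled web application error: {0}", ex);

        // Requesting a recycle here is possible but not recommended:
        // Microsoft.WindowsAzure.ServiceRuntime.RoleEnvironment.RequestRecycle();
    }
}
```

The recycle call is left commented out on purpose, since restarting the instance on every application error is the pattern being advised against.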

What is your original intent? To understand when a role will be automatically recycled, or to design your application so that it automatically recycles your role on error? If it is the latter, I suggest that you revise your business requirements/logic.

UPDATE 2

I can't speak for Neil, but "instance failure" is anything that can cause a running VM to hang. An instance in Windows Azure is a single virtual machine that hosts your application's code (read this blog post for a detailed explanation of Hosted Service, Role, Instance). Your application runs in a Windows Server based OS. It is a virtual machine. Anything could happen - from a hardware failure on the host to a generic software/driver failure in the guest OS. It does not have to be your code. So if something happens that causes a single VM to fail, the issue is handled automatically by the Windows Azure Fabric. If necessary, your code is automatically deployed to another virtual machine. And this happens automagically. You do nothing. Imagine an HDD breaks, or a memory module burns out, or a network interface stops responding - these are just a few simple issues that could cause a running VM to fail. This is an instance failure.

A failure in your code is something that you should take care of. Everything else - Windows Azure Fabric controller takes care of.

UPDATE 3

  1. What happens to an asp.net application in a webrole if an exception occurs and it is not handled? Will the application just hang in an undefined state ("broken") until I look for it or will it be terminated by the vm?

This question is totally out of scope! What happens to an asp.net application in a shared hosting account? Or in an on-premise IIS installation? The application crashes for the user whose actions caused the crash. In the worst case, the app pool recycles. I have never seen a "hung" asp.net application. There is no such thing as a "terminated" or "broken" asp.net application. If it is a generic error that occurs during application startup or the first request, the application will never come online. If it is an error caused by some sequence of user actions, the user will see an ugly error message and nothing more (unless you have an appropriate Application_Error() handler in your Global.asax). I think that is enough explanation for a question that has nothing to do with Azure.

  2. Can you think of a piece of .NET code in my application that could cause the crash of a whole web role, or is it not possible with managed code (apart from an unknown bug in .NET)?

Are you kidding? This code will crash your web role and will force a recycling:

RoleEnvironment.RequestRecycle();

Please accept this answer, as I don't think anything is missing. Plus, it answers at least 4 more questions on top of the original one.

FINAL

There is no such thing as "kill the instance forever".

7 revs, 3 users 90% answered Sep 27 '22 19:09
