What is the proper way for a Windows service to fail?

I have inherited a Windows service written in C#. Under rare conditions it fails badly. However, it isn't at all clear how to fail well. Ross Bennett states the problem elegantly at bytes.com. For the sake of simplicity I will just quote him here.

Ahoy, Folks!

I've been looking all over for this, but I just can't seem to shake any documentation out of the MSDN or from Google. I've reviewed every .NET article on developing Windows Services in the MSDN I've located.

I'm developing a Windows Service application. This service reads its configuration data from the system registry (HKLM) where it was deposited by another "manager" application. No problems there.

The service uses a worker thread to do its work. The thread is created in the OnStart() and signaled/joined/disposed in the OnStop(). Again, no problems.

Everything works beautifully when:

  1. The system administrator has set up everything properly, and
  2. the foreign network resources are all reachable.

But of course, we as developers simply can't rely on:

  1. The system administrator having set up everything properly, or
  2. the foreign network resources being reachable.

Really, what we need is for the service application to have some way of dying on its own. If a network resource goes down, we need the service to stop. But more to the point, we need the SCM to know it has stopped on its own accord. SCM needs to know that the service has "failed"...and hasn't just been shut down by someone.

Calling "return" or throwing an exception in the "OnStart()" method isn't even helpful for services still in the start-up process. The SCM goes merrily on and the process keeps running in the Task Manager, though it's not actually doing anything since the worker thread was never created and started.

Using a ServiceController instance doesn't do it, either. That appears to the SCM as a normal shutdown--not a service failure. So none of the recovery actions or restarts happen. (Also, there is MSDNful documentation warning about the perils of a ServiceBase descendant using a ServiceController to make things happen with itself.)

I've read articles where people were messing about with PInvoking calls to the native code just to set the "Stopped" status flag in the SCM. But that doesn't shut down the process the service is running within.

I'd really like to know the Intended Way of:

  1. Shutting down a service from within the service, where
  2. The SCM is appropriately notified that the service has "Stopped", and
  3. The process disappears from the Task Manager.

Solutions involving ServiceControllers don't seem to be appropriate, if only because 2 is not satisfied. (That the Framework documentation specifically contraindicates doing that carries a good deal of weight, incidentally.)

I'd appreciate any recommendations, pointers to documentation, or even well-reasoned conjecture. :-) Oh! And I'm perfectly happy to entertain that I've missed the point.

Most cordially,

Ross Bennett

J Edward Ellis asked Nov 16 '10 17:11



1 Answer

Best practice in native code is to call SetServiceStatus with a non-zero exit code to indicate 1) it's stopped and 2) something went wrong.

In managed code, you could achieve the same effect by obtaining the SCM handle through the ServiceBase.ServiceHandle Property and P/Invoke-ing the Win32 API.
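
A minimal sketch of that P/Invoke route, assuming the service runs in its own process (SERVICE_WIN32_OWN_PROCESS). The class name FailingService and the helper ReportStoppedWithError are hypothetical names made up for illustration; SetServiceStatus, SERVICE_STATUS and ServiceBase.ServiceHandle are the real Win32 and .NET names.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.ServiceProcess;

// Hypothetical service class, for illustration only.
public class FailingService : ServiceBase
{
    private const int SERVICE_WIN32_OWN_PROCESS = 0x00000010; // assumption: one service per process
    private const int SERVICE_STOPPED = 0x00000001;

    // Managed mirror of the Win32 SERVICE_STATUS structure (seven DWORD fields).
    [StructLayout(LayoutKind.Sequential)]
    private struct SERVICE_STATUS
    {
        public int dwServiceType;
        public int dwCurrentState;
        public int dwControlsAccepted;
        public int dwWin32ExitCode;
        public int dwServiceSpecificExitCode;
        public int dwCheckPoint;
        public int dwWaitHint;
    }

    [DllImport("advapi32.dll", SetLastError = true)]
    private static extern bool SetServiceStatus(IntPtr hServiceStatus, ref SERVICE_STATUS lpServiceStatus);

    // Hypothetical helper: tells the SCM the service has stopped, with an error code.
    protected void ReportStoppedWithError(int win32ExitCode)
    {
        var status = new SERVICE_STATUS
        {
            dwServiceType = SERVICE_WIN32_OWN_PROCESS,
            dwCurrentState = SERVICE_STOPPED,
            dwControlsAccepted = 0,          // a stopped service accepts no controls
            dwWin32ExitCode = win32ExitCode  // non-zero signals that something went wrong
        };

        if (!SetServiceStatus(ServiceHandle, ref status))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
    }
}
```

This only reports the status to the SCM. As the question points out, the service still has to wind down its own threads and let the process end.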

I don't see why the SCM would treat this any differently than setting the ServiceBase.ExitCode property non-zero and then calling ServiceBase.Stop, actually. P/Invoke is a bit more direct perhaps, if the service is in panic mode.
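
A sketch of the managed-only variant, assuming a worker-thread design like the one described in the question. WorkerService, DoWork and the 5-second poll interval are hypothetical, and the exit code 1064 (Win32 ERROR_EXCEPTION_IN_SERVICE) is just one plausible choice, not a requirement.

```csharp
using System;
using System.ServiceProcess;
using System.Threading;

// Hypothetical service class, for illustration only.
public sealed class WorkerService : ServiceBase
{
    // Illustrative choice of exit code: Win32 ERROR_EXCEPTION_IN_SERVICE.
    private const int FailureExitCode = 1064;

    private readonly ManualResetEvent _stopRequested = new ManualResetEvent(false);
    private Thread _worker;

    protected override void OnStart(string[] args)
    {
        _worker = new Thread(Run) { IsBackground = true };
        _worker.Start();
    }

    protected override void OnStop()
    {
        _stopRequested.Set();
        // Stop() below runs OnStop on the worker thread itself;
        // joining the current thread would never return, so skip it there.
        if (_worker != null && Thread.CurrentThread != _worker)
        {
            _worker.Join();
        }
    }

    private void Run()
    {
        try
        {
            // Poll until a stop is requested (interval is an arbitrary assumption).
            while (!_stopRequested.WaitOne(TimeSpan.FromSeconds(5)))
            {
                DoWork();
            }
        }
        catch (Exception)
        {
            // Record a non-zero exit code, then ask ServiceBase to stop the service.
            ExitCode = FailureExitCode;
            Stop();
        }
    }

    // Placeholder for the real work, e.g. talking to a network resource.
    private void DoWork()
    {
    }

    public static void Main()
    {
        ServiceBase.Run(new WorkerService());
    }
}
```

The exit code is set before Stop() is called so that it is already in place when the stopped status is reported.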


As noted in the comments (see also https://serverfault.com/questions/72318/set-up-recovery-actions-to-take-place-when-a-service-fails), if a process calls SetServiceStatus(SERVICE_STOPPED) with a non-zero exit code, the Recovery Actions for the service run only if the option "Enable Actions For Stops With Errors" (sc.exe failureflag) is ticked. This case is logged as System Event ID 7024.

If a service process exits (Environment.Exit()) or crashes without consulting the SCM, the Recovery Actions always run. This case is logged as System Event ID 7031.
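
A sketch of the "exit without consulting the SCM" path. FatalExit and ExitWithoutNotifyingScm are hypothetical names, the exit code 1 is arbitrary, and using the service name as the event log source is an assumption (the source may not be registered on a given machine, hence the try/catch).

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, for illustration only.
public static class FatalExit
{
    public static void ExitWithoutNotifyingScm(string serviceName, Exception cause)
    {
        try
        {
            // Best-effort diagnostic before the process goes away.
            EventLog.WriteEntry(serviceName, "Fatal error: " + cause, EventLogEntryType.Error);
        }
        catch (Exception)
        {
            // Logging must not prevent the exit.
        }

        // Ends the process without reporting SERVICE_STOPPED to the SCM first.
        Environment.Exit(1);
    }
}
```

Called from a worker thread's catch block, this trades a clean shutdown for the guarantee described above that the configured Recovery Actions are triggered.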

Steve Townsend answered Oct 11 '22 05:10