Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DCOM: How to close connection in server on client crash?

Tags:

c++

atl

dcom

I have a rather old project: DCOM client and server, both in C++\ATL, only Windows platform. Everything works fine: local and remote clients connect to server and work simultaneously without any problem.

But when remote client crashes or being killed by Task Manager or by "taskkill" command or power switch off - I have a problem. My server do not know anything about client crash and tries to send new events to all clients (also to already crashed). As result I have pause (server can not send data to already crashed client) and it's duration is proportional to the numbers of crashed remote clients. After 5 crashed clients pauses are so long that it is equal to completely server stop.

I know about DCOM "ping" mechanism (DCOM should disconnect clients that does not respond to "every 2 minutes ping" after 6 minutes of silence). And really, after 6 minutes of hang I have small period of normal work but then server is coming back to "paused" state.

What can I do with all of this? How to make DCOM "ping" works fine? If I will implement my own "ping" code is it possible to disconnect old DCOM clients connection manually? How to do it?

like image 655
Ezh Avatar asked Jan 25 '11 14:01

Ezh


1 Answers

I'm not sure about the DCOM ping system, but one option for you would be to simply farm off the notifications to a separate thread pool. This will help mitigate the effect of having a small number of blocking clients - you'll start having problems when there are too many though, of course.

The easy way to do this is to use QueueUserWorkItem - this will invoke the passed callback on the application's system thread pool. Assuming you're using a MTA, this is all you need to do:

static InfoStruct {
    IRemoteHost *pRemote;
    BSTR someData;
};

static DWORD WINAPI InvokeClientAsync(LPVOID lpInfo) {
  CoInitializeEx(COINIT_MULTITHREADED);

  InfoStruct *is = (InfoStruct *)lpInfo;
  is->pRemote->notify(someData);
  is->pRemote->Release();
  SysFreeString(is->someData);
  delete is;

  CoUninitialize();
  return 0;
}

void InvokeClient(IRemoteHost *pRemote, BSTR someData) {

  InfoStruct *is = new InfoStruct;
  is->pRemote = pRemote;
  pRemote->AddRef();

  is->someData = SysAllocString(someData);
  QueueUserWorkItem(InvokeClientAsync, (LPVOID)is, WT_EXECUTELONGFUNCTION);
}

If your main thread is in a STA, this is only slightly more complex; you just have to use CoMarshalInterThreadInterfaceInStream and CoGetInterfaceAndReleaseStream to pass the interface pointer between apartments:

static InfoStruct {
    IStream *pMarshalledRemote;
    BSTR someData;
};

static DWORD WINAPI InvokeClientAsync(LPVOID lpInfo) {
  CoInitializeEx(COINIT_MULTITHREADED); // can be STA as well

  InfoStruct *is = (InfoStruct *)lpInfo;
  IRemoteHost *pRemote;
  CoGetInterfaceAndReleaseStream(is->pMarshalledRemote, __uuidof(IRemoteHost), (LPVOID *)&pRemote);

  pRemote->notify(someData);
  pRemote->Release();
  SysFreeString(is->someData);
  delete is;

  CoUninitialize();

  return 0;
}

void InvokeClient(IRemoteHost *pRemote, BSTR someData) {
  InfoStruct *is = new InfoStruct;
  CoMarshalInterThreadInterfaceInStream(__uuidof(IRemoteHost), pRemote, &is->pMarshalledRemote);

  is->someData = SysAllocString(someData);
  QueueUserWorkItem(InvokeClientAsync, (LPVOID)is, WT_EXECUTELONGFUNCTION);
}

Note that error checking has been elided for clarity - you will of course want to error check all calls - in particular, you want to be checking for RPC_S_SERVER_UNAVAILABLE and other such network errors, and remove the offending clients.

Some more sophisticated variations you may want to consider include ensuring only one request is in-flight per client at a time (thus further reducing the impact of a stuck client) and caching the marshalled interface pointer in the MTA (if your main thread is a STA) - since I believe CoMarshalInterThreadInterfaceInStream may perform network requests, you'd ideally want to take care of it ahead of time when you know the client is connected, rather than risking blocking on your main thread.

like image 62
bdonlan Avatar answered Sep 29 '22 00:09

bdonlan