Can the infamous 'ERROR_NETNAME_DELETED' error be considered an error at all? [duplicate]

I'm writing a TCP server on Windows NT using I/O completion ports for asynchronous I/O. I have a TcpSocket class, a TcpServer class, and some callbacks (virtual functions) that are called when an I/O operation completes, e.g. onRead() when a read completes. There is also onOpen() for when the connection is established, onEof() for when the connection is closed, and so on. I always keep a read pending on the socket. If data arrives (the read completes with size > 0), onRead() is called. If the client closes the socket from its side with closesocket(server_socket); (the read completes with size == 0), onEof() is called, so the server knows when the client has closed the connection.

Everything works fine, but I have noticed one thing:

When I call closesocket(client_socket); on the server's endpoint of the connection, instead of on the client side (with or without setting linger to {true, 0}), the pending read completes with an error. That is, the read size is not only == 0, but GetLastError() also returns error 64, or 'ERROR_NETNAME_DELETED'. I have searched the web a lot about this, but didn't find anything useful.

Then I asked myself: but is this a real error? I mean, can this really be considered an error?

The problem is that on the server side, the onError() callback is called when I call closesocket(client_socket); instead of onEof(). So I thought:

What if I call onEof() instead of onError() when this 'ERROR_NETNAME_DELETED' "error" is received? Would that introduce bugs or undefined behavior? Another important point that made me ask this question is this:

When I received this read completion with 'ERROR_NETNAME_DELETED', I checked the OVERLAPPED structure, in particular the overlapped->Internal member, which contains the NTSTATUS code from the underlying driver. Looking at a list of NTSTATUS codes [ http://www.tenox.tc/links/ntstatus.html ], we can see that 'ERROR_NETNAME_DELETED' is generated by NTSTATUS 0xC000013B, which is an error code, but one named 'STATUS_LOCAL_DISCONNECT'. That doesn't look like the name of an error. It seems more like 'ERROR_IO_PENDING', which is formally an error but actually signals correct behavior.

So what about checking the OVERLAPPED structure's Internal member, and calling the onEof() callback when it is equal to 'STATUS_LOCAL_DISCONNECT'? Would that mess things up?
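The check described above could be sketched roughly as follows. This is only an illustration, not code from my server: classifyRead and ReadOutcome are hypothetical names, and the status constants are redefined locally (with the values from the Windows headers) so that the snippet compiles without them. In real code, the status argument would be the value read from overlapped->Internal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Values as defined in the Windows headers, repeated here so the sketch
// compiles on its own.
constexpr std::uint32_t kStatusSuccess         = 0x00000000u;
constexpr std::uint32_t kStatusLocalDisconnect = 0xC000013Bu; // STATUS_LOCAL_DISCONNECT

// Which callback a completed read should trigger.
enum class ReadOutcome { Data, Eof, Error };

// Hypothetical helper: `status` stands for overlapped->Internal and
// `bytes` for the number of bytes transferred by the completed read.
ReadOutcome classifyRead(std::uint32_t status, std::size_t bytes) {
    if (status == kStatusSuccess) {
        // A successful read of zero bytes means the peer closed the connection.
        return bytes > 0 ? ReadOutcome::Data : ReadOutcome::Eof;
    }
    if (status == kStatusLocalDisconnect) {
        // The socket was closed from our own side: treated as EOF here.
        return ReadOutcome::Eof;
    }
    // Every other failure status is reported as a real error.
    return ReadOutcome::Error;
}
```

With this, a completion with STATUS_LOCAL_DISCONNECT would be routed to onEof(), and any other failure status would still go to onError().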

In addition, I should say that if the server calls DisconnectEx() before calling closesocket(client_socket); I do not receive that error. But what if I don't want to call DisconnectEx()? For example, when the server is shutting down and doesn't want to wait for all the DisconnectEx() completions, but just wants to close all connected clients.

Marco Pagliaricci asked Jan 24 '13 10:01


2 Answers

It's entirely up to you how you treat an error condition. In your case this error condition is entirely to be expected, and it's perfectly safe for you to treat it as an expected condition.

Another example of this nature is when you call an API function but don't know how large a buffer to provide. So you provide a buffer that you hope will be big enough. But if the API call fails, you then check that the last error is ERROR_INSUFFICIENT_BUFFER. That's an expected error condition. You can then try again with a larger buffer.
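A minimal sketch of that retry pattern follows. It does not call a real Windows API: fakeQueryName is a hypothetical stand-in that behaves the way many such functions do (it reports the required size when the buffer is too small), and the error constants are redefined locally with the values from the Windows headers so the snippet compiles anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Values as defined in the Windows headers, repeated here for portability.
constexpr std::uint32_t kErrorSuccess            = 0;
constexpr std::uint32_t kErrorInsufficientBuffer = 122; // ERROR_INSUFFICIENT_BUFFER

// Hypothetical API stand-in: copies a null-terminated name into `buffer`.
// If `*size` is too small, it stores the required size and fails.
std::uint32_t fakeQueryName(char* buffer, std::size_t* size) {
    static const char kName[] = "example-host";
    const std::size_t needed = sizeof(kName); // includes the terminator
    if (*size < needed) {
        *size = needed;
        return kErrorInsufficientBuffer;
    }
    std::memcpy(buffer, kName, needed);
    *size = needed;
    return kErrorSuccess;
}

// Guess a buffer size first; on the expected error, retry once with the
// size the API asked for.
std::string queryName() {
    std::vector<char> buffer(4); // deliberately too small
    std::size_t size = buffer.size();
    std::uint32_t rc = fakeQueryName(buffer.data(), &size);
    if (rc == kErrorInsufficientBuffer) {
        buffer.resize(size);
        rc = fakeQueryName(buffer.data(), &size);
    }
    return rc == kErrorSuccess ? std::string(buffer.data()) : std::string();
}
```

The first call fails by design, and the caller treats that failure as ordinary control flow rather than as something to report.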

David Heffernan answered Nov 16 '22 14:11


It's up to you how to treat an error condition, but the question is a sign of potential problems in your code (from logic errors to undefined behavior).

The most important point is that you shouldn't touch the SOCKET handle after closesocket. What do you do on EOF? It would be logical to call closesocket on your side when you detect EOF, but that's exactly what you cannot do in an ERROR_NETNAME_DELETED handler, because closesocket has already happened and the handle is invalid.

It's also worth imagining what happens if the pending read completes (with real data available) just before closesocket, and your application detects it right after closesocket. You handle the incoming data and... do you send a reply to the client using the same socket handle? Do you schedule the next read on that handle? That would all be wrong, and there would be no ERROR_NETNAME_DELETED to tell you about it.

What happens if the pending read completes with EOF at that very unfortunate moment, just before closesocket? If your regular onEof callback fires, and that callback calls closesocket, that would be wrong again.

The problem you describe might hint at a more serious problem if closesocket is called in one thread while another thread waits for I/O completion. Are you sure that the other thread is not calling WSARecv/ReadFile while the first thread is calling closesocket? That's undefined behavior, even though Winsock makes it look as if it works most of the time.

To summarize, the code handling completed (or failed) reads cannot be correct if it's unaware that the socket handle is useless because it was closed. After closesocket, it's useful to wait for pending I/O to complete, because you can't reuse the OVERLAPPED structure until it does; but there's no point in handling this kind of completion as if it happened during normal operation with the socket still open (the error/status code is irrelevant).
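One way to make the completion handler aware of a local close is to record it in per-connection state before calling closesocket, and to check that state first when a read completes. The sketch below is only an illustration under stated assumptions: Connection, closeLocally, onReadCompleted and Action are hypothetical names, the actual closesocket call is left as a comment so the snippet compiles anywhere, and all calls for one connection are assumed to be serialized (for example under a per-connection lock), which the sketch does not provide.

```cpp
#include <cassert>
#include <cstddef>

// What the completion handler should do with a finished read.
enum class Action { Ignore, HandleData, HandleEof, HandleError };

// Hypothetical per-connection state, kept next to the SOCKET handle.
struct Connection {
    bool closedLocally = false;

    // Record the close before the handle becomes invalid.
    void closeLocally() {
        closedLocally = true;
        // closesocket(handle) would be called here in the real server.
    }
};

// Decide how to treat a completed read. `ok` is whether the operation
// succeeded and `bytes` is the number of bytes transferred.
Action onReadCompleted(const Connection& conn, bool ok, std::size_t bytes) {
    if (conn.closedLocally) {
        // The handle is already gone: release the OVERLAPPED and do nothing
        // else, whatever the error or status code says.
        return Action::Ignore;
    }
    if (!ok) {
        return Action::HandleError;
    }
    return bytes > 0 ? Action::HandleData : Action::HandleEof;
}
```

Note that a completion carrying real data is also ignored once the connection is marked as closed, which covers the case where the read finishes just before closesocket and is noticed just after it.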

Anton Kovalenko answered Nov 16 '22 14:11