
Calling WSAGetLastError() from an IOCP thread returns an incorrect result

Tags: c++, sockets, iocp

I called WSARecv(), which reported WSA_IO_PENDING, and then sent an RST packet from the other end. GetQueuedCompletionStatus(), which runs in another thread, returned FALSE as expected, but when I then called WSAGetLastError() I got 64 instead of WSAECONNRESET.

So why did WSAGetLastError() not return WSAECONNRESET?


Edit:

I forgot to mention that when I call WSAGetLastError() directly after a failing WSARecv() (because of an RST packet being received), the error code returned is WSAECONNRESET and not 64.

So it looks like the error code returned depends on whether WSARecv() has failed directly after calling it, or has failed later when retrieving a completion packet.

Tom asked Mar 08 '15 09:03


1 Answer

This is a generic issue with IOCP. You are making a low-level call to the TCP/IP driver stack, which, like all drivers in Windows, reports failure with NTSTATUS error codes. The expected error here is STATUS_CONNECTION_RESET.

These native error codes need to be translated to a winapi error code. The translation is normally context-sensitive: it depends on which winapi library issued the driver command. In other words, you can only ever get WSAECONNRESET back if the Winsock library did the translation. That is not what happened in your program; it was GetQueuedCompletionStatus() that handled the error.

GetQueuedCompletionStatus() is a generic helper function that handles IOCP for any device driver. It has no context; the OVERLAPPED structure is not nearly enough to indicate how the I/O request got started. Turn to this KB article, which documents the default mapping from NTSTATUS error codes to winapi error codes, the mapping that GetQueuedCompletionStatus() uses. Relevant entries in the list are:

STATUS_NETWORK_NAME_DELETED          ERROR_NETNAME_DELETED
STATUS_LOCAL_DISCONNECT              ERROR_NETNAME_DELETED
STATUS_REMOTE_DISCONNECT             ERROR_NETNAME_DELETED
STATUS_ADDRESS_CLOSED                ERROR_NETNAME_DELETED
STATUS_CONNECTION_DISCONNECTED       ERROR_NETNAME_DELETED
STATUS_CONNECTION_RESET              ERROR_NETNAME_DELETED 

These were, ahem, not fantastic choices. They probably go back to very early Windows, when Lanman was the network layer of choice. WSAGetLastError() is pretty powerless to map ERROR_NETNAME_DELETED (the 64 you saw) back to a WSA-specific error, because the NTSTATUS code was lost when GetQueuedCompletionStatus() set the "last error" code for the thread. So it doesn't; it just returns what it can.
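To make the loss of information concrete, here is a rough, portable sketch of the default mapping as a lookup table. It is not how Windows implements it. The function name DefaultWin32Error and the k-prefixed constants are made up for illustration, the numeric values are quoted from ntstatus.h and winerror.h as I remember them (check them against your SDK headers), and returning 0 for statuses outside the table is a simplification.

```cpp
#include <array>
#include <cstdint>

// NTSTATUS values as recalled from ntstatus.h (assumption: verify against your SDK).
constexpr std::uint32_t kStatusNetworkNameDeleted     = 0xC00000C9;
constexpr std::uint32_t kStatusLocalDisconnect        = 0xC000013B;
constexpr std::uint32_t kStatusRemoteDisconnect       = 0xC000013C;
constexpr std::uint32_t kStatusAddressClosed          = 0xC000020B;
constexpr std::uint32_t kStatusConnectionDisconnected = 0xC000020C;
constexpr std::uint32_t kStatusConnectionReset        = 0xC000020D;

// ERROR_NETNAME_DELETED from winerror.h.
constexpr std::uint32_t kErrorNetnameDeleted = 64;

// The six statuses from the list above; they all collapse to one Win32 code.
constexpr std::array<std::uint32_t, 6> kNetnameDeletedStatuses = {
    kStatusNetworkNameDeleted, kStatusLocalDisconnect,
    kStatusRemoteDisconnect,   kStatusAddressClosed,
    kStatusConnectionDisconnected, kStatusConnectionReset};

// Hypothetical stand-in for the context-free default translation.
// Returns 0 for statuses this sketch does not cover.
std::uint32_t DefaultWin32Error(std::uint32_t ntstatus) {
    for (std::uint32_t s : kNetnameDeletedStatuses) {
        if (s == ntstatus) return kErrorNetnameDeleted;
    }
    return 0;
}
```

Because a reset and an ordinary disconnect both come out as 64, nothing downstream can tell them apart once only the Win32 code is left.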


What you'd expect is a WSAGetQueuedCompletionStatus() function so this error translation can happen correctly, using Winsock rules. There isn't one. These days I prefer to use the ultimate authority on how to write Windows code properly, the .NET Framework source as available from the Reference Source. I linked to the source for the SocketAsyncEventArgs.CompletionCallback() method, which contains the key:

// The Async IO completed with a failure.
// here we need to call WSAGetOverlappedResult() just so Marshal.GetLastWin32Error() will return the correct error.
bool success = UnsafeNclNativeMethods.OSSOCK.WSAGetOverlappedResult(
    m_CurrentSocket.SafeHandle,
    m_PtrNativeOverlapped,
    out numBytes,
    false,
    out socketFlags);
socketError = (SocketError)Marshal.GetLastWin32Error();

Or in other words, you have to make an extra call to WSAGetOverlappedResult() to get the proper return value from GetLastError(). This is not very intuitive :)
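Since the question is about C++, here is a minimal sketch of the same idea for a native IOCP worker. IoContext, CompletionError and HandleOneCompletion are hypothetical names, and the sketch assumes every overlapped operation on the port was issued with an IoContext whose first member is the WSAOVERLAPPED. The whole block is guarded with _WIN32 so it only compiles on Windows.

```cpp
#ifdef _WIN32
#include <winsock2.h>
#include <windows.h>
#pragma comment(lib, "ws2_32.lib")  // MSVC only; link ws2_32 manually elsewhere

// Hypothetical per-operation context. The WSAOVERLAPPED must come first so
// the pointer handed back by GetQueuedCompletionStatus() can be cast back.
struct IoContext {
    WSAOVERLAPPED overlapped;
    SOCKET socket;
};

// Returns 0 on success, otherwise the error for the dequeued operation,
// translated by Winsock rather than by the generic IOCP path.
int CompletionError(BOOL gqcsOk, OVERLAPPED* ov, DWORD* bytes) {
    if (gqcsOk) return 0;
    if (ov == nullptr) {
        // No packet was dequeued (e.g. timeout); this is not a socket error.
        return static_cast<int>(GetLastError());
    }
    IoContext* ctx = reinterpret_cast<IoContext*>(ov);
    DWORD flags = 0;
    // The extra call: lets Winsock set the thread's last error itself.
    if (WSAGetOverlappedResult(ctx->socket, &ctx->overlapped, bytes, FALSE, &flags)) {
        return 0;
    }
    return WSAGetLastError();
}

// One iteration of a worker thread's loop.
void HandleOneCompletion(HANDLE port) {
    DWORD bytes = 0;
    ULONG_PTR key = 0;
    OVERLAPPED* ov = nullptr;
    BOOL ok = GetQueuedCompletionStatus(port, &bytes, &key, &ov, INFINITE);
    int err = CompletionError(ok, ov, &bytes);
    if (err == WSAECONNRESET) {
        // Peer reset the connection; close the socket and free the context.
    }
}
#endif
```

Passing FALSE for fWait is fine here because the operation has already completed by the time its packet is dequeued.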

4 revs answered Oct 17 '22 20:10