A few users of my software have come to me recently, telling me that it doesn't work on Windows 8. After investigation it turns out that for some strange reason, my server socket doesn't always accept connections, but lets them time out.
Even stranger: it also happens when connecting to localhost, not just when accessing it remotely.
"What have you tried?"
Reminder: the exact same code works fine on Windows XP and Windows 7, and the problem affects localhost connections as well (so it's not a hardware issue). Also, only a third of the connections fail; the rest work fine.
Okay, now some real code, since that's a lot more useful than all these words.
Socket setup:
    int iResult;
    struct addrinfo *result = NULL;
    struct addrinfo hints;
    ZeroMemory(&hints, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_protocol = IPPROTO_TCP;
    hints.ai_flags = AI_PASSIVE;
    // "Resolve" our localhost
    iResult = getaddrinfo(NULL, port, &hints, &result);
    if (iResult != 0) {
        printf("error (2) : %d\n", iResult);
        return false;
    }
    // Create the socket
    listenSocket = socket(result->ai_family, result->ai_socktype, result->ai_protocol);
    if (listenSocket == INVALID_SOCKET) {
        freeaddrinfo(result);
        printf("error (3) : %d\n", WSAGetLastError());
        return false;
    }
    // Bind it
    iResult = bind(listenSocket, result->ai_addr, result->ai_addrlen);
    if (iResult == SOCKET_ERROR) {
        freeaddrinfo(result);
        closesocket(listenSocket);
        printf("error (4) : %d\n", WSAGetLastError());
        return false;
    }
    freeaddrinfo(result);
    // Listen
    iResult = listen(listenSocket, SOMAXCONN);
    if (iResult == SOCKET_ERROR) {
        closesocket(listenSocket);
        printf("%d\n", WSAGetLastError());
        return false;
    }
As you can probably see, it's almost directly taken from MSDN and should be fine. Besides, it works for 2/3 of the connections so I really doubt it's the setup code at fault.
The receiver code:
    if (listenSocket == INVALID_SOCKET) return false;
    #pragma warning(disable:4127)
    fd_set fds;
    SOCKET client;
    do {
        FD_ZERO(&fds);
        FD_SET(listenSocket, &fds);
        struct timeval timeout;
        timeout.tv_sec = 5;
        timeout.tv_usec = 0;
        if (!select(1, &fds, NULL, NULL, &timeout)) continue; // See you next loop!
        struct sockaddr_in addr;
        socklen_t addrlen = sizeof(addr);
        // Accept the socket
        client = accept(listenSocket, (struct sockaddr *)&addr, &addrlen);
        if (client == INVALID_SOCKET) {
            printf("[HTTP] Invalid socket\n");
            closesocket(listenSocket);
            return false;
        }
        // Set a 1s timeout on recv()
        struct timeval tv;
        tv.tv_sec = 1;
        tv.tv_usec = 0;
        setsockopt(client, SOL_SOCKET, SO_RCVTIMEO, (char*)&tv, sizeof(tv));
        // Receive the request
        char recvbuf[513];
        int iResult;
        std::stringbuf buf;
        clock_t end = clock() + CLOCKS_PER_SEC; // 1s from now
        do {
            iResult = recv(client, recvbuf, 512, 0);
            if (iResult > 0) {
                buf.sputn(recvbuf, iResult);
            } else if (iResult == 0) {
                // Hmm...
            } else {
                printf("[HTTP] Socket error: %d\n", WSAGetLastError());
                break;
            }
        } while (!requestComplete(&buf) && clock() < end);
This code prints "[HTTP] Socket error: 10060" (WSAETIMEDOUT, a connection timeout), so any code that comes after it is fairly irrelevant.
The select call is there because the actual loop does some other things as well, but I left those out because they're not socket-related.
Even stranger: according to Wireshark, Windows seems to be producing actual errors at the network level: http://i.imgur.com/BIrbD.png
I've been trying to figure this out for a while now, and I'm probably just doing something stupid, so I really appreciate all your answers.
I've been working on this annoying issue for an entire day now, and managed to eventually resolve it by rewriting the entire server from scratch and implementing it differently. I did trace the issue back to setsockopt, which doesn't seem to take SO_RCVTIMEO very well anymore, causing the timeout to go to zero seconds, which makes random connections time out.
My new implementation no longer uses a timeout, and is now simply non-blocking and asynchronous. It works very well, but it takes a lot more code.
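The rewritten server isn't shown here, so the following is not that implementation. It is only a minimal sketch of one common way to bound a recv() call without touching SO_RCVTIMEO at all: wait on the socket with select() first. The names wait_readable and socket_t are made up for illustration, and the #ifdef branch for non-Windows platforms is an assumption added so the sketch also compiles elsewhere.

```cpp
#include <cassert>

#ifdef _WIN32
  #include <winsock2.h>
  typedef SOCKET socket_t;
#else
  #include <sys/select.h>
  #include <sys/socket.h>
  #include <sys/time.h>
  typedef int socket_t;
#endif

// Hypothetical helper: waits until `s` is readable or the timeout expires.
// Returns 1 if readable (data or a closed connection), 0 on timeout,
// and -1 on error.
int wait_readable(socket_t s, long timeout_ms) {
    assert(timeout_ms >= 0);

    fd_set fds;
    FD_ZERO(&fds);
    FD_SET(s, &fds);

    // select() takes a timeval on every platform, Windows included.
    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    // Winsock ignores the first argument; POSIX needs highest descriptor + 1.
    return select(static_cast<int>(s) + 1, &fds, nullptr, nullptr, &tv);
}
```

With something like this, recv() would only be called when wait_readable(client, 1000) returns 1, and a return of 0 can be treated as the one-second timeout. Note that it checks for -1 separately, unlike a bare `!select(...)` test, which treats an error the same as a ready socket.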
I'll assume that it's simply a bug in Windows 8 that will be fixed with an update before it's released. I doubt that Microsoft wanted to change the Berkeley Sockets API like this.
On Windows, the SO_RCVTIMEO option requires a DWORD argument in milliseconds, not a timeval structure. See http://msdn.microsoft.com/en-us/library/windows/desktop/ms740476(v=vs.85).aspx.
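As a rough illustration, a small wrapper can hide the difference between the two argument types. This is only a sketch: set_recv_timeout_ms, split_ms, SplitTimeout and socket_t are hypothetical names, not part of Winsock or of the code in the question, and the non-Windows branch assumes the usual Berkeley behaviour of taking a timeval.

```cpp
#include <cassert>

#ifdef _WIN32
  #include <winsock2.h>
  typedef SOCKET socket_t;
#else
  #include <sys/socket.h>
  #include <sys/time.h>
  typedef int socket_t;
#endif

// Hypothetical helper type: a timeout split into seconds + microseconds,
// the form a timeval expects.
struct SplitTimeout {
    long sec;
    long usec;
};

SplitTimeout split_ms(unsigned long ms) {
    SplitTimeout parts;
    parts.sec = static_cast<long>(ms / 1000);
    parts.usec = static_cast<long>((ms % 1000) * 1000);
    return parts;
}

// Hypothetical helper: sets the recv() timeout in milliseconds.
// Returns 0 on success and -1 (SOCKET_ERROR on Windows) on failure.
int set_recv_timeout_ms(socket_t s, unsigned long ms) {
    assert(ms <= 0xFFFFFFFFul);  // the value must fit in a 32-bit DWORD
#ifdef _WIN32
    // Windows: a DWORD holding the timeout in milliseconds.
    DWORD timeout = static_cast<DWORD>(ms);
    return setsockopt(s, SOL_SOCKET, SO_RCVTIMEO,
                      reinterpret_cast<const char *>(&timeout),
                      static_cast<int>(sizeof(timeout)));
#else
    // Elsewhere: a timeval holding seconds and microseconds.
    const SplitTimeout parts = split_ms(ms);
    struct timeval tv;
    tv.tv_sec = parts.sec;
    tv.tv_usec = parts.usec;
    return setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
#endif
}
```

In the receiver code from the question, a call such as set_recv_timeout_ms(client, 1000) would take the place of the timeval-based setsockopt call.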
Passing a timeval causes Windows to interpret it as a DWORD, so the seconds member is read as if it were milliseconds. I don't know why a timeval argument worked in Windows versions prior to 8; it was probably an undocumented feature that was removed in Windows 8.
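The effect can be reproduced without any sockets. The sketch below uses WinTimeval, a made-up stand-in for the Windows timeval layout (two 32-bit fields), and assumes the option handler only looks at the first four bytes of the buffer, as described above. The function names are hypothetical as well.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in for the Windows timeval layout: two 32-bit fields.
// (Assumption for illustration; on other platforms timeval may be wider.)
struct WinTimeval {
    std::int32_t tv_sec;
    std::int32_t tv_usec;
};

// Reads the first four bytes of the struct the way an API expecting a
// 32-bit DWORD of milliseconds would.
std::uint32_t read_as_dword_ms(const WinTimeval &tv) {
    static_assert(sizeof(WinTimeval) >= sizeof(std::uint32_t),
                  "struct must hold at least one 32-bit value");
    std::uint32_t ms = 0;
    std::memcpy(&ms, &tv, sizeof(ms));
    return ms;
}

// Demonstrates the mismatch: an intended 1 s timeout is read as 1 ms.
void demo_mismatch() {
    WinTimeval tv;
    tv.tv_sec = 1;   // intended: one second
    tv.tv_usec = 0;
    assert(read_as_dword_ms(tv) == 1u);  // interpreted: one millisecond
}
```

A one-millisecond receive timeout is short enough that any request arriving slightly late would fail with a timeout, which would fit the intermittent failures described in the question.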