I'm working on adding OpenSSL support to my server program, and generally it's working pretty well, but I have come across a problem.
First, some background: The server is single-threaded and uses non-blocking I/O and a select() loop to handle multiple clients simultaneously. The server is linked against libssl.0.9.8.dylib and libcrypto.0.9.8.dylib (i.e. the libraries provided in /usr/lib by Mac OS X 10.8.5). The client<->server protocol is a proprietary full-duplex messaging protocol: the clients and the server may all send and receive data at any time, and the client<->server TCP connections remain connected indefinitely (i.e. until the client or server decides to disconnect).
The issue is this: my clients can connect to the server, and sending and receiving data works fine (now that I've got the SSL_ERROR_WANT_WRITE and SSL_ERROR_WANT_READ logic sorted out)… but if the server accept()'s a new client connection while other clients are in the middle of sending or receiving data, the SSL layer seems to break. In particular, immediately after the server runs the SetupSSL() routine (shown below) on the newly-accepted socket, SSL_read() on one or more of the other (pre-existing) clients' sockets returns -1, and ERR_print_errors_fp(stderr) gives this output:
SSL_read() ERROR: 5673:error:140F3042:SSL routines:SSL_UNDEFINED_CONST_FUNCTION:called a function you should not call:/SourceCache/OpenSSL098/OpenSSL098-47.2/src/ssl/ssl_lib.c:2248:
After this error first appears, the server largely stops working. Data movement stops, and if I try to connect another client I often get this error:
SSL_read() ERROR: 5673:error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol:/SourceCache/OpenSSL098/OpenSSL098-47.2/src/ssl/s23_srvr.c:578:
This happens about 25% of the time in my test scenario. If I make sure that my pre-existing client connections are idle (no data being sent or received) at the moment when the new client connects, it never happens. Does anyone know what might be going wrong here? Have I found an OpenSSL bug, or is there some detail that I'm overlooking? Some relevant code from my program is pasted below, in case it's helpful.
// Socket setup routine, called when the server accepts a new TCP socket
int SSLSession :: SetupSSL(int sockfd)
{
   _ctx = SSL_CTX_new(SSLv23_method());
   if (_ctx)
   {
      SSL_CTX_set_mode(_ctx, SSL_MODE_ENABLE_PARTIAL_WRITE);

      _ssl = SSL_new(_ctx);
      if (_ssl)
      {
         _sbio = BIO_new_socket(sockfd, BIO_NOCLOSE);
         if (_sbio)
         {
            SSL_set_bio(_ssl, _sbio, _sbio);
            SSL_set_accept_state(_ssl);

            BIO_set_nbio(_sbio, !blocking);
            ERR_print_errors_fp(stderr);

            return RESULT_SUCCESS;
         }
         else fprintf(stderr, "SSLSession: BIO_new_socket() failed!\n");
      }
      else fprintf(stderr, "SSLSession: SSL_new() failed!\n");
   }
   else fprintf(stderr, "SSLSession: SSL_CTX_new() failed!\n");
   return RESULT_FAILURE;
}
// Socket read routine -- returns number of bytes read from SSL-land
int32 SSLSession :: Read(void *buffer, uint32 size)
{
   if (_ssl == NULL) return -1;

   int32 bytes = SSL_read(_ssl, buffer, size);
   if (bytes > 0)
   {
      _sslState &= ~(SSL_STATE_READ_WANTS_READABLE_SOCKET | SSL_STATE_READ_WANTS_WRITEABLE_SOCKET);
   }
   else if (bytes == 0) return -1;  // connection was terminated
   else
   {
      int err = SSL_get_error(_ssl, bytes);
      if (err == SSL_ERROR_WANT_WRITE)
      {
         // We have to wait until our socket is writeable, and then repeat our SSL_read() call.
         _sslState &= ~SSL_STATE_READ_WANTS_READABLE_SOCKET;
         _sslState |=  SSL_STATE_READ_WANTS_WRITEABLE_SOCKET;
         bytes = 0;
      }
      else if (err == SSL_ERROR_WANT_READ)
      {
         // We have to wait until our socket is readable, and then repeat our SSL_read() call.
         _sslState |=  SSL_STATE_READ_WANTS_READABLE_SOCKET;
         _sslState &= ~SSL_STATE_READ_WANTS_WRITEABLE_SOCKET;
         bytes = 0;
      }
      else
      {
         fprintf(stderr, "SSL_read() ERROR: ");
         ERR_print_errors_fp(stderr);
      }
   }
   return bytes;
}
// Socket write routine -- returns number of bytes written to SSL-land
int32 SSLSession :: Write(const void *buffer, uint32 size)
{
   if (_ssl == NULL) return -1;

   int32 bytes = SSL_write(_ssl, buffer, size);
   if (bytes > 0)
   {
      _sslState &= ~(SSL_STATE_WRITE_WANTS_READABLE_SOCKET | SSL_STATE_WRITE_WANTS_WRITEABLE_SOCKET);
   }
   else if (bytes == 0) return -1;  // connection was terminated
   else
   {
      int err = SSL_get_error(_ssl, bytes);
      if (err == SSL_ERROR_WANT_READ)
      {
         // We have to wait until our socket is readable, and then repeat our SSL_write() call.
         _sslState |=  SSL_STATE_WRITE_WANTS_READABLE_SOCKET;
         _sslState &= ~SSL_STATE_WRITE_WANTS_WRITEABLE_SOCKET;
         bytes = 0;
      }
      else if (err == SSL_ERROR_WANT_WRITE)
      {
         // We have to wait until our socket is writeable, and then repeat our SSL_write() call.
         _sslState &= ~SSL_STATE_WRITE_WANTS_READABLE_SOCKET;
         _sslState |=  SSL_STATE_WRITE_WANTS_WRITEABLE_SOCKET;
         bytes = 0;
      }
      else
      {
         fprintf(stderr, "SSL_write() ERROR! ");
         ERR_print_errors_fp(stderr);
      }
   }
   return bytes;
}
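For reference, here is a simplified sketch of the kind of select() loop that drives these routines. This is not my actual event loop (the session container and the WantsWriteableSocket() helper are illustrative placeholders, and error handling, session teardown, and outgoing-data flushing are omitted), but it shows how new sockets get handed to SetupSSL() and how the SSL_STATE_* flags are meant to be consumed:

// Simplified sketch of a select()-based event loop driving SSLSession objects.
// The std::map container and the WantsWriteableSocket() helper are placeholders,
// not part of the real program.
#include <sys/select.h>
#include <sys/socket.h>
#include <map>

void EventLoop(int listenSock, std::map<int, SSLSession *> & sessions)
{
   while(true)
   {
      fd_set readSet, writeSet;
      FD_ZERO(&readSet);
      FD_ZERO(&writeSet);

      FD_SET(listenSock, &readSet);   // watch for new incoming connections
      int maxFD = listenSock;

      for (std::map<int, SSLSession *>::iterator it = sessions.begin(); it != sessions.end(); ++it)
      {
         const int fd = it->first;
         FD_SET(fd, &readSet);                                       // always interested in incoming data
         if (it->second->WantsWriteableSocket()) FD_SET(fd, &writeSet);  // e.g. any SSL_STATE_*_WANTS_WRITEABLE_SOCKET flag set
         if (fd > maxFD) maxFD = fd;
      }

      if (select(maxFD+1, &readSet, &writeSet, NULL, NULL) < 0) break;

      if (FD_ISSET(listenSock, &readSet))
      {
         const int newFD = accept(listenSock, NULL, NULL);   // (fcntl() call to make newFD non-blocking omitted)
         if (newFD >= 0)
         {
            SSLSession * s = new SSLSession;
            if (s->SetupSSL(newFD) == RESULT_SUCCESS) sessions[newFD] = s;
            else delete s;
         }
      }

      for (std::map<int, SSLSession *>::iterator it = sessions.begin(); it != sessions.end(); ++it)
      {
         const int fd = it->first;
         // Because of WANT_READ/WANT_WRITE, a writeable socket may require another
         // SSL_read() call, and a readable socket another SSL_write() call.
         if (FD_ISSET(fd, &readSet) || FD_ISSET(fd, &writeSet))
         {
            char buf[4096];
            const int32 numRead = it->second->Read(buf, sizeof(buf));
            if (numRead > 0) { /* hand the bytes to the messaging protocol */ }
            // (EOF/error handling and flushing of queued outgoing data via Write() omitted)
         }
      }
   }
}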
Someone on the openssl-users mailing list helped me figure this out: the problem was that I was setting up my SSL sessions with SSLv23_method(), and when using SSLv23_method() you must not call SSL_pending() until after the SSL handshake has finished negotiating which protocol (SSLv2, SSLv3, TLSv1, etc.) it's actually going to use.
Since my application doesn't require compatibility with older versions of SSL, the quick work-around for me is to call SSLv3_method() during setup instead of SSLv23_method() (i.e. change the SSL_CTX_new(SSLv23_method()) call in SetupSSL()). If backwards compatibility were needed, I'd have to figure out some way of detecting when the protocol negotiation had completed and avoid calling SSL_pending() until then; but I'm going to ignore that issue for now, since I don't need the functionality.
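If someone does need the SSLv23_method() behavior, one possible approach (an untested sketch on my part, not something from the mailing list) would be to gate any SSL_pending() calls on SSL_is_init_finished(), which reports whether the handshake has completed:

// Untested sketch: only consult SSL_pending() once the handshake has finished, so that
// SSLv23_method()'s protocol auto-negotiation is complete before we ask.
// HasBytesBuffered() is a hypothetical wrapper, not an existing method of my class.
int SSLSession :: HasBytesBuffered()
{
   if (_ssl == NULL) return 0;
   if (SSL_is_init_finished(_ssl) == 0) return 0;  // handshake still in progress; don't call SSL_pending() yet
   return SSL_pending(_ssl);
}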