
How to use select with Python ssl socket buffering?

My problem is similar to "python - How select.select() works?". However, the solution there doesn't work for me, because I'm not open()ing a file; I'm using a socket, and I couldn't find any way in the documentation to make it unbuffered.

I have a glib mainloop (which uses select), where I registered the socket for reading. Because socket.recv() requires me to specify a receive buffer size, it is not unusual to read fewer bytes than the socket has received. As long as the kernel buffers the rest, that is fine; select will still mark the socket as "ready for reading". But apparently Python has a buffer as well. With large files, near the end of the data stream, recv() will read part of the data, the rest will be buffered by Python, and select no longer triggers on my socket until new data is sent. At that point, the "missing" data is received before the new data; no data is lost.

My question is: how do I solve this? Is there a way to disable Python's buffer on the socket? If not, is there a way to check if the buffer is empty, so I can make sure I don't return from my callback until it is?

Edit:

As noted in the comment, Python doesn't add an extra buffer to sockets, so this could not be the problem. I was unable to create a minimal example for the problem. However, it seems that it may be related to using ssl sockets. I had forgotten that I used an encrypted connection; disabling the encryption seems to solve this issue, but is not acceptable to me. So the above question remains, with the note that the buffers are probably implemented in the ssl module.

Example code to show the problem:

#!/usr/bin/python

import glib
import socket
import ssl

def cb (fd, cond):
    print ('data: %s' % repr (s.read (1)))
    return True

s = ssl.wrap_socket (socket.create_connection (('localhost', 1234)))
glib.io_add_watch (s.fileno (), glib.IO_IN, cb)
glib.MainLoop ().run ()

Then run a server with

openssl s_server -accept 1234 -key file.key -cert file.crt

Running the python program will establish the connection. Sending more than one byte of data at once will make the program print only the first byte. When more data is sent later, the remaining buffered bytes are read first, then the first new byte, and then it waits again. This is easy to understand: as long as there is data in the ssl buffer, each read is served from that buffer and the new data stays in the kernel buffer, so select keeps reporting the socket as readable until the ssl buffer has been drained.

Bas Wijnen asked Mar 28 '26 00:03


2 Answers

Looking into the ssl source, I found an undocumented function which does what I want: pending(). It can be used like so:

#!/usr/bin/python

import glib
import socket
import ssl

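def cb(fd, cond):
    # select() only fires when the kernel buffer has data, so read once...
    print('data: %s' % repr(s.read(1)))
    # ...then drain whatever the ssl layer has already decrypted.
    while s.pending():
        print('more data: %s' % repr(s.read(1)))
    return True

s = ssl.wrap_socket(socket.create_connection(('localhost', 1234)))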
glib.io_add_watch(s.fileno(), glib.IO_IN, cb)
glib.MainLoop().run()

This solves the problem.
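
For reading in larger chunks than one byte, the same idea can be wrapped in a small helper. This is only a sketch: `read_available` and `bufsize` are names made up for illustration, and the socket is assumed to be an already-wrapped `ssl.SSLSocket` (note that `ssl.wrap_socket()` was removed in Python 3.12, so on current versions the socket would come from `SSLContext.wrap_socket()` instead).

```python
def read_available(sock, bufsize=4096):
    """Read one chunk, then keep reading while the TLS layer holds decrypted bytes.

    `read_available` and `bufsize` are illustrative names, not part of the ssl module.
    `sock` is assumed to offer recv() and pending(), as ssl.SSLSocket does.
    """
    # select() reported the socket as readable, so start with a normal read.
    chunks = [sock.recv(bufsize)]
    # pending() reports bytes that are already decrypted but not yet handed out.
    # select() cannot see them, because they have already left the kernel buffer.
    while sock.pending():
        chunks.append(sock.recv(min(bufsize, sock.pending())))
    return b"".join(chunks)
```

Calling this from the watch callback instead of a single read should leave nothing behind in the ssl buffer when the callback returns. An empty result means the peer closed the connection.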

Bas Wijnen answered Mar 31 '26 08:03


I ran into the same problem. The Python docs now have a note on this point, possibly in response to your bug report:

Conversely, since the SSL layer has its own framing, a SSL socket may still have data available for reading without select() being aware of it. Therefore, you should first call SSLSocket.recv() to drain any potentially available data, and then only block on a select() call if still necessary.

If you want to avoid blocking, the solution as you note seems to be to use the pending() function on the ssl.SSLSocket class.
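
Another way to follow the advice in the docs, if the socket is in non-blocking mode, is to keep calling recv() until the ssl module raises `ssl.SSLWantReadError`, which it does when no more application data can be produced without further input from the network. The sketch below assumes `sock.setblocking(False)` has been called on the SSL socket; `drain_nonblocking` is a hypothetical helper name, not something the ssl module provides.

```python
import ssl


def drain_nonblocking(sock, bufsize=4096):
    """Return (data, eof) after reading until the TLS layer has nothing more to give.

    Assumes `sock` is a non-blocking ssl.SSLSocket. The function name and the
    (data, eof) return shape are illustrative choices.
    """
    chunks = []
    eof = False
    while True:
        try:
            chunk = sock.recv(bufsize)
        except ssl.SSLWantReadError:
            # Nothing more to read until more TLS data arrives;
            # at this point it is safe to go back to select().
            break
        if not chunk:
            # An empty result means the peer closed the connection.
            eof = True
            break
        chunks.append(chunk)
    return b"".join(chunks), eof
```

This variant does not rely on pending() at all. A complete implementation would probably also want to handle `ssl.SSLWantWriteError`, which a read can raise when the TLS layer needs to send something first.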

Edit: I tried replacing a simple call to select in my code with the following (socks is a list which may include both normal and SSL sockets). The idea is to check first if there are any pending bytes on an SSL socket before blocking on select:

r = [s for s in socks if isinstance(s, ssl.SSLSocket) and s.pending()]
if not r:
    r, _, _ = select.select(socks, [], [], 1.0)

It seems to work as expected. My understanding is that:

  1. select can give false negatives for SSL sockets, but not false positives, and
  2. false negatives will only occur after a previous invocation returned true and the socket was read from (i.e. because some bytes in the SSL frame remained unread)

Assuming these are both true, I think the code above should service all sockets reliably and with no unnecessary waiting or blocking.
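
As a self-contained variant of the snippet above, here is a rough sketch of the same check as a function. The name `readable` and its `timeout` parameter are made up for illustration. It differs from my snippet in one respect: when some SSL socket has pending bytes, it still polls select with a zero timeout, so plain sockets that happen to be ready in the same round are not left out.

```python
import select
import ssl


def readable(socks, timeout=1.0):
    """Return the sockets that can be read without waiting for the network.

    `readable` is an illustrative helper; `socks` may mix plain and SSL sockets.
    """
    # SSL sockets with already-decrypted bytes are readable even if select() disagrees.
    ready = [s for s in socks if isinstance(s, ssl.SSLSocket) and s.pending()]
    if ready:
        # Poll with a zero timeout so other sockets that are ready are included too.
        polled, _, _ = select.select(socks, [], [], 0)
        ready += [s for s in polled if s not in ready]
        return ready
    # Nothing is buffered in the TLS layer, so blocking on select() is safe.
    polled, _, _ = select.select(socks, [], [], timeout)
    return polled
```

A main loop would call `readable(socks)` on each iteration and read from every socket it returns.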

deltacrux answered Mar 31 '26 08:03


