
Python sockets buffering

Let's say I want to read a line from a socket, using the standard socket module:

```python
def read_line(s):
    ret = b''

    while True:
        c = s.recv(1)  # one byte per call; recv() returns bytes on Python 3

        if c == b'\n' or c == b'':
            break
        else:
            ret += c

    return ret
```

What exactly happens in s.recv(1)? Will it issue a system call each time? I guess I should add some buffering, anyway:

"For best match with hardware and network realities, the value of bufsize should be a relatively small power of 2, for example, 4096."

(from http://docs.python.org/library/socket.html#socket.socket.recv)

But it doesn't seem easy to write efficient and thread-safe buffering. What if I use file.readline()?

```python
# does this work well, is it efficiently buffered?
s.makefile().readline()
```

asked May 04 '09 20:05

Bastien Léonard

2 Answers

If you are concerned with performance and control the socket completely (you are not passing it into a library, for example), then try implementing your own buffering in Python. Python's string.find, string.split and similar operations can be amazingly fast.

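```python
def linesplit(sock):
    # recv() returns bytes on Python 3, so the buffer and separator are bytes too.
    buffer = sock.recv(4096)
    buffering = True
    while buffering:
        if b"\n" in buffer:
            # Hand out one complete line and keep the remainder buffered.
            (line, buffer) = buffer.split(b"\n", 1)
            yield line + b"\n"
        else:
            more = sock.recv(4096)
            if not more:
                # Empty result: the peer closed the connection.
                buffering = False
            else:
                buffer += more
    if buffer:
        # Trailing data without a final newline.
        yield buffer
```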

If you expect the payload to consist of lines that are not too huge, that should run pretty fast and avoid jumping through too many layers of function calls unnecessarily. I'd be interested in knowing how this compares to file.readline() or using socket.recv(1).
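One rough way to compare the approaches is to count how many recv() calls each one makes for the same payload. The sketch below is only an illustration under some assumptions: CountingSocket, read_lines_bytewise, read_lines_chunked and count_recv_calls are hypothetical names, it uses socket.socketpair() as a stand-in for a real connection, and the payload is kept small so it fits in the socket buffer before anything is read.

```python
import socket


class CountingSocket:
    """Hypothetical wrapper that counts recv() calls made on a real socket."""

    def __init__(self, sock):
        self._sock = sock
        self.recv_calls = 0

    def recv(self, bufsize):
        self.recv_calls += 1
        return self._sock.recv(bufsize)


def read_lines_bytewise(sock):
    """Read until EOF with one recv(1) per byte, splitting on newlines."""
    lines, current = [], b""
    while True:
        c = sock.recv(1)
        if not c:  # empty bytes: peer closed the connection
            break
        current += c
        if c == b"\n":
            lines.append(current)
            current = b""
    if current:
        lines.append(current)
    return lines


def read_lines_chunked(sock, bufsize=4096):
    """Read until EOF in chunks of up to bufsize bytes, splitting on newlines."""
    lines, buffer = [], b""
    while True:
        more = sock.recv(bufsize)
        if not more:
            break
        buffer += more
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            lines.append(line + b"\n")
    if buffer:
        lines.append(buffer)
    return lines


def count_recv_calls(reader, payload):
    """Send payload over a local socket pair and count the reader's recv() calls."""
    a, b = socket.socketpair()
    try:
        a.sendall(payload)
        a.close()  # signals EOF to the reading side
        counting = CountingSocket(b)
        return reader(counting), counting.recv_calls
    finally:
        a.close()
        b.close()
```

Calling count_recv_calls(read_lines_bytewise, payload) and count_recv_calls(read_lines_chunked, payload) with the same payload returns the same lines, but the byte-wise reader needs one recv() per byte plus one for end-of-file, while the chunked reader needs far fewer.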

answered Oct 19 '22 12:10

Aaron Watters

Each recv() call is passed straight through to the C library's recv() function, so s.recv(1) does issue a system call every time. It blocks until the socket has data; Python simply lets the underlying recv() system call block.

file.readline() is an efficient buffered implementation. It is not thread-safe, because it presumes it is the only reader of the file (for example, it buffers upcoming input).

If you are using the file object, every time read() is called with a positive argument, the underlying code will recv() only the amount of data requested, unless it's already buffered.

Data would already be buffered if you had called readline(), which reads a full buffer, and the end of the line came before the end of that buffer, leaving the rest of the data in the buffer. Otherwise the buffer is generally not overfilled.
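As a minimal sketch of the file-object approach on Python 3, the helper below reads lines through makefile("rb"). The name lines_via_makefile is hypothetical, and socket.socketpair() stands in for a real connection; the payload is assumed small enough to fit in the socket buffer before reading starts.

```python
import socket


def lines_via_makefile(payload):
    """Send payload over a local socket pair and read it back line by line."""
    a, b = socket.socketpair()
    try:
        a.sendall(payload)
        a.close()  # signals EOF to the reading side
        # makefile("rb") returns a buffered binary file object over the socket.
        with b.makefile("rb") as reader:
            lines = []
            while True:
                line = reader.readline()
                if not line:  # empty bytes: end of stream
                    break
                lines.append(line)
            return lines
    finally:
        a.close()
        b.close()
```

Because the file object may read ahead into its own buffer, mixing reader.readline() with direct recv() calls on the same socket, or sharing the reader between threads without a lock, can lose or reorder data.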

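The goal of the question is not clear. If you need to see whether data is available before reading, you can use select(), or set the socket to non-blocking mode with s.setblocking(False). In non-blocking mode, s.recv() raises an error (BlockingIOError on Python 3, socket.error with EAGAIN or EWOULDBLOCK on Python 2) rather than blocking when no data is waiting; an empty return value still means the peer closed the connection.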
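Both ways of checking for data can be sketched as follows for Python 3. The helper names recv_if_ready and recv_nonblocking are hypothetical; note that on Python 3 a non-blocking recv() with no data waiting raises BlockingIOError, which the second helper turns into None.

```python
import select
import socket


def recv_if_ready(sock, timeout=0.0, bufsize=4096):
    """Return received bytes if sock is readable within timeout, else None."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(bufsize)


def recv_nonblocking(sock, bufsize=4096):
    """Try a single non-blocking recv(); return None if no data is waiting."""
    sock.setblocking(False)
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        return None
    finally:
        sock.setblocking(True)  # restore blocking mode for later callers
```

In both helpers a result of None means "nothing to read yet", while an empty bytes object means the peer has closed the connection.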

Are you reading one file or socket with multiple threads? I would put a single worker on reading the socket and feeding received items into a queue for handling by other threads.
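A minimal sketch of that single-reader design, assuming Python 3: one worker thread owns the socket and pushes complete lines onto a queue.Queue. The names reader_worker and collect_lines are hypothetical, None is used as an end-of-stream sentinel, and socket.socketpair() stands in for a real connection.

```python
import queue
import socket
import threading


def reader_worker(sock, lines):
    """Only this thread reads sock; complete lines are put on the queue."""
    buffer = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # peer closed the connection
            break
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            lines.put(line + b"\n")
    if buffer:
        lines.put(buffer)  # trailing data without a final newline
    lines.put(None)  # sentinel: no more lines will arrive


def collect_lines(payload):
    """Send payload through a socket pair and gather lines from the queue."""
    a, b = socket.socketpair()
    lines = queue.Queue()
    worker = threading.Thread(target=reader_worker, args=(b, lines))
    worker.start()
    try:
        a.sendall(payload)
    finally:
        a.close()  # lets the worker see end-of-stream
    received = []
    while True:
        item = lines.get()
        if item is None:
            break
        received.append(item)
    worker.join()
    b.close()
    return received
```

With several consumer threads, each one would call lines.get() in its own loop, and the worker would need to put one sentinel per consumer.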

I suggest consulting the source of the Python socket module and the C source that makes the system calls.

answered Oct 19 '22 13:10

Joe Koberg
