
asyncio doesn't send the entire image data over tcp

I am trying to send an image from my local computer to a computer in the cloud using asyncio over TCP. Sometimes the entire image is sent, and sometimes only part of it arrives.

client code

import os
os.environ['PYTHONASYNCIODEBUG'] = '1'
import asyncio
import logging

logging.basicConfig(level=logging.ERROR)
async def tcp_echo_client(data, loop):
    reader, writer = await asyncio.open_connection(<ip_addr>, <port>,
                                                   loop=loop)
    print('Sending data of size: %r' % str(len(data)))
    writer.write(data)
    await writer.drain()
    #print("Message: %r" %(data))
    print(type(data))
    print('Close the socket')
    writer.write_eof()
    writer.close()

with open('sendpic0.jpg','rb') as f:
    data=f.read()
loop = asyncio.get_event_loop()
loop.run_until_complete(tcp_echo_client(data, loop))
loop.close()

server code:

import os
os.environ['PYTHONASYNCIODEBUG'] = '1'
import asyncio
import logging

logging.basicConfig(level=logging.ERROR)

async def handle_echo(reader, writer):

    data = await reader.read()
    addr = writer.get_extra_info('peername')
    #print("Received %r from %r" % (message, addr))
    print("Length of data recieved: %r" % (str(len(data))))
    #with open('recvpic0.jpg','wb') as f:
    #    f.write(data)
    print("Close the client socket")
    writer.close()
    #print("Message: %r" %(data))
    print("Received data of length: %r" %(str(len(data))))


loop = asyncio.get_event_loop()
data=b''
coro = asyncio.start_server(handle_echo, '', <port_number>, loop=loop)
server = loop.run_until_complete(coro)
print("Received data of length: %r" %(str(len(data))))
# Serve requests until Ctrl+C is pressed
print('Serving on {}'.format(server.sockets[0].getsockname()))
try:
    loop.run_forever()
except KeyboardInterrupt:
    pass

# Close the server
server.close()
loop.run_until_complete(server.wait_closed())
loop.close()

I didn't give the ip address and port number on purpose but it shouldn't matter.

Here is the output:

server output

Received data of length: '0'
Serving on ('0.0.0.0', 50001)

Length of data recieved: '249216'
Close the client socket
Received data of length: '249216'                                                                              

Length of data recieved: '250624'       
Close the client socket                                                                          
Received data of length: '250624'

Length of data recieved: '256403'                                                                              
Close the client socket                                                  
Received data of length: '256403'                                                                                              

client output

$ python client.py       
Sending data of size: '256403' 
Close the socket
$ python client.py
<class 'bytes'>                                                                               
Close the socket                                                                              
$ python client.py       
Sending data of size: '256403'                                                                
<class 'bytes'>                                                                               
Close the socket                                                                              

I am using Python 3.6.

I don't know whether I am supposed to add a checking mechanism or send the data in chunks. I assumed all of that would happen automatically inside the read function.

I adjusted the code from this website: http://asyncio.readthedocs.io/en/latest/tcp_echo.html

ahat asked Jun 11 '18 01:06


1 Answer

This looks like the closing-writer bug described in detail in this article.

In short, writer.close() is not a coroutine, so you cannot await it to make sure the data has actually been flushed from asyncio's buffer to the OS. Awaiting writer.drain() before close() doesn't help, because drain() only pauses until the background writes reduce the buffer size to a "low watermark", and not, as one might expect, until the buffer is emptied.
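To make the watermark idea concrete, here is a minimal sketch that reads the write-buffer limits of a stream's transport. It assumes Python 3.7+ (for asyncio.run and wait_closed) and uses a throwaway local server on 127.0.0.1; the helper name read_buffer_limits is made up for illustration.

```python
import asyncio

async def read_buffer_limits():
    # Hypothetical helper: connect to a throwaway local server and
    # report the (low, high) write-buffer watermarks of the transport.
    async def handle(reader, writer):
        await reader.read()      # wait for EOF from the client
        writer.close()

    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    try:
        # get_write_buffer_limits() returns a (low, high) tuple in bytes.
        return writer.transport.get_write_buffer_limits()
    finally:
        writer.close()
        await writer.wait_closed()
        server.close()

low, high = asyncio.run(read_buffer_limits())
print('low watermark:', low, 'high watermark:', high)
```

Whenever the low watermark is above zero, drain() may return while bytes are still sitting in asyncio's buffer.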

UPDATE: As of Python 3.7, released in June 2018, the straightforward fix is to await writer.wait_closed() at the end of tcp_echo_client.
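A minimal sketch of that fix, assuming Python 3.7+. The names send_all and demo are made up for illustration, and a local server on 127.0.0.1 stands in for the cloud machine so the example is self-contained.

```python
import asyncio

async def send_all(host, port, data):
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(data)
    await writer.drain()
    writer.close()
    # Wait until the transport is really closed, which only happens
    # after the buffered data has been handed to the OS.
    await writer.wait_closed()

async def demo(payload):
    received = asyncio.get_running_loop().create_future()

    async def handle(reader, writer):
        data = await reader.read()   # read until EOF
        writer.close()
        received.set_result(len(data))

    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    try:
        await send_all('127.0.0.1', port, payload)
        return await asyncio.wait_for(received, timeout=10)
    finally:
        server.close()

# Same payload size as the image in the question.
received_length = asyncio.run(demo(b'x' * 256403))
print(received_length)
```

In the question's client, the equivalent change would be to add await writer.wait_closed() right after writer.close().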


At the time this answer was originally written, the only available fix was to copy the implementation of asyncio.open_connection (not quite as bad as it sounds, since it is in essence a short convenience function) and add a call to transport.set_write_buffer_limits(0). This makes await writer.drain() actually wait for all data to be written to the OS (which the referenced article argues would be the right thing to do anyway!):

import asyncio

# Copy of asyncio.open_connection with one added line (marked below).
async def fixed_open_connection(host=None, port=None, *,
                                loop=None, limit=65536, **kwds):
    if loop is None:
        loop = asyncio.get_event_loop()
    reader = asyncio.StreamReader(limit=limit, loop=loop)
    protocol = asyncio.StreamReaderProtocol(reader, loop=loop)
    transport, _ = await loop.create_connection(
        lambda: protocol, host, port, **kwds)
    ###### Following line added to fix buffering issues:
    # With both watermarks at 0, drain() waits until the buffer is empty.
    transport.set_write_buffer_limits(0)
    ######
    writer = asyncio.StreamWriter(transport, protocol, reader, loop)
    return reader, writer
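As a shorter variant of the same idea, the limits can apparently also be set through writer.transport right after the standard open_connection returns. This is an assumption rather than something the answer or the article states, and the sketch below (Python 3.7+, made-up names send_unbuffered and demo_unbuffered, local test server) only checks that the write buffer is empty once drain() returns.

```python
import asyncio

async def send_unbuffered(host, port, data):
    reader, writer = await asyncio.open_connection(host, port)
    # Assumption: setting the limits here has the same effect as
    # setting them inside a copied open_connection.
    writer.transport.set_write_buffer_limits(0)
    writer.write(data)
    await writer.drain()
    # Bytes still held in asyncio's buffer after drain().
    remaining = writer.transport.get_write_buffer_size()
    writer.close()
    return remaining

async def demo_unbuffered(payload):
    received = asyncio.get_running_loop().create_future()

    async def handle(reader, writer):
        data = await reader.read()   # read until EOF
        writer.close()
        received.set_result(len(data))

    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    try:
        remaining = await send_unbuffered('127.0.0.1', port, payload)
        length = await asyncio.wait_for(received, timeout=10)
        return remaining, length
    finally:
        server.close()

remaining_bytes, received_bytes = asyncio.run(demo_unbuffered(b'x' * 256403))
print(remaining_bytes, received_bytes)
```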

Weird that such a bug is hiding out in the main asyncio library.

I suspect that most people don't see the bug because they keep the event loop running for a longer time doing other things, so after writer.close() the data eventually gets written out and the socket closed.

user4815162342 answered Nov 12 '22 00:11
