
How to speed up python networking?

I am finding python networking slow.

I have a server (written in C). I tested it with my client (python). I could reach 2MB/s. It worried me so I checked this:

host1 (client): cat some_big_file | nc host2 9999

host2 (server): nc -l 0.0.0.0 9999 | pv > /dev/null

I reached around 120 MB/s (close to the 1 Gb/s link limit).

The server is not a bottleneck; we use it in production and it can handle more. But to be sure, I copied a simple Python gevent server for tests. It looks like this:

    #!/usr/bin/env python
    from gevent.server import StreamServer
    from gevent.pool import Pool

    def handle(socket, address):
        while True:
            data = socket.recv(1024)
            if not data:  # peer closed the connection
                break
            print data

    pool = Pool(20000)
    server = StreamServer(('0.0.0.0', 9999), handle, spawn=pool)
    server.serve_forever()

Next, I measured sending from nc (host1) to the gevent server (host2).

host1: cat some_big_file | nc host2 9999

host2: ./gserver.py | pv > /dev/null

The output on host2: [ 101MB/s]. Not bad.

But still, when I use my Python client, it's slow. I switched the client to gevent and tried several greenlet counts (1, 10, 100, 1000); it didn't help much. I could reach 20 MB/s with one Python process, or ~30 MB/s with 2-5 separate Python processes (something, but still not good). So I rewrote the client to be as dumb as possible:

#!/usr/bin/env python
import sys
import socket

# usage: ./client.py <host> <port>
c = socket.create_connection((sys.argv[1], int(sys.argv[2])))
while True:
    c.send('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\n')

With this approach I could reach 10 MB/s. I also tried reading the whole big 2 GB file into memory and sending it; similar result.
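For what it's worth, the effect of the per-call send size can be reproduced without any network at all. Below is a rough local sketch (assuming a Unix host; this is not the original AWS benchmark) that pushes data through a socketpair with different send() chunk sizes and reports MB/s:

```python
# Local sketch (assumption: Unix socketpair support; not the original
# AWS setup): measure send() throughput for small vs. large chunks to
# see how much the per-call cost matters.
import socket
import threading
import time

def drain(sock):
    # Discard everything, like `pv > /dev/null` on the receiving side.
    while sock.recv(65536):
        pass

def throughput(chunk_size, total=16 * 1024 * 1024):
    a, b = socket.socketpair()
    t = threading.Thread(target=drain, args=(b,))
    t.start()
    payload = b"x" * chunk_size
    sent = 0
    start = time.time()
    while sent < total:
        a.sendall(payload)
        sent += len(payload)
    a.close()   # lets drain() see EOF and exit
    t.join()
    b.close()
    return sent / (time.time() - start) / 1e6  # MB/s

if __name__ == "__main__":
    for size in (38, 4096, 65536):
        print("%5d-byte sends: %8.1f MB/s" % (size, throughput(size)))
```

The absolute numbers depend on the machine, but the gap between 38-byte and 64 KB sends is consistently large.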

I also tried running the Python scripts as separate processes (using tmux). With 1 process I could reach 10 MB/s, with 2 processes 20 MB/s, with 3 processes 23 MB/s; 4, 5, or 6 processes didn't change anything (tested with both the gevent version and the simple one).

Details: Python 2.7.3, Debian 7 (standard installation). The machines are AWS instances; the client is c1.medium and the server is c3.xlarge. nc and iperf measured 1 Gb/s between the machines.

Questions:

  1. Why can I receive a lot of data quickly using the Python (gevent) server, but cannot send with the same speed, even though a C program can?
  2. Why does doubling the number of processes not increase sending speed up to the link limit, but only to some value?
  3. Is there any way to send data fast in Python using sockets?
asked Apr 23 '14 by spinus

1 Answer

The problem is not really that networking is slow: Python function calls have a lot of overhead. If you call connection.send many times, you waste a lot of CPU time on per-call overhead.
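That fixed per-call cost is easy to see in isolation. A small illustration (the Sink class below is made up for the demo, not part of any library; it stands in for a socket whose I/O cost is zero):

```python
# Microbenchmark of Python's per-call overhead (an illustration, not a
# network test): even a send() that does nothing costs a fixed amount
# per call, which dominates when each call carries only ~38 bytes.
import timeit

class Sink(object):
    def send(self, data):
        # Pretend to send; just report the number of "sent" bytes.
        return len(data)

sink = Sink()
calls = 1000000
seconds = timeit.timeit(lambda: sink.send(b"x"), number=calls)
print("%.3f microseconds per call" % (seconds / calls * 1e6))
```

At a fraction of a microsecond per call, a client making one send() per 38-byte record spends a large share of its CPU budget on call machinery rather than on moving data.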

On my computer, your program averages about 35 MB/s. Doing a simple modification, I get 450 MB/s:

#...
c.send('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'*10+'\n')

I could reach speeds over 1GB/s, by sending even more data at once.

If you want to maximize throughput, send as much data as possible in each call to send. A simple way of doing that is to concatenate several strings before sending the final result. If you do this, remember that Python strings are immutable, so repeated concatenation of large strings is slow; use a bytearray instead.
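A minimal sketch of that batching idea (the 64 KB threshold is an arbitrary choice, not a tuned value): accumulate small records in a bytearray and flush it with one sendall() once it is large enough.

```python
import socket

def send_batched(sock, records, threshold=64 * 1024):
    # Accumulate small records in a mutable buffer and flush it in one
    # call, instead of paying a send() per record.
    buf = bytearray()
    for rec in records:
        buf += rec
        if len(buf) >= threshold:
            sock.sendall(buf)  # one syscall covers many records
            del buf[:]         # empty the buffer in place
    if buf:
        sock.sendall(buf)      # flush the tail
```

Because the bytearray is mutable, appending to it and clearing it in place avoids the repeated reallocation that plain string concatenation would cause.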

answered Nov 01 '22 by loopbackbee