Writing a socket-based server in Python, recommended strategies?

Tags:

I was recently reading this document which lists a number of strategies that could be employed to implement a socket server. Namely, they are:

Serve many clients with each thread, and use nonblocking I/O and level-triggered readiness notification
Serve many clients with each thread, and use nonblocking I/O and readiness change notification
Serve many clients with each server thread, and use asynchronous I/O
serve one client with each server thread, and use blocking I/O
Build the server code into the kernel

Now, I would appreciate a hint on which should be used in CPython, which we know has some good points, and some bad points. I am mostly interested in performance under high concurrency, and yes a number of the current implementations are too slow.

So if I may start with the easy one, "5" is out, as I am not going to be hacking anything into the kernel.

"4" Also looks like it must be out because of the GIL. Of course, you could use multiprocessing in place of threads here, and that does give a significant boost. Blocking IO also has the advantage of being easier to understand.

And here my knowledge wanes a bit:

"1" is traditional select or poll which could be trivially combined with multiprocessing.

"2" is the readiness-change notification, used by the newer epoll and kqueue

"3" I am not sure there are any kernel implementations for this that have Python wrappers.

So, in Python we have a bag of great tools like Twisted. Perhaps they are a better approach, though I have benchmarked Twisted and found it too slow on a multiple processor machine. Perhaps having 4 twisteds with a load balancer might do it, I don't know. Any advice would be appreciated.

424

asked Mar 11 '09 11:03

Ali Afshar

2 Answers

asyncore is basically "1" - It uses select internally, and you just have one thread handling all requests. According to the docs it can also use poll. (EDIT: Removed Twisted reference, I thought it used asyncore, but I was wrong).

"2" might be implemented with python-epoll (Just googled it - never seen it before). EDIT: (from the comments) In python 2.6 the select module has epoll, kqueue and kevent build-in (on supported platforms). So you don't need any external libraries to do edge-triggered serving.

Don't rule out "4", as the GIL will be dropped when a thread is actually doing or waiting for IO-operations (most of the time probably). It doesn't make sense if you've got huge numbers of connections of course. If you've got lots of processing to do, then python may not make sense with any of these schemes.

For flexibility maybe look at Twisted?

In practice your problem boils down to how much processing you are going to do for requests. If you've got a lot of processing, and need to take advantage of multi-core parallel operation, then you'll probably need multiple processes. On the other hand if you just need to listen on lots of connections, then select or epoll, with a small number of threads should work.

185

answered Oct 14 '22 04:10

Douglas Leeder

How about "fork"? (I assume that is what the ForkingMixIn does) If the requests are handled in a "shared nothing" (other than DB or file system) architecture, fork() starts pretty quickly on most *nixes, and you don't have to worry about all the silly bugs and complications from threading.

Threads are a design illness forced on us by OSes with too-heavy-weight processes, IMHO. Cloning a page table with copy-on-write attributes seems a small price, especially if you are running an interpreter anyway.

Sorry I can't be more specific, but I'm more of a Perl-transitioning-to-Ruby programmer (when I'm not slaving over masses of Java at work)

Update: I finally did some timings on thread vs fork in my "spare time". Check it out:

http://roboprogs.com/devel/2009.04.html

Expanded: http://roboprogs.com/devel/2009.12.html

answered Oct 14 '22 04:10

Roboprog

Related questions
                            
                                How to convert timestamp into string in Python
                            
                                Celery: The module was not found
                            
                                How to upload the python packages to Nexus sonartype private repo
                            
                                ElementClickInterceptedException: element click intercepted: [duplicate]
                            
                                socket.gaierror: [Errno -2] Name or service not known | Python
                            
                                How to filter Python list while keeping filtered values zero
                            
                                Upload Base64 Image to S3 and return URL
                            
                                Differences between re.match, re.search, re.fullmatch [duplicate]
                            
                                SciKit-Learn Label Encoder resulting in error 'argument must be a string or number'
                            
                                Record voice with recorder.js and upload it to python-flask server, but WAV file is broken
                            
                                No module named 'socks'
                            
                                Error : Failed to create temp directory "C:\Users\user\AppData\Local\Temp\conda-<RANDOM>\"
                            
                                How do I plot a Keras/Tensorflow subclassing API model?
                            
                                Plotly: How to only show vertical and horizontal line (crosshair) as hoverinfo?
                            
                                E: Package 'python-pip' has no installation candidate
                            
                                Execution time difference between x += y and x = x + y [duplicate]
                            
                                ModuleNotFoundError: No module named 'jsonschema.compat'
                            
                                Dealing with a string containing multiple character encodings
                            
                                How to create a numpy record array from C
                            
                                setting the gzip timestamp from Python

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Writing a socket-based server in Python, recommended strategies?

Tags:

python

asynchronous

sockets

network-programming

c10k

Ali Afshar

People also ask

2 Answers

Douglas Leeder

Roboprog

Recent Activity

Donate For Us