
Persistence of urllib.request connections to a HTTP server

I want to do some performance testing on one of our web servers, to see how the server handles a lot of persistent connections. Unfortunately, I'm not terribly familiar with HTTP and web testing. Here's the Python code I've got for this so far:

import http.client
import argparse
import threading


def make_http_connection():
    conn = http.client.HTTPConnection(options.server, timeout=30)
    conn.connect()


if __name__ == '__main__':
    parser = argparse.ArgumentParser()

    parser.add_argument("num", type=int, help="Number of connections to make (integer)")
    parser.add_argument("server", type=str, help="Server and port to connect to. Do not prepend \'http://\' for this")

    options = parser.parse_args()

    for n in range(options.num):
        connThread = threading.Thread(target=make_http_connection)
        connThread.daemon = True
        connThread.start()

    while True:
        try:
            pass
        except KeyboardInterrupt:
            break

My main question is this: How do I keep these connections alive? I've set a long timeout, but that's a very crude method and I'm not even sure it affects the connection. Would simply requesting a byte or two every once in a while do it?

(Also, on an unrelated note, is there a better procedure for waiting for a keyboard interrupt than the ugly while True: block at the end of my code?)

asked Dec 28 '22 by Kudzu

1 Answer

urllib.request doesn't support persistent connections; it hardcodes a 'Connection: close' header into every request. http.client, however, partially supports persistent connections (including the legacy HTTP/1.0 keep-alive), so the question title may be misleading.
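
To see the difference, here's a minimal sketch (example.com is a placeholder host): http.client keeps the underlying socket open between requests as long as neither side closes it.

import http.client

# Sketch: http.client can reuse one TCP connection for several requests.
conn = http.client.HTTPConnection('example.com', timeout=30)
for _ in range(2):
    conn.request('GET', '/')
    resp = conn.getresponse()
    resp.read()  # the body must be drained before the connection can be reused
    print(resp.status, resp.getheader('Connection'))
conn.close()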


I want to do some performance testing on one of our web servers, to see how the server handles a lot of persistent connections. Unfortunately, I'm not terribly familiar with HTTP and web testing.

You could use existing HTTP testing tools such as slowloris or httperf instead of writing one yourself.


How do I keep these connections alive?

To close an HTTP/1.1 connection, the client must explicitly send a Connection: close header; otherwise the server considers the connection persistent (though it may close it at any moment, and http.client won't know about it until it tries to read from or write to the connection).
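
For example, a sketch of the non-persistent case (example.com is a placeholder; will_close is the attribute http.client sets on the response when the connection won't be reused):

import http.client

# Sketch: explicitly asking the server to close the connection.
conn = http.client.HTTPConnection('example.com')
conn.request('GET', '/', headers={'Connection': 'close'})
resp = conn.getresponse()
resp.read()
print(resp.will_close)  # True: this connection won't be reused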

conn.connect() returns almost immediately, so each of your threads ends right away. To force each thread to maintain an HTTP connection to the server you could:

import http.client
import time


def make_http_connection(*args, **kwargs):
    while True:  # make a new connection after any failure
        h = http.client.HTTPConnection(*args, **kwargs)
        while True:  # make multiple requests over a single connection
            try:
                h.request('GET', '/')  # send a request; connects on the first run
                response = h.getresponse()
                while True:  # read the response slooowly
                    b = response.read(1)  # read 1 byte
                    if not b:
                        break
                    time.sleep(60)  # wait a minute before reading the next byte
                    # note: the whole minute might pass before we notice that
                    # the server has already closed the connection
            except Exception:
                break  # make a new connection on any error

Note: if the server returns 'Connection: close', then only a single request per connection is possible.
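
For completeness, the question's threading loop could call this version like so (options.server and options.num come from the argparse setup above):

for _ in range(options.num):
    t = threading.Thread(target=make_http_connection,
                         args=(options.server,), kwargs={'timeout': 30})
    t.daemon = True
    t.start()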


(Also, on an unrelated note, is there a better procedure for waiting for a keyboard interrupt than the ugly while True: block at the end of my code?)

To wait until all threads finish or a KeyboardInterrupt happens (assuming threads is the list of Thread objects you started), you could:

while threads:
    try:
        for t in threads[:]:  # iterate over a copy so items can be removed
            t.join(.1)  # timeout 0.1 seconds
            if not t.is_alive():
                threads.remove(t)
    except KeyboardInterrupt:
        break

Or something like this:

while threading.active_count() > 1:
    try:
        main_thread = threading.current_thread()
        for t in threading.enumerate():  # enumerate all live threads
            if t is not main_thread:
                t.join(.1)
    except KeyboardInterrupt:
        break

The latter might not work for various reasons, e.g., if there are dummy threads such as threads started in C extensions without using the threading module.

concurrent.futures.ThreadPoolExecutor provides a higher level of abstraction than the threading module and can hide some of this complexity.
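
A minimal sketch, reusing make_http_connection and options from above:

import concurrent.futures

# Sketch: the executor creates and joins the worker threads for us.
with concurrent.futures.ThreadPoolExecutor(max_workers=options.num) as pool:
    for _ in range(options.num):
        pool.submit(make_http_connection, options.server)
    # leaving the with-block waits for all submitted tasks to finish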

Instead of a thread-per-connection model, you could open multiple connections concurrently in a single thread, e.g., using requests.async (since moved out into the separate grequests package) or gevent directly.
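
For example, a sketch with gevent (assumes gevent is installed and that make_http_connection is defined after the monkey-patching, which makes the blocking http.client calls cooperative):

from gevent import monkey; monkey.patch_all()  # must run before other imports
import gevent

# Sketch: one OS thread, many greenlets, each holding a connection open.
jobs = [gevent.spawn(make_http_connection, options.server)
        for _ in range(options.num)]
gevent.joinall(jobs)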

answered Dec 31 '22 by jfs