 

Why is Python 3 http.client so much faster than python-requests?


I was testing different Python HTTP libraries today and realized that the http.client library seems to perform much faster than python-requests.

To test it, you can run the following two code samples.

import http.client

conn = http.client.HTTPConnection("localhost", port=8000)
for i in range(1000):
    conn.request("GET", "/")
    r1 = conn.getresponse()
    body = r1.read()
    print(r1.status)

conn.close()

and here is the code doing the same thing with python-requests:

import requests

with requests.Session() as session:
    for i in range(1000):
        r = session.get("http://localhost:8000")
        print(r.status_code)

If I start Python's built-in HTTP server:

> python -m http.server 

and run the above code samples (I'm using Python 3.5.2), I get the following results:

http.client:

0.35user 0.10system 0:00.71elapsed 64%CPU  

python-requests:

1.76user 0.10system 0:02.17elapsed 85%CPU  

Are my measurements and tests correct? Can you reproduce them too? If so, does anyone know what's going on inside http.client that makes it so much faster? Why is there such a big difference in processing time?
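For reference, here is a rough in-process timing sketch (assuming the same local server on port 8000) that excludes interpreter startup from the measurement:

import http.client
import time

import requests

N = 1000

start = time.perf_counter()
conn = http.client.HTTPConnection("localhost", port=8000)
for _ in range(N):
    conn.request("GET", "/")
    conn.getresponse().read()  # drain the body before reusing the connection
conn.close()
print("http.client:", time.perf_counter() - start)

start = time.perf_counter()
with requests.Session() as session:
    for _ in range(N):
        session.get("http://localhost:8000")
print("requests:", time.perf_counter() - start)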

Asked Sep 11 '16 by Pawel Miech




2 Answers

Based on profiling both, the main difference appears to be that the requests version is doing a DNS lookup for every request, while the http.client version is doing so once.

# http.client
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1974    0.541    0.000    0.541    0.000 {method 'recv_into' of '_socket.socket' objects}
  1000    0.020    0.000    0.045    0.000 feedparser.py:470(_parse_headers)
 13000    0.015    0.000    0.563    0.000 {method 'readline' of '_io.BufferedReader' objects}
...

# requests
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1481    0.827    0.001    0.827    0.001 {method 'recv_into' of '_socket.socket' objects}
  1000    0.377    0.000    0.382    0.000 {built-in method _socket.gethostbyname}
  1000    0.123    0.000    0.123    0.000 {built-in method _scproxy._get_proxy_settings}
  1000    0.111    0.000    0.111    0.000 {built-in method _scproxy._get_proxies}
 92000    0.068    0.000    0.284    0.000 _collections_abc.py:675(__iter__)
...
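A rough sketch of how comparable profiles can be gathered with cProfile (an illustration; the output above was not necessarily produced by this exact snippet):

import cProfile
import http.client

import requests


def run_http_client(n=1000):
    # Single reused connection, matching the question's http.client sample.
    conn = http.client.HTTPConnection("localhost", port=8000)
    for _ in range(n):
        conn.request("GET", "/")
        conn.getresponse().read()
    conn.close()


def run_requests(n=1000):
    # Single Session, matching the question's requests sample.
    with requests.Session() as session:
        for _ in range(n):
            session.get("http://localhost:8000")


# Sorting by tottime surfaces hot spots such as recv_into and gethostbyname.
cProfile.run("run_http_client()", sort="tottime")
cProfile.run("run_requests()", sort="tottime")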

You're providing the hostname to http.client.HTTPConnection() once, so it makes sense it would call gethostbyname once. requests.Session probably could cache hostname lookups, but it apparently does not.

EDIT: After some further research, it's not just a simple matter of caching. There's a function for determining whether to bypass proxies which ends up invoking gethostbyname regardless of the actual request itself.
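If the proxy-bypass check is indeed what's costing you time on your platform, one possible workaround (worth verifying against your requests version) is to tell the Session not to consult the environment for proxy settings at all:

import requests

with requests.Session() as session:
    # trust_env=False makes requests skip the per-request lookup of proxy
    # settings from the environment, which is the code path that appears
    # to trigger the extra gethostbyname calls in the profile above.
    session.trust_env = False
    for i in range(1000):
        r = session.get("http://localhost:8000")
        print(r.status_code)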

Answered by Jason S


Copy-pasting the response from @Lukasa posted in the python-requests repo:

The reason Requests is slower is because it does substantially more than httplib. httplib can be thought of as the bottom layer of the stack: it does the low-level wrangling of sockets. Requests is two layers further up, and adds things like cookies, connection pooling, additional settings, and all kinds of other fun things. This is necessarily going to slow things down. We simply have to compute a lot more than httplib does.

You can see this by looking at the cProfile results for Requests: there's just way more going on than there is for httplib. This is always to be expected with high-level libraries: they add more overhead because they have to do a lot more work.

While we can look at targeted performance improvements, the sheer height of the call stack in all cases is going to hurt our performance markedly. That means that the complaint that "requests is slower than httplib" is always going to be true: it's like complaining that "requests is slower than sending carefully crafted raw bytes down sockets." That's true, and it'll always be true: there's nothing we can do about that.
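To make the layering concrete, here is a rough sketch (assuming the same local server on port 8000) of fetching the same URL at each level of the stack: http.client at the bottom, urllib3 (which requests builds on) in the middle, and requests on top.

import http.client

import requests
import urllib3

# Bottom layer: low-level socket and HTTP wrangling.
conn = http.client.HTTPConnection("localhost", port=8000)
conn.request("GET", "/")
print("http.client:", conn.getresponse().status)
conn.close()

# Middle layer: urllib3 adds connection pooling, retries, and so on.
pool = urllib3.PoolManager()
print("urllib3:", pool.request("GET", "http://localhost:8000/").status)

# Top layer: requests adds sessions, cookies, auth, proxy handling, ...
print("requests:", requests.get("http://localhost:8000/").status_code)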

Answered by Pawel Miech