Python's requests library seems to be ~10x faster than C's libcurl (the C API, the CLI application, and the Python API) for a 1.6 MB request: requests takes 800 ms, while curl/libcurl sometimes takes as much as 7 seconds.
Why is this? How can I get curl in C to run as fast as requests in Python?
libcurl seems to receive its replies in 16 KB chunks, while requests seems to get the whole thing at once, but I'm not sure that's the cause. I tried curl_easy_setopt(curl_get, CURLOPT_BUFFERSIZE, 1<<19), but it only seems to be useful for making the buffer smaller.
I've looked at the source code for requests, and I think it uses urllib3 as its HTTP "backend"... but using urllib3 directly gives the same (disappointing) results as curl.
Here are some examples.
/*
gcc-8 test.c -o test -lcurl && t ./test
*/
#include <stdio.h>
#include <curl/curl.h>
int main(){
    CURLcode curl_st;
    curl_global_init(CURL_GLOBAL_ALL);
    CURL* curl_get = curl_easy_init();
    curl_easy_setopt(curl_get, CURLOPT_URL, "https://api.binance.com/api/v3/exchangeInfo");
    curl_easy_setopt(curl_get, CURLOPT_BUFFERSIZE, 1<<19);
    curl_st = curl_easy_perform(curl_get);
    if(curl_st != CURLE_OK) printf("\x1b[91mFAIL \x1b[37m%s\x1b[0m\n", curl_easy_strerror(curl_st));
    curl_easy_cleanup(curl_get);
    curl_global_cleanup();
}
'''FAST'''
import requests
reply = requests.get('https://api.binance.com/api/v3/exchangeInfo')
print(reply.text)
'''SLOW'''
import urllib3
pool = urllib3.PoolManager() # conn = pool.connection_from_url('https://api.binance.com/api/v3/exchangeInfo')
reply = pool.request('GET', 'https://api.binance.com/api/v3/exchangeInfo')
print(reply.data)
print(len(reply.data))
'''SLOW!'''
import urllib.request
with urllib.request.urlopen('https://api.binance.com/api/v3/exchangeInfo') as response:
    html = response.read()
'''SLOW!'''
import pycurl
from io import BytesIO
buf = BytesIO()
curl = pycurl.Curl()
curl.setopt(curl.URL, 'https://api.binance.com/api/v3/exchangeInfo')
curl.setopt(curl.WRITEDATA, buf)
curl.perform()
curl.close()
body = buf.getvalue() # Body is a byte string. We have to know the encoding in order to print it to a text file such as standard output.
print(body.decode('iso-8859-1'))
And the curl command-line tool is just as slow:
curl https://api.binance.com/api/v3/exchangeInfo
One way of speeding up the transfer of web content is to use HTTP compression. This works by compressing data on the fly before it is sent between the server and client so it takes less time to transmit.
Although HTTP compression is supported by libcurl, it is disabled by default. From the CURLOPT_ACCEPT_ENCODING documentation:
Set CURLOPT_ACCEPT_ENCODING to NULL to explicitly disable it, which makes libcurl not send an Accept-Encoding: header and not decompress received contents automatically.
The default value of this parameter is NULL, so unless you specifically enable HTTP compression, you won't get it.
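To turn it on, set CURLOPT_ACCEPT_ENCODING to an empty string, which tells libcurl to offer every encoding it was built with and to decompress the response transparently. Below is a minimal sketch, just the C program from the question with that one option added (URL and error handling unchanged):
/*
gcc test.c -o test -lcurl && ./test
*/
#include <stdio.h>
#include <curl/curl.h>
int main(){
    CURLcode curl_st;
    curl_global_init(CURL_GLOBAL_ALL);
    CURL* curl_get = curl_easy_init();
    curl_easy_setopt(curl_get, CURLOPT_URL, "https://api.binance.com/api/v3/exchangeInfo");
    /* "" = offer every encoding this libcurl build supports (gzip, deflate, ...)
       and decompress the received body automatically */
    curl_easy_setopt(curl_get, CURLOPT_ACCEPT_ENCODING, "");
    curl_st = curl_easy_perform(curl_get);
    if(curl_st != CURLE_OK) printf("\x1b[91mFAIL \x1b[37m%s\x1b[0m\n", curl_easy_strerror(curl_st));
    curl_easy_cleanup(curl_get);
    curl_global_cleanup();
}
The command-line equivalent is curl --compressed, and recent PycURL versions expose the same option as ACCEPT_ENCODING (curl.setopt(curl.ACCEPT_ENCODING, '')). requests, for its part, sends Accept-Encoding: gzip, deflate by default, which would explain the gap you're seeing on a 1.6 MB (highly compressible) JSON response.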