Python Requests taking a long time

Question

I am working on a Python project that downloads and indexes files from the SEC EDGAR database. The problem is that, when using the requests module, it takes a very long time to save the text in a variable (roughly 130 to 170 seconds for one file).

The file has roughly 16 million characters, and I wanted to see if there is an easy way to reduce the time it takes to retrieve the text. Example:

import requests

url = "https://www.sec.gov/Archives/edgar/data/0001652044/000165204417000008/goog10-kq42016.htm"

r = requests.get(url, stream=True)

print(r.text)

Thanks!

AlainChiasson · Accepted Answer

What I found is in the code for r.text, specifically the case where no encoding is given (r.encoding is None). Detecting the encoding took 20 seconds; I was able to skip that step by defining the encoding myself.

...
r.encoding = 'utf-8' 
...
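The fix above can be wrapped in a small helper. This is only a sketch: `ensure_encoding` and `fetch_text` are hypothetical names (not part of requests), and the UTF-8 fallback and the 30-second timeout are assumptions you should check against your own data.

```python
def ensure_encoding(response, fallback="utf-8"):
    """Set response.encoding to `fallback` only when none was declared.

    Works on any object with an `encoding` attribute, such as a
    requests.Response. The UTF-8 default is an assumption.
    """
    if response.encoding is None:
        response.encoding = fallback
    return response


def fetch_text(url, fallback="utf-8", timeout=30):
    """Download `url` and decode it without triggering charset detection."""
    import requests  # imported lazily so ensure_encoding has no dependencies

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    # With an encoding set, response.text decodes directly.
    return ensure_encoding(response, fallback).text
```

If you already know the encoding, decoding the raw bytes yourself with r.content.decode("utf-8", errors="replace") is another way to bypass r.text entirely.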

Additional details

In my case, the response did not declare an encoding. The response was 256k in size, and r.apparent_encoding took 20 seconds.

Looking at the text property: it checks whether an encoding is set. If it is None, it calls the apparent_encoding property, which scans the content to autodetect the encoding scheme.

On a long string this takes time. By defining the encoding of the response (as described above), you skip the detection.
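To make that decision concrete, here is a simplified model of it. It is not the actual requests source: `decode_body` and `counting_detector` are hypothetical names, and the detector is a stand-in that just records when it is called.

```python
def decode_body(raw, declared_encoding, detect):
    """Simplified model: only run the detector when no encoding is declared."""
    if not raw:
        return ""
    encoding = declared_encoding
    if encoding is None:
        # Slow path: the detector has to scan the bytes.
        encoding = detect(raw)
    return raw.decode(encoding, errors="replace")


detector_calls = []


def counting_detector(raw):
    """Stand-in for a real charset detector; records the size of each input."""
    detector_calls.append(len(raw))
    return "utf-8"


body = b"annual report"
without_encoding = decode_body(body, None, counting_detector)   # detector runs
with_encoding = decode_body(body, "utf-8", counting_detector)   # detector skipped
print(without_encoding == with_encoding, len(detector_calls))
```

Both calls return the same text, but only the first one pays for detection, which is the cost that grows with the size of the response.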

Validate that this is your issue

Using your example above:

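from datetime import datetime
import requests

url = "https://www.sec.gov/Archives/edgar/data/0001652044/000165204417000008/goog10-kq42016.htm"

r = requests.get(url, stream=True)

# Encoding taken from the response headers; None means r.text has to detect it
print(r.encoding)

# Time the detection on its own
print(datetime.now())
enc = r.apparent_encoding
print(enc)

# If r.encoding is None, r.text runs the detection again here
print(datetime.now())
print(r.text)
print(datetime.now())

# With the encoding set explicitly, r.text skips the detection
r.encoding = enc
print(r.text)
print(datetime.now())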

Of course, the timestamps may get lost among the printed output, so I recommend running the above in an interactive shell. It may become apparent where you are losing the time even without printing datetime.now().
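If you would rather not hunt for timestamps between pages of printed text, a small timing helper keeps the measurements separate from the output. This is a sketch; `timed` is a hypothetical name, and the commented usage assumes `r` is the response from the example above.

```python
import time


def timed(label, func):
    """Call func(), print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f} s")
    return result, elapsed


# Usage with a response object `r` (not executed here):
# enc, _ = timed("apparent_encoding", lambda: r.apparent_encoding)
# text, _ = timed("text, encoding unset", lambda: r.text)
# r.encoding = enc
# text, _ = timed("text, encoding set", lambda: r.text)
```

time.perf_counter() is used instead of datetime.now() because it is a monotonic clock intended for measuring short durations.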

Tags:

python-requests

Jake Schurch
