I'm web scraping a site, and sometimes when running the script I get this error:
ReadTimeout: HTTPSConnectionPool(host='...', port=443): Read timed out. (read timeout=10)
My code:
import requests
from time import sleep
from bs4 import BeautifulSoup

url = 'mysite.com'
all_links_page = []

# getHeaders() is my own helper that returns the request headers
page_one = requests.get(url, headers=getHeaders(), timeout=10)
sleep(2)

if page_one.status_code == requests.codes.ok:
    soup_one = BeautifulSoup(page_one.content.decode('utf-8'), 'lxml')
    page_links_one = soup_one.select("ul.product_list")
    for links_one in page_links_one:
        for li in links_one.select("li"):
            all_links_page.append(li.a.get("href").strip())
The answers I found were not satisfactory.
Increasing the timeout helped me; I set it to 120 seconds straight away. It turned out that the response from the server arrives within about 40 seconds.
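A minimal sketch of that fix, reusing url and getHeaders() from the question:

page_one = requests.get(url, headers=getHeaders(), timeout=120)  # generous timeout; this server answers in ~40 s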
Why do you have the timeout parameter in there? I would just eliminate it. The reason you get that error is that you set timeout=10, which says: if you don't receive a response from the server within 10 seconds, raise an error. So it's not necessarily the server calling you out. If no timeout is specified explicitly, requests does not time out (at least on your end).
page_one = requests.get(url, headers=headers)  # <-- don't use the timeout parameter
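Note that removing the timeout means the call can hang indefinitely if the server never responds. An alternative sketch keeps the timeout but retries on failure (the retry count of 3 and the 60-second timeout are arbitrary choices, and getHeaders() is the asker's helper):

import requests

page_one = None
for attempt in range(3):  # arbitrary retry budget
    try:
        page_one = requests.get(url, headers=getHeaders(), timeout=60)
        break
    except requests.exceptions.ReadTimeout:
        # server sent no data within 60 seconds; try again
        continue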
This exception might occur due to a timeout or to the available socket buffer memory:
import socket
from urllib3.connection import HTTPConnection

HTTPConnection.default_socket_options = (
    HTTPConnection.default_socket_options + [
        (socket.SOL_SOCKET, socket.SO_SNDBUF, 1000000),  # 1 MB in bytes
        (socket.SOL_SOCKET, socket.SO_RCVBUF, 1000000)
    ])
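Since requests uses urllib3 under the hood, running this snippet before the first requests.get() call should make new connections pick up the larger socket buffers; whether that resolves the ReadTimeout depends on whether the bottleneck really is buffering rather than a slow server.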