ReadTimeout: HTTPSConnectionPool(host='', port=443): Read timed out. (read timeout=10)

I'm scraping a website, and sometimes when I run the script I get this error:

ReadTimeout: HTTPSConnectionPool(host='...', port=443): Read timed out. (read timeout=10)

My code:

import requests
from bs4 import BeautifulSoup
from time import sleep

url = 'mysite.com'
all_links_page = []
# getHeaders() is defined elsewhere in the script (not shown)
page_one = requests.get(url, headers=getHeaders(), timeout=10)
sleep(2)
if page_one.status_code == requests.codes.ok:
    soup_one = BeautifulSoup(page_one.content.decode('utf-8'), 'lxml')
    page_links_one = soup_one.select("ul.product_list")

    # collect the href of the first link inside every list item
    for links_one in page_links_one:
        for li in links_one.select("li"):
            all_links_page.append(li.a.get("href").strip())

The answers I found were not satisfactory.

JB_ asked Sep 18 '19 14:09

3 Answers

Increasing the timeout helped me: I set it straight to 120 seconds. It turned out that the server's response arrives within 40 seconds.
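
To find out how long the server really takes before settling on a value, something like the following might help. It is only a minimal sketch: `timed_get` is a hypothetical helper name, and the 5-second connect timeout is an assumption, not something from the question. It relies on two documented `requests` features: the `timeout` argument accepts a `(connect, read)` tuple, and `response.elapsed` records the time between sending the request and finishing parsing the response headers.

```python
import requests


def timed_get(url, connect_timeout=5.0, read_timeout=120.0, **kwargs):
    """GET `url` with a generous read timeout and report the server's delay.

    Hypothetical helper: the default values are assumptions for illustration.
    """
    # A (connect, read) tuple lets a slow server have a long read timeout
    # without also waiting that long just to open the connection.
    response = requests.get(
        url, timeout=(connect_timeout, read_timeout), **kwargs
    )
    # elapsed covers sending the request up to parsing the response headers;
    # it does not include the time spent downloading a large body.
    return response, response.elapsed.total_seconds()
```

Calling `response, seconds = timed_get(url, headers=getHeaders())` and printing `seconds` over a few runs should show how much headroom the timeout actually needs.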

Vadim answered Oct 06 '22 13:10


Why do you have the timeout parameter in there? I would just remove it. You get that error because you set the timeout to 10, which says: if no response arrives from the server within 10 seconds, raise an error. So it's not necessarily the server cutting you off. If no timeout is specified explicitly, requests does not time out (at least on your end).

page_one = requests.get(url, headers=headers)  #< --- don't use the timeout parameter
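
To see that behavior in isolation, here is a minimal sketch that does not touch any real site. Everything in it is made up for the demonstration: `SlowHandler` is a throwaway local server that waits one second before answering, and `try_get` and `run_demo` are hypothetical helper names. The same URL is requested twice, once with a timeout shorter than the server's delay and once with a longer one.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Throwaway local server: waits DELAY seconds before answering."""

    DELAY = 1.0  # seconds; an arbitrary value for the demonstration

    def do_GET(self):
        time.sleep(self.DELAY)
        body = b"done"
        try:
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            pass  # the client already gave up and closed the connection

    def log_message(self, *args):
        pass  # keep the demo output quiet


def try_get(url, timeout):
    """Return the body text, or the string 'ReadTimeout' if the read timed out."""
    session = requests.Session()
    session.trust_env = False  # ignore proxy environment variables
    try:
        return session.get(url, timeout=timeout).text
    except requests.exceptions.ReadTimeout:
        return "ReadTimeout"
    finally:
        session.close()


def run_demo():
    """Request the slow server with a too-short and a long-enough timeout."""
    server = HTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        too_short = try_get(url, timeout=0.3)  # shorter than the server delay
        long_enough = try_get(url, timeout=5)  # longer than the server delay
    finally:
        server.shutdown()
        server.server_close()
    return too_short, long_enough
```

Running `print(run_demo())` should show the first request ending in a read timeout and the second returning the body: the server behaves identically both times, and only the client-side limit differs.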

chitown88 answered Oct 06 '22 13:10


This exception might occur because of the timeout or because of the available memory:

  • The response from the server takes longer than the specified timeout. To solve this, set a higher timeout.
  • The file you are trying to read is large and the socket buffer is not big enough to handle it. In that case, you can try increasing the buffer size based on your machine's capacity.
        import socket
        from urllib3.connection import HTTPConnection

        # append larger send/receive buffer sizes to urllib3's default socket options
        HTTPConnection.default_socket_options = (
            HTTPConnection.default_socket_options + [
                (socket.SOL_SOCKET, socket.SO_SNDBUF, 1000000),  # 1 MB, in bytes
                (socket.SOL_SOCKET, socket.SO_RCVBUF, 1000000),  # 1 MB, in bytes
            ]
        )

Zeina answered Oct 06 '22 14:10