Weird problem here. I have a Python 3 script that runs 24/7 and uses Selenium with Firefox to open a web page and, every 5 minutes, download a file from a download link. (I can't simply fetch the file with urllib or similar: although the link address stays constant, the data in the file changes every time the page is reloaded and also depends on the criteria specified.) The script runs fine almost all the time, but I can't get rid of one error that pops up every once in a while and terminates the script. Here's the error:
Traceback (most recent call last):
  File "/Users/Shared/ROTH_1/Folio/get_F_notes.py", line 248, in <module>
    driver.get(search_url)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 187, in get
    self.execute(Command.GET, {'url': url})
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 173, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 349, in execute
    return self._request(command_info[0], url, body=data)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 379, in _request
    self._conn.request(method, parsed_url.path, body, headers)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1090, in request
    self._send_request(method, url, body, headers)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1118, in _send_request
    self.putrequest(method, url, **skips)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 966, in putrequest
    raise CannotSendRequest(self.__state)
http.client.CannotSendRequest: Request-sent
And here is the part of my script where the problem comes in. Specifically, the script hits the "except ConnectionRefusedError:" branch and, as intended, prints "WARNING 1 : ConnectionRefusedError: search page did not load; now trying again". I think the error above is then raised when the loop begins again and calls "driver.get(search_url)" a second time; the script chokes at that point.
I have researched this quite a bit, and it seems possible that the script is trying to reuse the same connection from the first attempt. The fix seems to be to create a new connection, but that is all I have been able to gather, and I have no idea how to create a new connection with Selenium. Do you? Or is there some other issue here?
search_url = 'https://www.example.com/download_page'
loop_get_search_page = 1
while loop_get_search_page < 7:
    if loop_get_search_page == 6:
        print('WARNING: tried to load search page 5 times; exiting script to try again later')
        ##### log out
        try:
            driver.find_element_by_link_text('Sign Out')
        except NoSuchElementException:
            print('WARNING: NoSuchElementException: Unable to find the link text for the "Sign Out" button')
        driver.quit()
        raise SystemExit
    try:
        driver.get(search_url)
    except TimeoutException:
        print('WARNING ', loop_get_search_page, ': TimeoutException: search page did not load; now trying again', sep='')
        loop_get_search_page += 1
        continue
    except ConnectionRefusedError:
        print('WARNING ', loop_get_search_page, ': ConnectionRefusedError: search page did not load; now trying again')
        loop_get_search_page += 1
        continue
    else:
        break
I just ran into this problem myself. In my case, I had another thread running on the side that was also making requests via WebDriver. It turns out WebDriver is not thread-safe.
See the discussion at "Can Selenium use multi threading in one browser?" and the links there for more context.
When I removed the other thread, everything started working as expected.
Is it possible that your script runs each 5-minute download in a new thread?
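If you can't get rid of the second thread, one thing you could try is to make the threads take turns. This is only a rough sketch, assuming the error really does come from two threads using the driver at the same time. `SharedDriver` and `exclusive` are names I made up for illustration; they are not part of Selenium, and `driver` is whatever WebDriver instance your script already has.

```python
import threading
from contextlib import contextmanager


class SharedDriver:
    """Hypothetical wrapper that lets only one thread at a time use a driver."""

    def __init__(self, driver):
        self._driver = driver
        self._lock = threading.Lock()

    @contextmanager
    def exclusive(self):
        # Hold the lock for the whole block, so a multi-step sequence
        # (load page, find link, click) is not interleaved with another
        # thread's commands on the same driver.
        with self._lock:
            yield self._driver
```

Each thread would then do its work inside `with shared.exclusive() as d:` and only touch `d` there. This does not make WebDriver thread-safe; it just keeps the threads from issuing commands at the same moment.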
The only way I know of to "create a new connection" is to launch a new WebDriver instance. That can get slow if you're making a lot of requests, but since you only do something every 5 minutes, it shouldn't really affect your throughput. As long as you always clean up your WebDriver instance when your download is done, this might be a good option for you.
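Here is roughly what I mean, as a sketch rather than a drop-in fix. The function name `get_with_fresh_driver` and the `make_driver` parameter are my own inventions: `make_driver` is assumed to be any zero-argument callable that returns a new driver. Every attempt gets a brand-new driver, and a driver that failed is quit before the next attempt, so no half-finished connection is reused.

```python
import http.client


def get_with_fresh_driver(make_driver, url, attempts=5,
                          retry_on=(ConnectionRefusedError,
                                    http.client.CannotSendRequest)):
    """Load `url`, starting a new driver for every attempt.

    Returns the driver that loaded the page; the caller must call
    quit() on it once the download is finished.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        driver = make_driver()  # new browser, new connection
        try:
            driver.get(url)
        except retry_on as exc:
            print('WARNING {}: {}: page did not load; retrying with a new driver'
                  .format(attempt, type(exc).__name__))
            last_error = exc
            driver.quit()  # throw away the driver that failed
        except Exception:
            driver.quit()  # don't leak a browser on unexpected errors
            raise
        else:
            return driver
    raise RuntimeError(
        'page did not load after {} attempts'.format(attempts)) from last_error
```

With Selenium installed, you would probably pass `webdriver.Firefox` as `make_driver` and add `selenium.common.exceptions.TimeoutException` to `retry_on`. Keep in mind that a new browser starts without your logged-in session, so any sign-in steps would have to run again after each new driver is created.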