Pickling Selenium Webdriver Objects

I want to serialize and store a Selenium webdriver object so that I can use it later elsewhere in my code, and I'm trying to use pickle to do this. If there is another way to save the state of a webdriver object so I can bring it up again later, that would be great. (I can't just reload the URL, since the websites I am looking at are JavaScript-heavy and the current page depends on what I've clicked so far.)

Currently, I have code like this.

import pickle
from selenium import webdriver

d = webdriver.PhantomJS()
p = pickle.dumps(d, pickle.HIGHEST_PROTOCOL)
# Stuff happens here.
new_driver = pickle.loads(p)
print new_driver.page_source.encode('utf-8', 'ignore')

When I run this, I get the following error (the error occurs when I print, not before):

    return self.driver.page_source.encode('utf-8', 'ignore')
File "/home/eric/dev/crawler-env/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 436, in page_source
    return self.execute(Command.GET_PAGE_SOURCE)['value']
File "/home/eric/dev/crawler-env/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 163, in execute
    response = self.command_executor.execute(driver_command, params)
File "/home/eric/dev/crawler-env/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 349, in execute
    return self._request(url, method=command_info[0], data=data)
File "/home/eric/dev/crawler-env/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 396, in _request
    response = opener.open(request)
File "/usr/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 111] Connection refused>

Is it possible to serialize my webdriver objects? If not, what are my alternatives?


Upon further inspection, even if I do something like d.get(url) again instead of printing the page source, it gives me the same error. Does something happen to the webdriver object when it is pickled/unpickled?

I was able to pickle a selenium.webdriver.Remote object. Neither dill nor pickle could serialize a selenium.webdriver.Chrome object, where Python itself creates and runs the browser process. However, both worked when I (1) ran the standalone Java Selenium 2 server, (2) in one process, created a selenium.webdriver.Remote connection to that server and pickled/dilled it to a file, and (3) in another process, deserialized the Remote instance and used it.

This let me close the Python process, then reconnect to the existing webdriver browser and issue new commands (possibly from a different Python script). If I close the Selenium browser, a new instance has to be created from scratch.
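
One way to picture why this can work, assuming the Remote object behaves roughly like a handle made of the server URL and a session id (an assumption about its internals, not something verified here): pickling copies the handle, while the browser keeps running under the standalone server. The dictionary, its values, and the `command_url` helper below are hypothetical stand-ins, not Selenium objects.

```python
import pickle

# Hypothetical handle: plain data describing where the session lives.
handle = {
    'executor_url': 'http://127.0.0.1:4444/wd/hub',  # made-up server URL
    'session_id': 'abc123',                          # made-up session id
}


def command_url(h, command):
    # A command is addressed by server URL plus session id, so any process
    # holding these two values would target the same browser session.
    return '%s/session/%s/%s' % (h['executor_url'], h['session_id'], command)


# Round-trip the handle through pickle, as the two scripts do via a file.
restored = pickle.loads(pickle.dumps(handle))
same_target = command_url(restored, 'source') == command_url(handle, 'source')
print(same_target)
```

A driver that owns a local browser process has no such plain-data form, which fits the observation that the Chrome object could not be serialized.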


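# Script 1: create the session and pickle the driver.
import pickle
import selenium.webdriver

FILENAME = '/tmp/pickle'
# Assumption: address of the standalone Selenium server; adjust to your setup.
EXECUTOR = 'http://127.0.0.1:4444/wd/hub'

# Connect to the standalone server instead of launching Chrome directly.
opt = selenium.webdriver.chrome.options.Options()
capabilities = opt.to_capabilities()
driver = selenium.webdriver.Remote(command_executor=EXECUTOR, desired_capabilities=capabilities)

# Save the Remote object so another process can reuse the session.
with open(FILENAME, 'wb') as fp:
    pickle.dump(driver, fp)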


# Script 2: load the pickled driver and reuse the existing browser session.
import pickle

FILENAME = '/tmp/pickle'

with open(FILENAME, 'rb') as fp:
    driver = pickle.load(fp)
el = driver.find_element_by_id('lst-ib')
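
Since a closed browser means starting from scratch, the second script may want a fallback. A minimal sketch, assuming hypothetical `create` and `is_alive` callables that you supply yourself (for a real driver, `is_alive` might try a cheap command such as reading `current_url` and return False on an exception). The demo uses a plain dictionary in place of a driver.

```python
import os
import pickle
import tempfile


def load_or_create(filename, create, is_alive):
    """Return a saved object if it is still usable, else build and save a new one."""
    if os.path.exists(filename):
        try:
            with open(filename, 'rb') as fp:
                obj = pickle.load(fp)
            if is_alive(obj):
                return obj
        except (pickle.UnpicklingError, EOFError):
            pass  # unreadable file: fall through and recreate
    obj = create()
    with open(filename, 'wb') as fp:
        pickle.dump(obj, fp)
    return obj


# Demo with stand-in values instead of a real driver.
path = os.path.join(tempfile.mkdtemp(), 'handle.pickle')
created = []


def fake_create():
    created.append(1)
    return {'session_id': 'abc123'}  # made-up value


first = load_or_create(path, fake_create, lambda obj: True)   # creates and saves
second = load_or_create(path, fake_create, lambda obj: True)  # reuses the file
third = load_or_create(path, fake_create, lambda obj: False)  # "browser closed"
print(len(created))
```

The liveness check matters because unpickling can succeed even when the session behind the saved object is gone.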

Note (2020-08-08): Pickling Selenium this way stopped working in the latest Selenium (4.x): pickle fails on an internal socket object. One option, which still works for me, is to pin the older version by adding 'selenium==3.141.0' to install_requires in setup.py.
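
To fail early rather than hit the pickling error, a script could check the installed version before trying. A small sketch, assuming the cutoff is simply "major version below 4" as the note describes; `pickling_supported` is a hypothetical helper, and in a real script you would pass it `selenium.__version__`.

```python
def pickling_supported(version_string):
    """True if the major version number is below 4 (the 3.x line from the note)."""
    major = int(version_string.split('.')[0])
    return major < 4


print(pickling_supported('3.141.0'))
print(pickling_supported('4.0.0'))
```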

