I was just wondering if there's any reasonable way to pass authentication cookies from a webdriver.Firefox() instance to the spider itself. It would be helpful to perform the login steps in webdriver and then go about scraping "business as usual". Something to the effect of:
    def __init__(self):
        BaseSpider.__init__(self)
        self.selenium = webdriver.Firefox()

    def __del__(self):
        self.selenium.quit()
        print(self.verificationErrors)

    def parse(self, response):
        # Initialize the webdriver, get login page
        sel = self.selenium
        sel.get(response.url)
        sleep(3)
        ##### Transfer (sel) cookies to (self) and crawl normally??? #####
        ...
        ...
Scraping File
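    import json
    from selenium import webdriver

    driver = webdriver.Firefox()
    # Load the site and log in here first; get_cookies() returns the cookies visible to the current page
    data = driver.get_cookies()
    # Write the cookies to a temp file (the with block closes the file on exit)
    with open('cookie.json', 'w') as outputfile:
        json.dump(data, outputfile)
    # End the browser session
    driver.quit()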
....
Spider
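    import json
    import os

    # Inside the spider (for example in __init__): load the saved cookies.
    # st_size > 2 skips a file that holds nothing more than an empty list "[]".
    if os.stat("cookie.json").st_size > 2:
        with open('./cookie.json', 'r') as inputfile:
            self.cookie = json.load(inputfile)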
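To round this off, here is a rough sketch of how the loaded cookies might be prepared for the spider's requests. The helper names load_selenium_cookies and to_scrapy_cookies are made up for this example and are not part of Selenium or Scrapy. It assumes cookie.json holds the list of dicts written by driver.get_cookies() above, and it keeps only the name, value, domain and path fields of each cookie.

```python
import json
import os


def load_selenium_cookies(path="cookie.json"):
    """Return the cookie list saved by the Selenium script, or [] if the file is missing or empty."""
    # A file of 2 bytes or fewer can hold at most an empty JSON list "[]".
    if not os.path.exists(path) or os.stat(path).st_size <= 2:
        return []
    with open(path, "r") as inputfile:
        return json.load(inputfile)


def to_scrapy_cookies(selenium_cookies):
    """Keep only the name, value, domain and path fields of each cookie dict."""
    keep = ("name", "value", "domain", "path")
    return [
        {key: cookie[key] for key in keep if key in cookie}
        for cookie in selenium_cookies
    ]
```

In the spider, the resulting list could then be passed as the cookies argument of scrapy.Request (for instance from start_requests), which accepts a list of dicts in this shape. After that first request Scrapy's cookie handling should carry the session along, so the rest of the crawl can go on as usual.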