Scrapy: Login with Selenium webdriver, transfer cookies to spider object?

I was wondering whether there is a reasonable way to pass authentication cookies from a webdriver.Firefox() instance to the spider itself. It would be helpful to do the login steps in webdriver and then go about scraping "business as usual". Something to the effect of:

def __init__(self):
    BaseSpider.__init__(self)
    self.selenium = webdriver.Firefox()

def __del__(self):
    self.selenium.quit()
    print self.verificationErrors

def parse(self, response):

    # Initialize the webdriver, get login page
    sel = self.selenium
    sel.get(response.url)
    sleep(3)

    ##### Transfer (sel) cookies to (self) and crawl normally??? #####
    ...
    ...
dru asked Nov 04 '22 05:11



1 Answer

Transfer Cookies from Selenium to Scrapy Spider

Scraping File

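import json
from selenium import webdriver

driver = webdriver.Firefox()
# Load the login page and sign in here, so that the session cookies exist
data = driver.get_cookies()  # list of dicts, one per cookie

# Write the cookies to a temporary file for the spider to read
with open('cookie.json', 'w') as outputfile:
    json.dump(data, outputfile)

# The with block has already closed the file; now end the browser session
driver.quit()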

....

Spider

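import json
import os

# Inside a spider method (for example __init__ or start_requests).
# A size of 2 bytes or less means the file holds only an empty list ("[]").
if os.stat("cookie.json").st_size > 2:
    with open('cookie.json', 'r') as inputfile:
        self.cookie = json.load(inputfile)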
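
The snippet above loads the cookies but does not show how they reach the requests. One way to bridge that gap, as a rough sketch rather than a definitive implementation: Selenium's get_cookies() returns a list of dicts with "name" and "value" keys (plus domain, path and so on), and it can be collapsed into a plain name-to-value mapping. The helper names selenium_cookies_to_dict and load_cookies are made up for illustration and are not part of Selenium or Scrapy.

```python
import json
import os


def selenium_cookies_to_dict(cookies):
    """Collapse Selenium's list of cookie dicts into a {name: value} mapping."""
    # Hypothetical helper: keeps only name and value, dropping domain/path/expiry.
    return {cookie["name"]: cookie["value"] for cookie in cookies}


def load_cookies(path="cookie.json"):
    """Read cookies saved with json.dump(driver.get_cookies(), f).

    Returns an empty dict if the file does not exist.
    """
    if not os.path.exists(path):
        return {}
    with open(path) as inputfile:
        return selenium_cookies_to_dict(json.load(inputfile))
```

In the spider, the resulting dict could then be passed through the cookies argument of scrapy.Request, for example scrapy.Request(url, cookies=load_cookies(), callback=self.parse). Note that this flattening drops the domain and path information, which is usually fine when the spider only crawls the site that was logged into.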
George answered Nov 09 '22 16:11
