I would like to get the same result as this command line: scrapy crawl linkedin_anonymous -a first=James -a last=Bond -o output.json
My script is as follows:
import scrapy
from linkedin_anonymous_spider import LinkedInAnonymousSpider
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

spider = LinkedInAnonymousSpider(None, "James", "Bond")
process = CrawlerProcess(get_project_settings())
process.crawl(spider)  ## <-------------- (1)
process.start()
I found out that process.crawl() at (1) creates another LinkedInAnonymousSpider in which first and last are None (as printed at (2)). If that is the case, there is no point in creating the spider object, so how is it possible to pass the arguments first and last to process.crawl()?
linkedin_anonymous:
from logging import INFO

import scrapy


class LinkedInAnonymousSpider(scrapy.Spider):
    name = "linkedin_anonymous"
    allowed_domains = ["linkedin.com"]
    start_urls = []
    base_url = "https://www.linkedin.com/pub/dir/?first=%s&last=%s&search=Search"

    def __init__(self, input=None, first=None, last=None):
        self.input = input  # source file name
        self.first = first
        self.last = last

    def start_requests(self):
        print self.first  ## <------------- (2)
        if self.first and self.last:  # taking input from command line parameters
            url = self.base_url % (self.first, self.last)
            yield self.make_requests_from_url(url)

    def parse(self, response):
        . . .
The spider will receive the arguments in its constructor. Scrapy sets all of the arguments as spider attributes, so you can skip the __init__ method completely. Be careful to use the getattr method for reading those attributes, so your code does not break when an argument is missing.
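For example, a minimal sketch of the same spider without an __init__, reading the arguments with getattr (the body is abridged from the question):

import scrapy


class LinkedInAnonymousSpider(scrapy.Spider):
    name = "linkedin_anonymous"
    allowed_domains = ["linkedin.com"]
    base_url = "https://www.linkedin.com/pub/dir/?first=%s&last=%s&search=Search"

    # No __init__ needed: Scrapy sets every -a argument (or keyword argument
    # passed to process.crawl) as an attribute on the spider instance.
    def start_requests(self):
        first = getattr(self, 'first', None)  # None when the argument is missing
        last = getattr(self, 'last', None)
        if first and last:
            yield scrapy.Request(self.base_url % (first, last))

    def parse(self, response):
        pass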
The key to running Scrapy from a Python script is the CrawlerProcess class. It lives in the scrapy.crawler module and provides the engine that runs Scrapy within a Python script; internally it is built on Python's Twisted framework.
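As a rough illustration of the pattern (the QuotesSpider and its URL below are placeholders, not part of the question):

import scrapy
from scrapy.crawler import CrawlerProcess


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ["http://quotes.toscrape.com"]

    def parse(self, response):
        yield {"title": response.css("title::text").extract_first()}


# CrawlerProcess hides the Twisted machinery: start() launches the reactor
# and stops it once every scheduled crawl has finished.
process = CrawlerProcess(settings={"LOG_LEVEL": "INFO"})
process.crawl(QuotesSpider)
process.start()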
Scrapy is an application framework for writing web spiders that crawl web sites and extract data from them. Scrapy provides a built-in mechanism for extracting data (called selectors) but you can easily use BeautifulSoup (or lxml) instead, if you feel more comfortable working with them.
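As a small sketch of the two options inside a spider's parse method (assuming BeautifulSoup is installed as bs4):

def parse(self, response):
    # Built-in Scrapy selectors (CSS here; XPath works the same way)
    title_css = response.css("title::text").extract_first()
    title_xpath = response.xpath("//title/text()").extract_first()

    # The same extraction with BeautifulSoup on the raw response body
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(response.body, "html.parser")
    title_bs = soup.title.string if soup.title else None

    yield {"css": title_css, "xpath": title_xpath, "beautifulsoup": title_bs}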
We use the CrawlerProcess class to run multiple Scrapy spiders simultaneously in one process. The CrawlerProcess instance is created with the project settings. If a spider needs custom settings of its own, we create a Crawler instance for that spider.
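A rough sketch of that pattern (the Crawler API has shifted a little between Scrapy versions, so treat this as an outline rather than a definitive recipe; the DOWNLOAD_DELAY override is just an example):

from scrapy.crawler import Crawler, CrawlerProcess
from scrapy.utils.project import get_project_settings

from linkedin_anonymous_spider import LinkedInAnonymousSpider

# Copy the project settings and override only what this spider needs.
custom_settings = get_project_settings().copy()
custom_settings.set("DOWNLOAD_DELAY", 2)

# A Crawler binds one spider class to its own settings object.
crawler = Crawler(LinkedInAnonymousSpider, custom_settings)

process = CrawlerProcess(get_project_settings())
process.crawl(crawler, first="James", last="Bond")  # crawl() also accepts a Crawler
process.start()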
Pass the spider arguments in the process.crawl method:
process.crawl(spider, input='inputargument', first='James', last='Bond')
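Applied to the script in the question, that looks roughly like this (passing the spider class rather than a pre-built instance, so Scrapy constructs it with the given arguments):

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

from linkedin_anonymous_spider import LinkedInAnonymousSpider

process = CrawlerProcess(get_project_settings())
# The keyword arguments are forwarded to the spider, so first and last
# are no longer None inside start_requests().
process.crawl(LinkedInAnonymousSpider, first='James', last='Bond')
process.start()

To reproduce the -o output.json part as well, configure the feed export in the settings passed to CrawlerProcess (FEED_FORMAT/FEED_URI in older Scrapy versions, FEEDS in newer ones).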
You can do it the easy way:
from scrapy import cmdline

cmdline.execute("scrapy crawl linkedin_anonymous -a first=James -a last=Bond -o output.json".split())