
How to pass multiple arguments to Scrapy spider (getting error running 'scrapy crawl' with more than one spider is no longer supported)?

Tags:

python

scrapy

I would like to pass multiple user-defined arguments to my Scrapy spider, so I tried to follow this post: How to pass a user defined argument in scrapy spider

However, when I follow the advice there I get an error:

root@ scrapy crawl dmoz -a address= 40-18 48th st -a borough=4
Usage
=====
  scrapy crawl [options] <spider>

crawl: error: running 'scrapy crawl' with more than one spider is no longer supported

I also tried with various permutations of quotation marks:

root@ scrapy crawl dmoz -a address= "40-18 48th st" -a borough="4"
Usage
=====
  scrapy crawl [options] <spider>
crawl: error: running 'scrapy crawl' with more than one spider is no longer supported

What is the correct way to pass parameters to the Scrapy spider? I would like to pass a username and password for the spider's login/scraping process. Thanks for any suggestions.

asked Jun 23 '15 by sunny


People also ask

How do you run multiple spiders in a Scrapy?

Use the CrawlerProcess class to run multiple Scrapy spiders in the same process simultaneously. Create an instance of CrawlerProcess with the project settings, then call crawl() once per spider; if a spider needs its own custom settings, create a Crawler instance for that spider instead.
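
A minimal sketch of that pattern, assuming two throwaway spiders (Spider1, Spider2, and the example URLs are placeholders, not from the question):

import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

class Spider1(scrapy.Spider):
    name = "spider1"
    start_urls = ["https://example.com"]

    def parse(self, response):
        yield {"title": response.css("title::text").get()}

class Spider2(scrapy.Spider):
    name = "spider2"
    start_urls = ["https://example.org"]

    def parse(self, response):
        yield {"title": response.css("title::text").get()}

# get_project_settings() loads settings.py when run inside a project.
process = CrawlerProcess(get_project_settings())
process.crawl(Spider1)
process.crawl(Spider2)
process.start()  # blocks until both spiders have finished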

How are arguments passed in Scrapy?

The spider receives arguments in its constructor. Scrapy also sets all passed arguments as spider attributes, so you can skip the __init__ method entirely. Be sure to use getattr to read those attributes, so your code does not break when an argument is missing.
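
For instance, a short sketch of a spider reading a -a argument via getattr (the spider name and the category argument are hypothetical):

import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ["https://quotes.toscrape.com"]

    def parse(self, response):
        # Scrapy set any -a arguments as instance attributes;
        # getattr with a default avoids AttributeError when the
        # argument was not supplied on the command line.
        category = getattr(self, "category", "default")
        self.logger.info("Scraping category: %s", category)
        yield {"category": category}

Run it as: scrapy crawl quotes -a category=books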

Does Scrapy use LXML?

Scrapy uses the lxml library under the hood and implements an easy API on top of the lxml API. This means Scrapy selectors are very similar in speed and parsing accuracy to lxml itself.
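
A quick sketch of that selector API over an inline HTML string:

from scrapy.selector import Selector

html = "<html><body><h1>Hello</h1><p class='intro'>Hi there</p></body></html>"
sel = Selector(text=html)

# Both CSS and XPath queries are executed by lxml under the hood.
print(sel.css("h1::text").get())                      # Hello
print(sel.xpath("//p[@class='intro']/text()").get())  # Hi there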

What is CrawlerProcess?

CrawlerProcess is a class that starts a Twisted reactor for you, configures logging, and sets shutdown handlers; it is the class used by all Scrapy commands. Here's an example, following the Scrapy docs, showing how to run a single spider with it:

import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider(scrapy.Spider):
    # Your spider definition goes here.
    ...

process = CrawlerProcess()
process.crawl(MySpider)
process.start()  # the script will block here until the crawl finishes


1 Answer

This isn't a Scrapy problem; it's how your shell interprets the input, splitting tokens on spaces. The space after address= makes the shell pass 40-18, 48th, and st as separate positional arguments, which scrapy crawl then treats as extra spider names, hence the error. There must be no space between the key and its value. Try:

scrapy crawl dmoz -a address="40-18 48th st" -a borough="4"
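
For completeness, a sketch of how the spider can pick those arguments up, including the username/password case from the question (the login URL and form field names are hypothetical):

import scrapy

class DmozSpider(scrapy.Spider):
    name = "dmoz"

    def __init__(self, address=None, borough=None,
                 username=None, password=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Values from -a key=value land here (they are also set
        # as attributes automatically if you skip __init__).
        self.address = address
        self.borough = borough
        self.username = username
        self.password = password

    def start_requests(self):
        # Hypothetical login endpoint; substitute the real one.
        yield scrapy.FormRequest(
            "https://example.com/login",
            formdata={"user": self.username or "",
                      "pass": self.password or ""},
            callback=self.after_login,
        )

    def after_login(self, response):
        self.logger.info("Logged in; scraping %s, borough %s",
                         self.address, self.borough)

Invoked, for example, as:

scrapy crawl dmoz -a address="40-18 48th st" -a borough=4 -a username=me -a password=secret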
answered Nov 14 '22 by Birei