I can run a spider in Scrapy with a simple command:
scrapy crawl custom_spider -a input_val=5 -a input_val2=6
where input_val and input_val2 are the values I'm passing to the spider. This works fine.
However, when I schedule the spider with scrapyd by running
curl http://localhost:6800/schedule.json -d project=crawler -d input_val=5 -d input_val2=6 -d spider=custom_spider
it throws an error:
spider = cls(*args, **kwargs)
exceptions.TypeError: __init__() got an unexpected keyword argument '_job'
How do I get this to work?
Edit: this is inside my initializer:
def __init__(self, input_val=None, input_val2=None, *args, **kwargs):
    self.input_val = input_val
    self.input_val2 = input_val2
    super(CustomSpider, self).__init__(*args, **kwargs)
Be sure to support arbitrary keyword arguments in your spider and call the parent __init__ with super(), as shown in the docs for spider arguments:
import scrapy

class MySpider(scrapy.Spider):
    name = 'myspider'

    def __init__(self, category=None, *args, **kwargs):
        super(MySpider, self).__init__(*args, **kwargs)  # <- important
        self.category = category
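Scrapyd supplies the job ID as a _job keyword argument passed to the spider (see code here), so an __init__ that does not accept **kwargs fails with the TypeError shown in the question.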
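To see the difference without a running scrapyd instance, here is a minimal sketch. It assumes a stand-in base class, BaseSpider (hypothetical, not part of Scrapy), that keeps extra keyword arguments as attributes the way scrapy.Spider does. StrictSpider and the job ID value are also made up for illustration.

```python
class BaseSpider:
    """Stand-in for scrapy.Spider (hypothetical) so the sketch runs without Scrapy."""

    def __init__(self, name=None, **kwargs):
        if name is not None:
            self.name = name
        # Like scrapy.Spider, keep any extra keyword arguments as attributes.
        self.__dict__.update(kwargs)


class StrictSpider(BaseSpider):
    name = "strict_spider"

    # No **kwargs: anything other than the two named arguments is rejected.
    def __init__(self, input_val=None, input_val2=None):
        super().__init__()
        self.input_val = input_val
        self.input_val2 = input_val2


class CustomSpider(BaseSpider):
    name = "custom_spider"

    # Accepts and forwards unknown keyword arguments such as _job.
    def __init__(self, input_val=None, input_val2=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.input_val = input_val
        self.input_val2 = input_val2


# Roughly what a scheduled run passes: the user's arguments plus _job.
# The job ID below is an invented placeholder.
scheduled_kwargs = {"input_val": "5", "input_val2": "6", "_job": "example-job-id"}

try:
    StrictSpider(**scheduled_kwargs)
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)

spider = CustomSpider(**scheduled_kwargs)
print(strict_error)
print(spider.input_val, spider.input_val2, spider._job)
```

The strict version fails on _job, while the version that forwards **kwargs to the parent initializer accepts it. Note that spider arguments arrive as strings, so convert them (for example with int()) if the spider needs numbers.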