I'm trying to deploy a crawler with four spiders. One of the spiders uses XMLFeedSpider and runs fine from both the shell and scrapyd, but the others use BaseSpider and, while they run fine from the shell, they all give this error when run in scrapyd:
TypeError: __init__() got an unexpected keyword argument '_job'
From what I've read, this points to a problem with the __init__ function in my spiders, but I cannot seem to solve it. I don't need an __init__ function, and if I remove it completely I still get the error!
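For context: scrapyd passes an extra _job keyword argument when it schedules a spider run, so an __init__ whose signature cannot absorb unknown keywords fails in exactly this way. A minimal plain-Python illustration (strict_init and tolerant_init are hypothetical names for this sketch, not Scrapy APIs):

    # Stand-ins for a spider's __init__ signature.
    def strict_init(domain_name):
        pass

    def tolerant_init(domain_name, **kwargs):
        pass

    tolerant_init(domain_name="example.com", _job="123")  # fine: _job lands in kwargs
    strict_init(domain_name="example.com", _job="123")
    # TypeError: strict_init() got an unexpected keyword argument '_job'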
My spider looks like this:
    from scrapy import log
    from scrapy.spider import BaseSpider
    from scrapy.selector import XmlXPathSelector

    from betfeeds_master.items import Odds

    # Parameters
    MYGLOBAL = 39

    class homeSpider(BaseSpider):
        name = "home"
        #con = None

        allowed_domains = ["www.myhome.com"]
        start_urls = [
            "http://www.myhome.com/oddxml.aspx?lang=en&subscriber=mysubscriber",
        ]

        def parse(self, response):
            items = []
            traceCompetition = ""

            xxs = XmlXPathSelector(response)
            oddsobjects = xxs.select("//OO[OddsType='3W' and Sport='Football']")

            for oddsobject in oddsobjects:
                item = Odds()
                item['competition'] = ''.join(oddsobject.select('Tournament/text()').extract())
                if traceCompetition != item['competition']:
                    log.msg('Processing %s' % (item['competition']))  # print item['competition']
                    traceCompetition = item['competition']

                item['matchDate'] = ''.join(oddsobject.select('Date/text()').extract())
                item['homeTeam'] = ''.join(oddsobject.select('OddsData/HomeTeam/text()').extract())
                item['awayTeam'] = ''.join(oddsobject.select('OddsData/AwayTeam/text()').extract())
                item['lastUpdated'] = ''
                item['bookie'] = MYGLOBAL
                item['home'] = ''.join(oddsobject.select('OddsData/HomeOdds/text()').extract())
                item['draw'] = ''.join(oddsobject.select('OddsData/DrawOdds/text()').extract())
                item['away'] = ''.join(oddsobject.select('OddsData/AwayOdds/text()').extract())

                items.append(item)

            return items
I can put an __init__ function into the spider, but I get exactly the same error:
    def __init__(self, *args, **kwargs):
        super(homeSpider, self).__init__(*args, **kwargs)
        pass
Why is this happening and how do I solve it?
The answer was given by alecx:
My __init__ function was:
def __init__(self, domain_name):
In order to work within an egg for scrapyd, it should be:
def __init__(self, domain_name, **kwargs):
considering you pass domain_name as a mandatory argument.
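Put together, a minimal sketch of the fixed __init__ (assuming domain_name is the only argument you need to handle yourself; storing it on self is illustrative, not from the original):

    from scrapy.spider import BaseSpider

    class homeSpider(BaseSpider):
        name = "home"

        def __init__(self, domain_name, **kwargs):
            # Absorb scrapyd's extra keyword arguments (such as _job)
            # and forward them to BaseSpider instead of rejecting them.
            super(homeSpider, self).__init__(**kwargs)
            self.domain_name = domain_name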