I am new to Django. I am trying to run my Scrapy spider from a Django view. The spider works perfectly when I run it from the command prompt, but when I run it from Django it fails with the error message: signal only works in main thread.
My code in the Django view is the following:
    from twisted.internet import reactor
    from scrapy.crawler import Crawler
    from scrapy.crawler import CrawlerProcess
    from scrapy import log, signals
    from Working.spiders.workSpider import WorkSpider
    from scrapy.settings import Settings
    from scrapy.utils.project import get_project_settings

    spider = WorkSpider(domain='scrapinghub.com')
    crawler = CrawlerProcess(Settings())
    crawler.start()
    crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
    crawler.configure()
    crawler.crawl(spider)
    crawler.start()
    log.start()
    reactor.run()
Please help me solve this. Thank you.
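The error basically says that your code is not running in the main thread. Python only allows signal handlers to be installed from the main thread, and a Django view usually runs in a worker thread.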
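To see where the message comes from, here is a minimal sketch that uses only the standard library (no Django or Scrapy). The function name `try_install_handler` and the `results` list are made up for this illustration. It tries to install a signal handler from a worker thread, which is roughly what happens when Scrapy sets up its shutdown handlers inside a view.

```python
import signal
import threading


def try_install_handler(results):
    """Try to install a SIGINT handler and record what happened."""
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
        results.append("ok")
    except ValueError as exc:
        # Python refuses to install signal handlers outside the main thread.
        results.append(str(exc))


results = []
worker = threading.Thread(target=try_install_handler, args=(results,))
worker.start()
worker.join()

# The recorded message starts with "signal only works in main thread".
print(results[0])
```

Anything that calls `signal.signal()` from a non-main thread fails this way, regardless of which library makes the call.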
Switching from CrawlerProcess to CrawlerRunner solved the problem for me. As far as I can tell, this is because CrawlerRunner, unlike CrawlerProcess, does not start the reactor or install shutdown signal handlers for you: http://doc.scrapy.org/en/latest/topics/api.html#scrapy.crawler.CrawlerRunner
Hope this helps.
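A rough sketch of what the CrawlerRunner version could look like is below. It is not tested against your project and makes several assumptions: the helper name `run_spider_once` is made up, the Scrapy project settings must be discoverable by `get_project_settings()`, and the project must use Twisted's default reactor. The Twisted reactor can only be started once per process, so this works for a single crawl and will fail on a second call in the same process.

```python
def run_spider_once(spider_cls, **spider_kwargs):
    """Run one crawl with CrawlerRunner and block until it finishes.

    Hypothetical helper for illustration only. The reactor cannot be
    restarted, so call this at most once per process.
    """
    # Imported here so that merely importing this module does not
    # install the Twisted reactor.
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.project import get_project_settings

    runner = CrawlerRunner(get_project_settings())

    # crawl() takes the spider class (not an instance) and returns a Deferred.
    finished = runner.crawl(spider_cls, **spider_kwargs)

    # Stop the reactor whether the crawl succeeded or failed.
    finished.addBoth(lambda _: reactor.stop())

    # installSignalHandlers=False keeps Twisted from calling signal.signal(),
    # which is what fails outside the main thread.
    reactor.run(installSignalHandlers=False)
```

With the spider from the question, the call would be `run_spider_once(WorkSpider, domain='scrapinghub.com')`. If the view has to trigger crawls repeatedly, you will need a different setup, such as a long-lived reactor thread or a separate process.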