I get a twisted.internet.error.ReactorNotRestartable error when I execute the following code:

    from time import sleep
    from scrapy import signals
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings
    from scrapy.xlib.pydispatch import dispatcher

    result = None

    def set_result(item):
        result = item

    while True:
        process = CrawlerProcess(get_project_settings())
        dispatcher.connect(set_result, signals.item_scraped)

        process.crawl('my_spider')
        process.start()

        if result:
            break
        sleep(3)

It works the first time, but then I get the error. I create a new process variable on each iteration, so what is the problem?
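By default, CrawlerProcess's .start() stops the Twisted reactor it creates once all crawlers have finished. A Twisted reactor cannot be started again after it has been stopped, so the second call to process.start() raises ReactorNotRestartable even though a new CrawlerProcess object was created.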
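You should call process.start(stop_after_crawl=False) if you create a process in each iteration. Note that the reactor then keeps running, so start() blocks until the reactor is stopped explicitly.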
Another option is to handle the Twisted reactor yourself and use CrawlerRunner. The docs have an example of doing that.
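A minimal sketch of the CrawlerRunner approach, assuming the spider is registered in the project as 'my_spider' (as in the question). The ResultCollector class and the main() and crawl_until_result() names are made up for illustration. The reactor is started once, and the retry loop runs inside it with a non-blocking delay instead of sleep(3). The handler is connected through crawler.signals rather than scrapy.xlib.pydispatch, and it stores the item on an object, because assigning to result inside set_result only creates a local variable. Depending on the Scrapy version and the TWISTED_REACTOR setting, a specific reactor may need to be installed before this works; treat that part as untested here.

```python
class ResultCollector:
    """Hypothetical helper: keeps scraped items where the retry loop can see them."""

    def __init__(self):
        self.items = []

    def on_item_scraped(self, item, **kwargs):
        # Scrapy sends item, response and spider as keyword arguments;
        # only the item is needed here.
        self.items.append(item)


def main():
    # Imported inside the function so that importing this module
    # does not install a Twisted reactor as a side effect.
    from scrapy import signals
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from scrapy.utils.project import get_project_settings
    from twisted.internet import defer, reactor, task

    settings = get_project_settings()
    configure_logging(settings)
    runner = CrawlerRunner(settings)
    collector = ResultCollector()

    @defer.inlineCallbacks
    def crawl_until_result():
        while not collector.items:
            # A fresh crawler per attempt; the reactor itself is reused.
            crawler = runner.create_crawler('my_spider')
            crawler.signals.connect(collector.on_item_scraped,
                                    signal=signals.item_scraped)
            yield runner.crawl(crawler)
            if not collector.items:
                # Non-blocking replacement for sleep(3).
                yield task.deferLater(reactor, 3, lambda: None)

    def start():
        d = crawl_until_result()
        d.addErrback(lambda failure: failure.printTraceback())
        # Stop the reactor exactly once, on success or failure.
        d.addBoth(lambda _: reactor.stop())

    reactor.callWhenRunning(start)
    reactor.run()  # blocks until reactor.stop() is called above
    return collector.items[0] if collector.items else None
```

Calling main() from inside the Scrapy project directory should return the first scraped item, or None if the crawl failed with an error. Because reactor.run() is called only once per Python process, ReactorNotRestartable does not come up.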