
Error when crawling data: 'EPollReactor' object has no attribute '_handleSignals'

I am trying to crawl data from a list of URLs. The code below ran successfully yesterday without any errors.

But today, when I came back and ran the same code again, it raised this error: 'EPollReactor' object has no attribute '_handleSignals'

Below is my code:

import scrapy
from scrapy.crawler import CrawlerProcess

# df is assumed to be a pandas DataFrame defined elsewhere, with a 'Link' column of URLs.

class MySpider(scrapy.Spider):
    name = 'myspider'

    def start_requests(self):
        urls = df['Link']
        for index, url in enumerate(urls):
            # Pass the row index along so parse() knows where to store the text.
            yield scrapy.Request(url=url, meta={'Index': index})

    def parse(self, response):
        index = response.meta['Index']
        # Concatenate the text of every <p> element on the page.
        content = ''.join(response.css('p::text').getall())
        df.loc[index, 'Content'] = content

process = CrawlerProcess()
process.crawl(MySpider)
process.start()

I searched for this error, but I don't fully understand it, so I can't fix it myself. Could you please help me fix it?

Thanks

Yến Phan asked Mar 05 '26 23:03


2 Answers

Did you reinstall Scrapy? I ran into the same issue today: code that worked previously started raising the error you described. The error appears to come from one of Scrapy's dependencies, the Twisted package. A new release of Twisted (version 23.8.0) came out about 4 hours ago, and it seems to have compatibility issues with Scrapy. If you pip install scrapy and let Twisted be installed as a dependency, pip picks up the new version and Scrapy throws this error. I solved it by running

pip install Twisted==22.10.0

to install the previous release of Twisted, which fixed the problem for me.
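To confirm that the installed Twisted is the culprit before pinning it, a small check like the one below may help. It is only a sketch: the helper names (release_tuple, twisted_is_affected, installed_twisted_version) are made up for illustration, and the 23.8.0 threshold is taken from this answer, not from any official compatibility table.

```python
import re
from importlib import metadata

# First Twisted release reported in this thread as incompatible with Scrapy.
FIRST_AFFECTED = (23, 8)


def release_tuple(version):
    """Return the leading (major, minor) numbers of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def twisted_is_affected(version):
    """True if this Twisted version is 23.8.0 or newer."""
    return release_tuple(version) >= FIRST_AFFECTED


def installed_twisted_version():
    """Return the installed Twisted version, or None if it is not installed."""
    try:
        return metadata.version("Twisted")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_twisted_version()
    if version is None:
        print("Twisted is not installed")
    elif twisted_is_affected(version):
        print(f"Twisted {version} is affected; try: pip install Twisted==22.10.0")
    else:
        print(f"Twisted {version} predates 23.8.0")
```

Running it in the same environment as the spider shows which Twisted version pip actually resolved.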

Craig answered Mar 08 '26 17:03


Scrapy 2.10.1 has been released (release notes). Its only change is aimed at fixing this:

Marked Twisted >= 23.8.0 as unsupported. (issue 6024, issue 6026)

So updating Scrapy to its latest version (2.10.1 at the time of writing) should solve this for now.

If your project requires an older version of Scrapy, pin Twisted to an older version as suggested in the other answer.
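The choice between the two answers can be sketched as a simple version comparison. This is a rough illustration only: parse_release and needs_manual_twisted_pin are hypothetical helper names, and the rule assumes, based on the release note quoted above, that Scrapy 2.10.1 is the first release that excludes Twisted 23.8.0 by itself.

```python
import re

# Scrapy release that, per the release note quoted above, marks Twisted >= 23.8.0 as unsupported.
FIXED_SCRAPY = (2, 10, 1)


def parse_release(version):
    """Turn '2.10.1' into (2, 10, 1), padding missing parts with zeros."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def needs_manual_twisted_pin(scrapy_version):
    """True if this Scrapy version predates 2.10.1 and so does not restrict Twisted itself."""
    return parse_release(scrapy_version) < FIXED_SCRAPY


if __name__ == "__main__":
    for candidate in ("2.9.0", "2.10.0", "2.10.1"):
        if needs_manual_twisted_pin(candidate):
            print(f"Scrapy {candidate}: pin Twisted, e.g. pip install Twisted==22.10.0")
        else:
            print(f"Scrapy {candidate}: Twisted >= 23.8.0 is already marked unsupported")
```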

Georgiy answered Mar 08 '26 17:03