I am using Scrapy to crawl some websites. How can I get the number of requests in the queue?
I have looked at the Scrapy source code and found that scrapy.core.scheduler.Scheduler may lead to my answer. See: https://github.com/scrapy/scrapy/blob/0.24/scrapy/core/scheduler.py
Two questions:
What do self.dqs and self.mqs mean in the Scheduler class?

This took me a while to figure out, but here's what I used:
self.crawler.engine.slot.scheduler
That is the scheduler instance. You can call len() on it to get the number of queued requests, or, if you just need true/false for pending requests, do something like this:
self.crawler.engine.slot.scheduler.has_pending_requests()
Beware that there could still be requests running even though the queue is empty. To check how many requests are currently running, use:
len(self.crawler.engine.slot.inprogress)
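Putting the two counts together, a small helper could look like the sketch below. It assumes the crawl is already running and relies only on the internal attributes named above (engine.slot.scheduler and engine.slot.inprogress), which are not a public API and may change between Scrapy versions. The name queue_stats and the stand-in objects are hypothetical and exist only so the snippet can be tried without a live crawl.

```python
from types import SimpleNamespace


def queue_stats(crawler):
    """Return (queued, in_progress) request counts for a running crawler.

    Uses internal attributes (engine.slot.scheduler, engine.slot.inprogress),
    so treat this as a sketch that may need adjusting per Scrapy version.
    """
    slot = crawler.engine.slot
    queued = len(slot.scheduler)        # requests waiting in the scheduler
    in_progress = len(slot.inprogress)  # requests currently being processed
    return queued, in_progress


# Hypothetical stand-ins (not Scrapy classes) that mimic the attribute layout.
fake_crawler = SimpleNamespace(
    engine=SimpleNamespace(
        slot=SimpleNamespace(
            scheduler=["req1", "req2", "req3"],
            inprogress={"req0"},
        )
    )
)

print(queue_stats(fake_crawler))
```

Inside a spider callback the same helper would be called as queue_stats(self.crawler).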