There are several similar questions on Stack Overflow that I have already read. Unfortunately, I lost the links to all of them because my browsing history was deleted unexpectedly.
None of those questions helped me. Some of them use Celery and others use Scrapyd, whereas I want to use the multiprocessing library. Also, the official Scrapy documentation shows how to run multiple spiders in a SINGLE process, not in MULTIPLE processes.
Since none of them helped, I decided to ask this question.
After several tries, I came up with this code.
My output:
Enter a product to search for: apple
2015-06-27 14:34:15 [scrapy] INFO: Scrapy 1.0.0 started (bot: scrapybot)
2015-06-27 14:34:15 [scrapy] INFO: Scrapy 1.0.0 started (bot: scrapybot)
2015-06-27 14:34:15 [scrapy] INFO: Optional features available: ssl, http11
2015-06-27 14:34:15 [scrapy] INFO: Optional features available: ssl, http11
2015-06-27 14:34:15 [scrapy] INFO: Overridden settings: {}
2015-06-27 14:34:15 [scrapy] INFO: Overridden settings: {}
2015-06-27 14:34:15 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
2015-06-27 14:34:15 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
2015-06-27 14:34:15 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-06-27 14:34:15 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2015-06-27 14:34:15 [scrapy] INFO: Enabled item pipelines:
2015-06-27 14:34:15 [scrapy] INFO: Spider opened
2015-06-27 14:34:15 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-06-27 14:34:15 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2015-06-27 14:34:15 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-06-27 14:34:15 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2015-06-27 14:34:15 [scrapy] INFO: Enabled item pipelines:
2015-06-27 14:34:15 [scrapy] INFO: Spider opened
2015-06-27 14:34:15 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-06-27 14:34:15 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6024
2015-06-27 14:34:15 [twisted] ERROR: Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 88, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 73, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 619, in _doReadOrWrite
why = selectable.doWrite()
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1117, in doWrite
"doWrite called on a %s" % reflect.qual(self.__class__))
exceptions.RuntimeError: doWrite called on a twisted.internet.tcp.Port
Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 88, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 73, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 619, in _doReadOrWrite
why = selectable.doWrite()
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1117, in doWrite
"doWrite called on a %s" % reflect.qual(self.__class__))
exceptions.RuntimeError: doWrite called on a twisted.internet.tcp.Port
2015-06-27 14:34:16 [twisted] ERROR: Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 88, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 73, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 619, in _doReadOrWrite
why = selectable.doWrite()
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1117, in doWrite
"doWrite called on a %s" % reflect.qual(self.__class__))
exceptions.RuntimeError: doWrite called on a twisted.internet.tcp.Port
Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 88, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 73, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 619, in _doReadOrWrite
why = selectable.doWrite()
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1117, in doWrite
"doWrite called on a %s" % reflect.qual(self.__class__))
exceptions.RuntimeError: doWrite called on a twisted.internet.tcp.Port
2015-06-27 14:34:17 [scrapy] DEBUG: Crawled (200) <GET http://bigbasket.com/ps/?q=apple> (referer: None)
hello, world
Current second: 17
Current microsecond: 546862
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 170', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/10000007_18-fresho-apple-washington.jpg', 'product_link': 'http://bigbasket.com/pd/10000007/fresho-apple-washington-1-kg/', 'productname': 'Apple - Washington', 'current_price': 'Rs. 170'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 199', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/10000003_7-fresho-apple-fuji.jpg', 'product_link': 'http://bigbasket.com/pd/10000003/fresho-apple-fuji-1-kg/', 'productname': 'Apple - Fuji', 'current_price': 'Rs. 199'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 229', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/10000005_16-fresho-apple-royal-gala.jpg', 'product_link': 'http://bigbasket.com/pd/10000005/fresho-apple-royal-gala-1-kg/', 'productname': 'Apple - Royal Gala', 'current_price': 'Rs. 229'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 156.75', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/205988_2-american-garden-vinegar-apple-cider.jpg', 'product_link': 'http://bigbasket.com/pd/205988/american-garden-vinegar-apple-cider-473-ml-bottle/', 'productname': 'Vinegar - Apple Cider', 'current_price': 'Rs. 156.75'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 151', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/10000397_7-fresho-apple-green.jpg', 'product_link': 'http://bigbasket.com/pd/10000397/fresho-apple-green-500-gm/', 'productname': 'Apple - Green', 'current_price': 'Rs. 151'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 114', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/229785_5-tropicana-100-juice-apple.jpg', 'product_link': 'http://bigbasket.com/pd/229785/tropicana-100-juice-apple-1-ltr-tetra/', 'productname': '100% Juice - Apple', 'current_price': 'Rs. 114'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 266', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/40015763_1-mylife-vinegar-apple-cider.jpg', 'product_link': 'http://bigbasket.com/pd/40015763/mylife-vinegar-apple-cider-300-ml/', 'productname': 'Vinegar - Apple Cider', 'current_price': 'Rs. 266'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 175', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/40015525_1-fresho-apple-chilli.jpg', 'product_link': 'http://bigbasket.com/pd/40015525/fresho-apple-chilli-1-kg/', 'productname': 'Apple - Chilli', 'current_price': 'Rs. 175'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 94.05', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/229791_3-tropicana-juice-apple.jpg', 'product_link': 'http://bigbasket.com/pd/229791/tropicana-juice-apple-1-ltr-tetra/', 'productname': 'Juice - Apple', 'current_price': 'Rs. 94.05'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 93', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/40015526_1-fresho-apple-chilli.jpg', 'product_link': 'http://bigbasket.com/pd/40015526/fresho-apple-chilli-500-gm/', 'productname': 'Apple - Chilli', 'current_price': 'Rs. 93'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 94.05', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/265854_2-real-fruit-power-juice-apple.jpg', 'product_link': 'http://bigbasket.com/pd/265854/real-fruit-power-juice-apple-1-ltr-carton/', 'productname': 'Fruit Power Juice - Apple', 'current_price': 'Rs. 94.05'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 143.10', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/252445_3-biotique-shampoo-and-conditioner-bio-green-apple.jpg', 'product_link': 'http://bigbasket.com/pd/252445/biotique-shampoo-and-conditioner-bio-green-apple-190-ml/', 'productname': 'Shampoo and Conditioner - Bio Green...', 'current_price': 'Rs. 143.10'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 250', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/30006470_1-fresho-apple-fuji-premium.jpg', 'product_link': 'http://bigbasket.com/pd/30006470/fresho-apple-fuji-premium-1-kg/', 'productname': 'Apple Fuji Premium', 'current_price': 'Rs. 250'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 19', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/282654_2-real-fruit-power-juice-apple.jpg', 'product_link': 'http://bigbasket.com/pd/282654/real-fruit-power-juice-apple-200-ml-carton/', 'productname': 'Fruit Power Juice - Apple', 'current_price': 'Rs. 19'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 14.25', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/100535016_4-quaker-oats-strawberry-flavor-with-apple.jpg', 'product_link': 'http://bigbasket.com/pd/100535016/quaker-oats-strawberry-flavor-with-apple-40-gm-pouch/', 'productname': 'Oats - Strawberry Flavor with Apple', 'current_price': 'Rs. 14.25'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 12.60', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/265705_1-appy-apple-juice-drink-classic.jpg', 'product_link': 'http://bigbasket.com/pd/265705/appy-apple-juice-drink-classic-200-ml-carton/', 'productname': 'Apple Juice Drink - Classic', 'current_price': 'Rs. 12.60'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 19', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/40012961_2-candy-clouds-cotton-candy-orange-green-apple.jpg', 'product_link': 'http://bigbasket.com/pd/40012961/candy-clouds-cotton-candy-orange-green-apple-30-gm-cup/', 'productname': 'Cotton Candy - Orange & Green Apple', 'current_price': 'Rs. 19'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 96', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/229945_1-real-activ-juice-apple.jpg', 'product_link': 'http://bigbasket.com/pd/229945/real-activ-juice-apple-1-ltr-carton/', 'productname': 'Activ Juice - Apple', 'current_price': 'Rs. 96'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 23.75', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/286759_1-minute-maid-juice-apple.jpg', 'product_link': 'http://bigbasket.com/pd/286759/minute-maid-juice-apple-400-ml-bottle/', 'productname': 'Juice - Apple', 'current_price': 'Rs. 23.75'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://bigbasket.com/ps/?q=apple>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 85.50', 'imageurl': 'http://bigbasket.com/media/uploads/p/s/40020508_1-fresho-freshly-baked-apple-pie.jpg', 'product_link': 'http://bigbasket.com/pd/40020508/fresho-freshly-baked-apple-pie-100-gm-pouch/', 'productname': 'Freshly Baked - Apple Pie', 'current_price': 'Rs. 85.50'}
2015-06-27 14:34:17 [scrapy] INFO: Closing spider (finished)
2015-06-27 14:34:17 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 222,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 54881,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2015, 6, 27, 9, 4, 17, 621449),
'item_scraped_count': 20,
'log_count/DEBUG': 22,
'log_count/ERROR': 1,
'log_count/INFO': 7,
'response_received_count': 1,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'start_time': datetime.datetime(2015, 6, 27, 9, 4, 15, 879467)}
2015-06-27 14:34:17 [scrapy] INFO: Spider closed (finished)
2015-06-27 14:34:17 [scrapy] DEBUG: Crawled (200) <GET http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0> (referer: None)
hello, world
Current second: 17
Current microsecond: 734324
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 31,800', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/178153.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-4-16-GB-Unlocked-Mobile-Phone-(Black)-pc-19326-97.aspx', 'productname': 'Apple iPhone 4 16 GB Unlocked Mobile Phone (Black)', 'current_price': 'Rs. 26,999'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': '', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/180356.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-5c-32-GB-GSM-Mobile-Phone-(White)-pc-20258-97.aspx', 'productname': 'Apple iPhone 5c 32 GB GSM Mobile Phone (White)', 'current_price': 'Rs. 53,500'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 53,500', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/180360.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-5s-16-GB-GSM-Mobile-Phone-(Grey)-pc-20262-97.aspx', 'productname': 'Apple iPhone 5s 16 GB GSM Mobile Phone (Grey)', 'current_price': 'Rs. 44,500'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 53,500', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/180362.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-5s-16-GB-GSM-Mobile-Phone-(Gold)-pc-20263-97.aspx', 'productname': 'Apple iPhone 5s 16 GB GSM Mobile Phone (Gold)', 'current_price': 'Rs. 44,500'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 53,500', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/180376.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-5s-16-GB-GSM-Mobile-Phone-(Silver)-pc-20277-97.aspx', 'productname': 'Apple iPhone 5s 16 GB GSM Mobile Phone (Silver)', 'current_price': 'Rs. 44,500'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 31,500', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/180443.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-4S-8-GB-GSM-Mobile-Phone-(Black)-pc-20318-97.aspx', 'productname': 'Apple iPhone 4S 8 GB GSM Mobile Phone (Black)', 'current_price': 'Rs. 16,990'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 31,500', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/180444.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-4S-8-GB-GSM-Mobile-Phone-(White)-pc-20319-97.aspx', 'productname': 'Apple iPhone 4S 8 GB GSM Mobile Phone (White)', 'current_price': 'Rs. 16,990'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 53,500', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/185039.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-5S-16-GB-GSM-Mobile-Phone-(Gold)-pc-23555-97.aspx', 'productname': 'Apple iPhone 5S 16 GB GSM Mobile Phone (Gold)', 'current_price': 'Rs. 49,999'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 53,500', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/185802.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-16-GB-GSM-Mobile-Phone-(Silver)-pc-23996-97.aspx', 'productname': 'Apple iPhone 6 16 GB GSM Mobile Phone (Silver)', 'current_price': 'Rs. 52,500'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': '', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/185805.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-64-GB-GSM-Mobile-Phone-(Silver)-pc-23999-97.aspx', 'productname': 'Apple iPhone 6 64 GB GSM Mobile Phone (Silver)', 'current_price': 'Rs. 62,500'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': '', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/185808.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-128-GB-GSM-(Silver)-pc-24002-97.aspx', 'productname': 'Apple iPhone 6 128 GB GSM (Silver)', 'current_price': 'Rs. 71,500'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': '', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/185880.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-Plus-16-GB-GSM-(Space-Grey)-pc-24004-97.aspx', 'productname': 'Apple iPhone 6 Plus 16 GB GSM (Space Grey)', 'current_price': 'Rs. 62,500'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': '', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/185881.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-Plus-16-GB-GSM-(Silver)-pc-24005-97.aspx', 'productname': 'Apple iPhone 6 Plus 16 GB GSM (Silver)', 'current_price': 'Rs. 62,500'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': 'Rs. 56,000', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/189360.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-16-GB-(Gold)-pc-26002-97.aspx', 'productname': 'Apple iPhone 6 16 GB (Gold)', 'current_price': 'Rs. 53,499'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': '', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/189363.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-64-GB-(Gold)-pc-26005-97.aspx', 'productname': 'Apple iPhone 6 64 GB (Gold)', 'current_price': 'Rs. 65,000'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': '', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/189364.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-64-GB-(Grey)-pc-26006-97.aspx', 'productname': 'Apple iPhone 6 64 GB (Grey)', 'current_price': 'Rs. 65,000'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': '', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/189365.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-64-GB-(Silver)-pc-26007-97.aspx', 'productname': 'Apple iPhone 6 64 GB (Silver)', 'current_price': 'Rs. 65,000'}
2015-06-27 14:34:17 [scrapy] DEBUG: Scraped from <200 http://www.cromaretail.com/productsearch.aspx?txtSearch=apple&x=0&y=0>
{'outofstock_status': 'In Stock', 'offer': 'No additional offer available', 'mrp': '', 'imageurl': 'http://www.cromaretail.com/Images/Catalogue/Product/medium/189366.jpg', 'product_link': 'http://www.cromaretail.comApple-iPhone-6-128-GB-(Gold)-pc-26008-97.aspx', 'productname': 'Apple iPhone 6 128 GB (Gold)', 'current_price': 'Rs. 74,000'}
2015-06-27 14:34:17 [scrapy] INFO: Closing spider (finished)
2015-06-27 14:34:17 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 259,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 16851,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2015, 6, 27, 9, 4, 17, 764861),
'item_scraped_count': 18,
'log_count/DEBUG': 20,
'log_count/ERROR': 1,
'log_count/INFO': 7,
'response_received_count': 1,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'start_time': datetime.datetime(2015, 6, 27, 9, 4, 15, 930386)}
2015-06-27 14:34:17 [scrapy] INFO: Spider closed (finished)
<Deferred at 0x7f02a29b7c68 current result: None>
If you look at my output closely, some errors appear at the start, and the program pauses briefly just before this line:
2015-06-27 14:34:17 [scrapy] DEBUG: Crawled (200) <GET http://bigbasket.com/ps/?q=apple> (referer: None)
After that it continues to run and still produces output.
I am not able to figure out the following 2 things:
Please suggest corrections to my code and explain these 2 issues.
Please help! Any answers will be much appreciated!
Thanks!
We use the CrawlerProcess class to run multiple Scrapy spiders in a process simultaneously. We need to create an instance of CrawlerProcess with the project settings. We need to create an instance of Crawler for the spider if we want to have custom settings for the Spider.
You can use the API to run Scrapy from a script, instead of the typical way of running Scrapy via scrapy crawl. Remember that Scrapy is built on top of the Twisted asynchronous networking library, so you need to run it inside the Twisted reactor. The first utility you can use to run your spiders is scrapy.
The key to running Scrapy in a Python script is the CrawlerProcess class, which lives in the scrapy.crawler module. It provides the engine to run Scrapy within a Python script. Inside the CrawlerProcess code, Python's Twisted framework is imported.
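As a rough illustration of the single-process approach described above, here is a minimal sketch. The helper name run_spiders_in_one_process is made up for this example, and it assumes Scrapy is installed and that you pass in your own spider classes. It uses only CrawlerProcess, its crawl() method, and its start() method.

```python
def run_spiders_in_one_process(spider_classes, settings=None):
    """Run every given spider class concurrently inside one Twisted reactor.

    `spider_classes` is an iterable of scrapy.Spider subclasses and
    `settings` is an optional dict of Scrapy settings. Both names are
    assumptions made for this sketch.
    """
    # Imported here so that loading this module does not itself require Scrapy.
    from scrapy.crawler import CrawlerProcess

    process = CrawlerProcess(settings or {})
    for spider_cls in spider_classes:
        # crawl() only schedules the spider; nothing is downloaded yet.
        process.crawl(spider_cls)
    # start() runs the reactor and blocks until all scheduled crawls finish.
    process.start()
```

A call would look like run_spiders_in_one_process([FirstSpider, SecondSpider]), where both class names are placeholders for your own spiders. Note that this still uses a single process; the spiders only run concurrently inside one reactor.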
Spiders are classes which define how a certain site (or a group of sites) will be scraped, including how to perform the crawl (i.e. follow links) and how to extract structured data from their pages (i.e. scraping items).
Scrapy is built on Twisted, and that framework already has its own way of running multiple processes. There is a nice question about this here. In your approach you are actually trying to marry two incompatible and competing libraries (Scrapy/Twisted + multiprocessing). This is probably not the best idea; you can run into lots of problems with it.
If you would like to run Scrapy spiders in multiple processes, it will be much easier to just use Twisted. You could read the Twisted docs for spawnProcess and other calls and try to use those tools for your goal. For example, here is a quick and dirty implementation that runs two spiders in two processes:
import os

from twisted.internet import defer, protocol, reactor


class SpiderRunnerProtocol(protocol.ProcessProtocol):
    """Collects the stdout and stderr of one spawned process."""

    def __init__(self, d, inputt=None):
        self.deferred = d
        self.inputt = inputt
        # Child output arrives as bytes, so accumulate bytes.
        self.output = b""
        self.err = b""

    def connectionMade(self):
        # Send optional input to the child, then close its stdin.
        if self.inputt:
            self.transport.write(self.inputt)
        self.transport.closeStdin()

    def outReceived(self, data):
        self.output += data

    def errReceived(self, data):
        self.err += data

    def processEnded(self, reason):
        print(reason.value)
        print(self.err)
        # Hand the collected stdout to whoever is waiting on the Deferred.
        self.deferred.callback(self.output)


def run_spider(cmd, *args, **kwargs):
    d = defer.Deferred()
    pipe = SpiderRunnerProtocol(d)
    args = [cmd] + list(args)
    env = os.environ.copy()
    x = reactor.spawnProcess(pipe, cmd, args, env=env)
    print(x.pid)
    print(x)
    return d


def print_out(result):
    print(result)


d1 = run_spider("scrapy", "crawl", "reddit")
d2 = run_spider("scrapy", "crawl", "dmoz")
d1.addCallback(print_out)
d2.addCallback(print_out)
# Stop the reactor only after both processes have ended.
defer.DeferredList([d1, d2]).addCallback(lambda _: reactor.stop())
reactor.run()
There's a nice blog post explaining the usage of Twisted subprocesses here.
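For comparison, the same idea of one operating-system process per spider can be sketched with only the standard library. This is not the Twisted approach from the answer, just a simpler stand-in under stated assumptions. The helper name run_commands is made up for this example, and the demo launches trivial Python one-liners so that it runs without a Scrapy project.

```python
import subprocess
import sys


def run_commands(commands):
    """Start every command at once, then wait for each and collect its output.

    Returns a list of (returncode, stdout, stderr) tuples, in the same
    order as `commands`.
    """
    # Launch all child processes first so that they run in parallel.
    procs = [
        subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
        )
        for cmd in commands
    ]
    results = []
    for proc in procs:
        # communicate() reads both pipes to the end and waits for exit.
        out, err = proc.communicate()
        results.append((proc.returncode, out, err))
    return results


if __name__ == "__main__":
    # Stand-in commands. Inside a Scrapy project these could instead be
    # ["scrapy", "crawl", "reddit"] and ["scrapy", "crawl", "dmoz"].
    demo = [
        [sys.executable, "-c", "print('spider one done')"],
        [sys.executable, "-c", "print('spider two done')"],
    ]
    for code, out, err in run_commands(demo):
        print(code, out.strip())
```

Because each spider runs in its own interpreter, each child gets its own Twisted reactor, so the parent script never has to share a reactor between processes.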