I have a spider that I have written using the Scrapy framework, and I am having trouble getting any pipelines to work. I have the following code in my pipelines.py:
class FilePipeline(object):
    def __init__(self):
        self.file = open('items.txt', 'wb')

    def process_item(self, item, spider):
        line = item['title'] + '\n'
        self.file.write(line)
        return item
and my CrawlSpider subclass has this attribute, which I intended to activate the pipeline for this class:
ITEM_PIPELINES = [
    'event.pipelines.FilePipeline'
]
However when I run it using
scrapy crawl my_spider
I get a line that says
2010-11-03 20:24:06+0000 [scrapy] DEBUG: Enabled item pipelines:
with no pipelines listed after it (I presume this is where the logging should output them).
I have tried looking through the documentation, but there don't seem to be any full examples of a whole project, so I can't tell whether I have missed anything.
Any suggestions on what to try next, or where to look for further documentation?
Got it! The ITEM_PIPELINES setting needs to go in the project's settings module, not in the spider class. Now it works!
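For anyone hitting the same problem, here is a minimal sketch of what the fix looks like, assuming the project is named event as the pipeline path above suggests:

# event/settings.py  -- the project settings module, not the spider
BOT_NAME = 'event'

SPIDER_MODULES = ['event.spiders']
NEWSPIDER_MODULE = 'event.spiders'

# Scrapy reads ITEM_PIPELINES from the settings module; defining it as a
# class attribute on the spider has no effect.
ITEM_PIPELINES = [
    'event.pipelines.FilePipeline',
]

Note that recent Scrapy versions expect ITEM_PIPELINES to be a dict mapping each pipeline path to an integer order value (e.g. {'event.pipelines.FilePipeline': 300}) rather than a plain list.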