
Saving items from Scrapyd to Amazon S3 using Feed Exporter

Using Scrapy with Amazon S3 is fairly simple. You set:

  • FEED_URI = 's3://MYBUCKET/feeds/%(name)s/%(time)s.jl'
  • FEED_FORMAT = 'jsonlines'
  • AWS_ACCESS_KEY_ID = [access key]
  • AWS_SECRET_ACCESS_KEY = [secret key]

and everything works just fine.
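For reference, the settings above could sit in a project's settings.py roughly as sketched below. The key values are placeholders, and the spider name and timestamp in `example_params` are made up purely to show how the `%(name)s` and `%(time)s` placeholders in the URI get filled in; Scrapy supplies the real values itself at run time.

```python
# Sketch of the relevant settings.py entries (credential values are placeholders).
FEED_URI = "s3://MYBUCKET/feeds/%(name)s/%(time)s.jl"
FEED_FORMAT = "jsonlines"
AWS_ACCESS_KEY_ID = "YOUR_ACCESS_KEY"        # placeholder, not a real key
AWS_SECRET_ACCESS_KEY = "YOUR_SECRET_KEY"    # placeholder, not a real key

# Illustration only: the URI is a printf-style template, so the placeholders
# expand like this. The name and time values here are hypothetical.
example_params = {"name": "myspider", "time": "2022-11-13T07-11-00"}
expanded_uri = FEED_URI % example_params
print(expanded_uri)
```

Each run then writes to its own object under feeds/<spider name>/ in the bucket.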

But Scrapyd seems to override that setting and saves the items on the server (with a link on the website).

Adding the "items_dir =" setting doesn't seem to change anything.

What kind of setting makes it work?

EDIT: Extra info that might be relevant - we are using Scrapy-Heroku.

arikg asked Nov 13 '22 07:11



1 Answer

I faced the same problem. Removing the items_dir= line from the scrapyd.conf file worked for me.
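A quick way to check whether a given scrapyd.conf still sets items_dir is sketched below. It assumes the behavior described in the Scrapyd documentation, where a non-empty items_dir makes Scrapyd store item feeds in that directory by overriding FEED_URI. The helper name `items_dir_is_set` and the sample config strings are hypothetical, not part of Scrapyd.

```python
import configparser


def items_dir_is_set(conf_text: str) -> bool:
    """Return True if the [scrapyd] section has a non-empty items_dir."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # fallback covers both a missing section and a missing option
    value = parser.get("scrapyd", "items_dir", fallback="")
    return bool(value.strip())


# Hypothetical config contents, for illustration only.
conf_with_items_dir = "[scrapyd]\nitems_dir = items\nlogs_dir = logs\n"
conf_without_items_dir = "[scrapyd]\nlogs_dir = logs\n"

print(items_dir_is_set(conf_with_items_dir))
print(items_dir_is_set(conf_without_items_dir))
```

If the check reports that items_dir is set, the S3 FEED_URI from the project settings is likely being replaced by a local file path.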

pranavi dandu answered Dec 30 '22 23:12
