 

Using Scrapy to parse sitemaps

I want to use Scrapy to crawl the links listed in a sitemap. I don't know much about Scrapy, so I would appreciate any links, information, or documentation you could provide.

Thanks

JBlake asked Nov 17 '25 14:11


1 Answer

A new generic spider, SitemapSpider, has just been added to Scrapy trunk for this purpose. It will be available in the next release (Scrapy 0.14).

  • Code here: http://snippets.scrapy.org/snippets/20/
  • Documentation here: http://readthedocs.org/docs/scrapy/en/latest/topics/spiders.html#sitemapspider
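
To give a rough idea of what crawling from a sitemap involves, the first step is reading the `<loc>` entries out of the sitemap XML. The following is a minimal standard-library sketch of that step only. It is not Scrapy's implementation, and the sample sitemap, its URLs, and the helper name `extract_sitemap_urls` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample sitemap, for illustration only.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about</loc></url>
</urlset>
"""


def extract_sitemap_urls(xml_text):
    """Return the page URLs listed in a sitemap <urlset> document."""
    # Parse from bytes so the XML encoding declaration is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    # Each <url> entry carries its address in a <loc> child element.
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    for url in extract_sitemap_urls(SAMPLE_SITEMAP):
        print(url)
```

A spider would then request each extracted URL. The sitemap spider described in the documentation linked above is meant to take care of this for you, so a helper like this should only be needed outside Scrapy.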
Pablo Hoffman answered Nov 20 '25 03:11


