Does Yahoo Finance ban web scraping or not?

Yahoo Finance's robots.txt says:

User-agent: *
Sitemap: https://finance.yahoo.com/sitemap_en-us_desktop_index.xml
Sitemap: https://finance.yahoo.com/sitemaps/finance-sitemap_index_US_en-US.xml.gz
Disallow: /r/
Disallow: /__rapidworker-1.2.js
Disallow: /__blank
Disallow: /_td_api
Disallow: /_remote

Does Yahoo Finance ban web scraping or not?
What is disallowed by the Yahoo Finance website?
What can we infer from Yahoo's robots.txt file?

asked Oct 20 '25 18:10 by showkey


1 Answer

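Nothing in the robots.txt file expressly prevents you from scraping Yahoo Finance: it only disallows a handful of internal paths (/r/, /__rapidworker-1.2.js, /__blank, /_td_api and /_remote). However, Yahoo Finance is also governed by Yahoo's Terms of Service.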
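If you want to check this yourself, here is a minimal sketch using Python's standard `urllib.robotparser`. It parses the rules exactly as quoted in the question (so no network request is made) and asks whether a given URL may be fetched. The `/quote/AAPL` and `/r/example` URLs and the `my-bot` user agent are only illustrative assumptions, not paths or names taken from Yahoo's documentation.

```python
from urllib.robotparser import RobotFileParser

# The rules as quoted in the question (the live file may differ).
ROBOTS_TXT = """\
User-agent: *
Sitemap: https://finance.yahoo.com/sitemap_en-us_desktop_index.xml
Sitemap: https://finance.yahoo.com/sitemaps/finance-sitemap_index_US_en-US.xml.gz
Disallow: /r/
Disallow: /__rapidworker-1.2.js
Disallow: /__blank
Disallow: /_td_api
Disallow: /_remote
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs, used only to exercise the rules above.
quote_allowed = parser.can_fetch("my-bot", "https://finance.yahoo.com/quote/AAPL")
redirect_allowed = parser.can_fetch("my-bot", "https://finance.yahoo.com/r/example")

print("quote page allowed:", quote_allowed)
print("/r/ path allowed:", redirect_allowed)
```

Keep in mind that this only tells you what robots.txt says; it does not tell you anything about the Terms of Service.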

The most pertinent part of that document says, in essence, that you must not do anything that interferes with their services. In practice, this means that if you plan to scrape Yahoo Finance for data, you should do so responsibly: sending many thousands of requests in a short time will quickly get you banned.
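One way to scrape "responsibly" is to enforce a minimum delay between requests. The sketch below is only an illustration: the `Throttle` class is a hypothetical helper, and the 2-second interval is an arbitrary assumption, since I don't know of any published request limit for Yahoo Finance.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable so it can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Assumed value: 2 seconds is a guess at a polite rate, not an official limit.
throttle = Throttle(min_interval=2.0)
```

You would then call `throttle.wait()` immediately before each request in your scraping loop, whatever HTTP client you use.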

That said, web scraping is generally inefficient, since you load an entire HTML page just to extract a few values programmatically. I would look into using an API instead (like those discussed here), as this will be a) more reliable, b) faster, and c) definitely legal.

answered Oct 25 '25 14:10 by Derek Brown


