
How to get the max historical price data from yahoo finance?

I want to download the maximum available historical price data from Yahoo Finance with Scrapy.
Here is the URL for the max historical price data of one stock (the ticker in this URL is FNMA).

https://query1.finance.yahoo.com/v7/finance/download/FNMA?period1=221115600&period2=1508472000&interval=1d&events=history&crumb=1qRuQKELxmM

To write a stock price web crawler, there are two problems I can't solve.
1. How do I get the period1 argument?
You can get it by hand on the web page by clicking Max.
How can I get it with Python code?
Different stocks have different period1 values.

2. How do I create the argument crumb=1qRuQKELxmM automatically, given that different stocks have different crumb values?
Here is my spider for the max historical data, written with the Scrapy framework.

import scrapy

class TestSpider(scrapy.Spider):
    name = "quotes"
    allowed_domains = ["finance.yahoo.com"]

    def __init__(self, *args, **kw):
        # Let scrapy.Spider finish its own setup before adding attributes.
        super().__init__(*args, **kw)
        self.timeout = 10

    def start_requests(self):
        stock_names = []  # list of tickers; the code that builds it is omitted
        for stock in stock_names:
            period1 = ""  # how to fill it?
            crumb = ""    # how to fill it?
            per_stock_max_data = (
                "https://query1.finance.yahoo.com/v7/finance/download/" + stock
                + "?period1=" + period1 + "&period2=1508472000"
                + "&interval=1d&events=history&crumb=" + crumb
            )
            yield scrapy.Request(per_stock_max_data,callback=self.parse)

    def parse(self, response):
        content = response.body
        target = response.url
        #do something

How do I fill in the blanks above in my web crawler?

showkey asked Dec 18 '22 03:12


2 Answers

As I understand it, you want to download all available data for a specific ticker. For that you don't actually need a real period1 value: if you pass 0 as period1, the Yahoo API defaults to the oldest available date.
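
As a minimal sketch of that idea, the download URL can be assembled with period1 fixed at 0. The query parameter names are copied from the URL in the question; the helper name build_download_url and its period2 argument are hypothetical, chosen only for illustration.

```python
import time
from urllib.parse import urlencode

# Endpoint taken from the URL in the question.
BASE_URL = "https://query1.finance.yahoo.com/v7/finance/download/"


def build_download_url(ticker, crumb, period2=None):
    """Build a download URL that asks for the full history (period1=0).

    build_download_url is a hypothetical helper, not part of any library.
    """
    if period2 is None:
        # Default the end of the range to the current time.
        period2 = int(time.time())
    query = urlencode({
        "period1": 0,          # 0 stands in for "oldest available date"
        "period2": period2,
        "interval": "1d",
        "events": "history",
        "crumb": crumb,        # urlencode percent-encodes special characters
    })
    return BASE_URL + ticker + "?" + query


print(build_download_url("FNMA", "1qRuQKELxmM", period2=1508472000))
```

Building the query with urlencode instead of string concatenation avoids broken URLs when a crumb contains characters that are not URL-safe.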

Downloading quotes the way you showed in the question unfortunately requires dealing with cookies. Here is a solution that does not use Scrapy; only the ticker is required:

import re
import time
from io import StringIO

import pandas as pd
import requests


def get_yahoo_ticker_data(ticker):
    # Load the history page to obtain the session cookie and the crumb.
    res = requests.get('https://finance.yahoo.com/quote/' + ticker + '/history')
    yahoo_cookie = res.cookies['B']
    yahoo_crumb = None
    pattern = re.compile(r'.*"CrumbStore":\{"crumb":"(?P<crumb>[^"]+)"\}')
    for line in res.text.splitlines():
        m = pattern.match(line)
        if m is not None:
            yahoo_crumb = m.groupdict()['crumb']

    # period1=0 requests everything from the oldest available date up to now.
    current_date = int(time.time())
    url_kwargs = {'symbol': ticker, 'timestamp_end': current_date,
                  'crumb': yahoo_crumb}
    url_price = 'https://query1.finance.yahoo.com/v7/finance/download/' \
                '{symbol}?period1=0&period2={timestamp_end}&interval=1d&events=history' \
                '&crumb={crumb}'.format(**url_kwargs)

    # Send the cookie from the history page along with the matching crumb.
    response = requests.get(url_price, cookies={'B': yahoo_cookie})

    return pd.read_csv(StringIO(response.text), parse_dates=['Date'])
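
The crumb lookup in the function above can be tried in isolation, without any network access. This is only a sketch: extract_crumb is a hypothetical helper, and SAMPLE_PAGE is a made-up string that imitates the "CrumbStore" fragment the regular expression expects, not real page content.

```python
import re

# Same "CrumbStore" pattern as in the function above, without the leading ".*"
# because re.search scans the whole text instead of matching line by line.
CRUMB_RE = re.compile(r'"CrumbStore":\{"crumb":"(?P<crumb>[^"]+)"\}')


def extract_crumb(page_text):
    """Return the crumb embedded in the page text, or None if it is absent."""
    match = CRUMB_RE.search(page_text)
    return match.group("crumb") if match else None


# Made-up fragment for illustration only; the crumb value is the one
# that appears in the question's URL.
SAMPLE_PAGE = 'root.App.main = {"context":{"CrumbStore":{"crumb":"1qRuQKELxmM"},"x":1}};'

print(extract_crumb(SAMPLE_PAGE))
```

Returning None when nothing matches lets the caller detect a changed page layout before it sends a download request with an empty crumb.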

If you really need the oldest date then you can use the code above and extract the first date from the response.

get_yahoo_ticker_data(ticker='AAPL')
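
If the oldest date is needed as a period1 value, it can be derived from the downloaded CSV. The sketch below works on raw CSV text and uses only the standard library; oldest_period1 is a hypothetical helper, the rows in SAMPLE_CSV are made up for illustration, and the dates are assumed to be in YYYY-MM-DD form and interpreted as midnight UTC.

```python
import csv
from datetime import datetime, timezone
from io import StringIO

# Made-up rows in the column layout of the download; values are illustrative only.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
1980-12-15,0.5,0.6,0.4,0.5,0.4,1000
1980-12-12,0.5,0.6,0.4,0.5,0.4,2000
1980-12-16,0.5,0.6,0.4,0.5,0.4,3000
"""


def oldest_period1(csv_text):
    """Return (oldest date as ISO string, Unix timestamp of that date at 00:00 UTC)."""
    rows = csv.DictReader(StringIO(csv_text))
    dates = [
        datetime.strptime(row["Date"], "%Y-%m-%d").replace(tzinfo=timezone.utc)
        for row in rows
    ]
    # Take the minimum rather than the first row, in case rows are unsorted.
    oldest = min(dates)
    return oldest.date().isoformat(), int(oldest.timestamp())


print(oldest_period1(SAMPLE_CSV))
```

With the DataFrame returned by get_yahoo_ticker_data, the equivalent starting point would be the minimum of its 'Date' column.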

I know that web scraping is not an efficient option, but it's the only one we have because Yahoo has already decommissioned all of its APIs. You might find third-party solutions, but all of them use scraping internally and add boilerplate code that decreases overall performance.

Michael Dz answered Dec 27 '22 02:12


After installing pandas-datareader with:

pip install pandas-datareader

You can request the stock prices with this code:

import pandas_datareader as pdr
from datetime import datetime

appl = pdr.get_data_yahoo(symbols='AAPL', start=datetime(2000, 1, 1), end=datetime(2012, 1, 1))
print(appl['Adj Close'])
mrCarnivore answered Dec 27 '22 01:12
