I am new to Python, so I need a little help here. I have a dataframe with a URL column; each link lets me download a CSV. My aim is to write a loop (or whatever works) so that one command downloads and reads the CSV for each row and creates a dataframe from it. Any help would be appreciated. I have attached part of the dataframe below. If the links don't work (they probably won't), you can replace them with a link from 'https://finance.yahoo.com/quote/GOOG/history?p=GOOG' (any other company works too): navigate to the CSV download there and use that link.
Dataframe:
Symbol Link
YI https://query1.finance.yahoo.com/v7/finance/download/YI?period1=1383609600&period2=1541376000&interval=1d&events=history&crumb=PMHbxK/sU6E
PIH https://query1.finance.yahoo.com/v7/finance/download/PIH?period1=1383609600&period2=1541376000&interval=1d&events=history&crumb=PMHbxK/sU6E
TURN https://query1.finance.yahoo.com/v7/finance/download/TURN?period1=1383609600&period2=1541376000&interval=1d&events=history&crumb=PMHbxK/sU6E
FLWS https://query1.finance.yahoo.com/v7/finance/download/FLWS?period1=1383609600&period2=1541376000&interval=1d&events=history&crumb=PMHbxK/sU6E
Thanks again.
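Since the question asks for a single loop over the dataframe, here is a minimal sketch of one. It assumes the columns are named Symbol and Link as shown above; the read_links helper name is my own. pd.read_csv accepts URLs, file paths, and file-like objects alike, so the demo below feeds it in-memory CSVs (made-up data) in place of the real download links:

```python
import io

import pandas as pd

def read_links(df):
    """Return a dict mapping each Symbol to the DataFrame read from its Link."""
    frames = {}
    for symbol, link in zip(df["Symbol"], df["Link"]):
        # pd.read_csv handles remote URLs and file-like objects the same way
        frames[symbol] = pd.read_csv(link)
    return frames

# Demo with in-memory CSVs standing in for the download links:
links = pd.DataFrame({
    "Symbol": ["YI", "PIH"],
    "Link": [io.StringIO("Date,Close\n2018-10-05,10.0\n"),
             io.StringIO("Date,Close\n2018-10-05,7.5\n")],
})
frames = read_links(links)
print(frames["YI"].shape)  # (1, 2)
```

With your real dataframe you would pass it in unchanged and get one DataFrame per row, keyed by ticker symbol.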
Method #3: Using the csv module: you can fetch the CSV yourself, parse it with the csv module, and build a DataFrame from the parsed rows. Alternatively, pandas' read_csv accepts a URL directly, so you can pass the link straight to pd.read_csv to read it into a DataFrame.
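A minimal sketch of the csv-module route, using made-up sample text in place of a downloaded response body. Note that csv.DictReader leaves every field as a string, unlike read_csv's type inference:

```python
import csv
import io

import pandas as pd

# Made-up sample standing in for a downloaded CSV body
csv_text = "Date,Close\n2018-10-05,10.25\n2018-10-08,10.40\n"

# csv.DictReader parses each row into a dict keyed by the header row
rows = list(csv.DictReader(io.StringIO(csv_text)))
df = pd.DataFrame(rows)

print(list(df.columns))  # ['Date', 'Close']
```

If you need numeric dtypes afterwards, convert explicitly (e.g. with pd.to_numeric), or skip the csv module entirely and let read_csv infer them.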
There are multiple ways to get CSV data from URLs. In your example, Yahoo Finance, you can copy the Historical Data download link and pass it to pandas:
...
import pandas as pd

HISTORICAL_URL = "https://query1.finance.yahoo.com/v7/finance/download/GOOG?period1=1582781719&period2=1614404119&interval=1d&events=history&includeAdjustedClose=true"
df = pd.read_csv(HISTORICAL_URL)
A general pattern involves tools like requests or httpx to make a GET or POST request, then passing the response contents through io:
import io

import pandas as pd
import requests

url = 'https://query1.finance.yahoo.com/v7/finance/download/GOOG'
params = {
    'period1': 1538761929,
    'period2': 1541443929,
    'interval': '1d',
    'events': 'history',
    'crumb': 'v4z6ZpmoP98',
}

r = requests.post(url, data=params)
if r.ok:
    # Decode the raw bytes, then wrap them in a file-like object for pandas
    data = r.content.decode('utf8')
    df = pd.read_csv(io.StringIO(data))
To get the params, I just followed the link and copied everything after the '?'. Check that they match ;)
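Instead of copying the query string by hand, you can also split it programmatically with the standard library's urllib.parse, which keeps the crumb and the timestamps in sync with the original link. A sketch using one of the links from the question:

```python
from urllib.parse import parse_qs, urlparse

# One of the links from the question's dataframe
link = ("https://query1.finance.yahoo.com/v7/finance/download/YI"
        "?period1=1383609600&period2=1541376000&interval=1d"
        "&events=history&crumb=PMHbxK/sU6E")

parsed = urlparse(link)
# parse_qs returns a list of values per key; each key appears once here
params = {key: values[0] for key, values in parse_qs(parsed.query).items()}
base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"

print(base_url)         # https://query1.finance.yahoo.com/v7/finance/download/YI
print(params["crumb"])  # PMHbxK/sU6E
```

base_url and params can then be fed straight into the requests call above.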
Update:
If you can see the raw CSV contents directly at the URL, just pass the URL to pd.read_csv.
Example data directly from url:
data_url = 'https://raw.githubusercontent.com/pandas-dev/pandas/master/pandas/tests/data/iris.csv'
df = pd.read_csv(data_url)