I'm trying to scrape data from the chart on this website: https://www.spglobal.com/spdji/en/indices/equity/sp-bmv-ipc/#overview
I found the JSON file behind the chart and tried this code to import it into pandas:
import json
import urllib.request
import pandas as pd
url = "https://www.spglobal.com/spdji/en/util/redesign/index-data/get-performance-data-for-datawidget-redesign.dot?indexId=92330739&getchildindex=true&returntype=T-&currencycode=MXN&currencyChangeFlag=false&language_id=1"
with urllib.request.urlopen(url) as response:
    data = json.loads(response.read().decode())
df = pd.DataFrame(data, columns=['indexLevelsHolder'])
Data = df.iloc[3, 0]
By doing so, I get the "Data" object which is a list containing the time series data in JSON format.
[{'effectiveDate': 1309406400000, 'indexId': 92330714, 'effectiveDateInEst': 1309392000000, 'indexValue': 43405.82, 'monthToDateFlag': 'N', 'quarterToDateFlag': 'N', 'yearToDateFlag': 'N', 'oneYearFlag': 'N', 'threeYearFlag': 'N', 'fiveYearFlag': 'N', 'tenYearFlag': 'Y', 'allYearFlag': 'Y', 'fetchedDate': 1626573344000, 'formattedEffectiveDate': '30-Jun-2011'}, .........
The problem is that I cannot find a way to read this JSON data and grab the columns I need (effectiveDate and indexValue).
Any way to do it? Thanks
You can use pd.json_normalize to load the JSON into columns:
import json
import urllib.request
import pandas as pd
url = "https://www.spglobal.com/spdji/en/util/redesign/index-data/get-performance-data-for-datawidget-redesign.dot?indexId=92330739&getchildindex=true&returntype=T-&currencycode=MXN&currencyChangeFlag=false&language_id=1"
with urllib.request.urlopen(url) as response:
    data = json.loads(response.read().decode())
# The time series is the list stored under indexLevelsHolder -> indexLevels
df = pd.json_normalize(data["indexLevelsHolder"]["indexLevels"])
print(df)
Prints:
   effectiveDate   indexId  effectiveDateInEst    indexValue  monthToDateFlag  quarterToDateFlag  yearToDateFlag  oneYearFlag  threeYearFlag  fiveYearFlag  tenYearFlag  allYearFlag    fetchedDate  formattedEffectiveDate
0  1309406400000  92330714       1309392000000  43405.820000                N                  N               N            N              N             N            Y            Y  1626574897000             30-Jun-2011
1  1309492800000  92330714       1309478400000  43693.930000                N                  N               N            N              N             N            Y            Y  1626574897000             01-Jul-2011
2  1309752000000  92330714       1309737600000  43758.130000                N                  N               N            N              N             N            Y            Y  1626574897000             04-Jul-2011
3  1309838400000  92330714       1309824000000  43513.290000                N                  N               N            N              N             N            Y            Y  1626574897000             05-Jul-2011
...and so on.
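To get just effectiveDate and indexValue, as asked in the question, you can select those two columns from the normalized frame. Here is a minimal sketch that runs offline: the `data` dict is a hypothetical stand-in for the parsed response, with two records copied from the output above and the other fields omitted. It also assumes effectiveDate is a Unix timestamp in milliseconds, which is a guess based on the magnitude of the values and is not confirmed by the site.

```python
import pandas as pd

# Hypothetical stand-in for the parsed JSON response; the two records are
# taken from the output shown above, with most fields left out for brevity.
data = {
    "indexLevelsHolder": {
        "indexLevels": [
            {"effectiveDate": 1309406400000, "indexValue": 43405.82,
             "formattedEffectiveDate": "30-Jun-2011"},
            {"effectiveDate": 1309492800000, "indexValue": 43693.93,
             "formattedEffectiveDate": "01-Jul-2011"},
        ]
    }
}

df = pd.json_normalize(data["indexLevelsHolder"]["indexLevels"])

# Keep only the two columns of interest (copy to avoid modifying a view).
series = df[["effectiveDate", "indexValue"]].copy()

# Assumption: effectiveDate is epoch milliseconds, so convert it to datetime.
series["effectiveDate"] = pd.to_datetime(series["effectiveDate"], unit="ms")

print(series)
```

The converted timestamps are in UTC and, for these sample rows, fall at 04:00 rather than midnight. If you only need the calendar date, parsing the formattedEffectiveDate column may be simpler.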