I receive data in JSON format and am having a hard time converting it into a suitable format. I hope you can help me.
import pandas as pd
from pandas.io.json import json_normalize
import requests
dataAPI = requests.get('here is the API URL')
print(dataAPI.json())
gives me the following output:
{'c': [277.775, 277.76, 277.65, 277.64, 277.5215], 'h': [277.89, 278.06, 277.98, 277.76, 277.98], 'l': [277.67, 277.71, 277.59, 277.42, 277.472], 'o': [277.69, 277.795, 277.77, 277.66, 277.72], 's': 'ok', 't': [1587412320, 1587412380, 1587412440, 1587412500, 1587412560, 1587412620, ], 'v': [0, 142752, 133100, 259539, 0]}
I'd like to create a dataframe with the following columns (skip column s) and float cell values:
c| h| l| o| t| v
277.775| 277.89| 277.67| 277.69| 1587412320| 0
...
I tried something along these lines: json_normalize(dataAPI, 'c')
but that gave me the error message: TypeError: byte indices must be integers or slices, not str
Appreciate your help a lot
You have to define the columns you want and then just use pandas.concat:
j = {'c': [277.775, 277.76, 277.65, 277.64, 277.5215], 'h': [277.89, 278.06, 277.98, 277.76, 277.98], 'l': [277.67, 277.71, 277.59, 277.42, 277.472], 'o': [277.69, 277.795, 277.77, 277.66, 277.72], 's': 'ok', 't': [1587412320, 1587412380, 1587412440, 1587412500, 1587412560, 1587412620, ], 'v': [0, 142752, 133100, 259539, 0]}
columns = {'c', 'h', 'l', 'o', 't', 'v'}
pd.concat([pd.DataFrame({k: v}) for k, v in j.items() if k in columns], axis=1)
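Output: a DataFrame with six rows, because t has six values; the five-element columns are padded with NaN in the last row.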
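As a related sketch, the frame can also be built from a dict of pd.Series, which pads the shorter lists with NaN in the same way concat does, and then cast to float as the question asks. The name payload is a hypothetical stand-in for dataAPI.json(), and wanted is just an illustrative list of the columns to keep; this is one possible variant, not the only way to do it.

```python
import pandas as pd

# Sample data copied from the question; `payload` is a hypothetical
# stand-in for the parsed response, i.e. dataAPI.json().
payload = {
    "c": [277.775, 277.76, 277.65, 277.64, 277.5215],
    "h": [277.89, 278.06, 277.98, 277.76, 277.98],
    "l": [277.67, 277.71, 277.59, 277.42, 277.472],
    "o": [277.69, 277.795, 277.77, 277.66, 277.72],
    "s": "ok",
    "t": [1587412320, 1587412380, 1587412440, 1587412500, 1587412560, 1587412620],
    "v": [0, 142752, 133100, 259539, 0],
}

# Columns to keep, in display order ('s' is skipped).
wanted = ["c", "h", "l", "o", "t", "v"]

# Wrapping each list in a Series lets pandas align on the row index,
# so the shorter columns are padded with NaN instead of raising an error.
df = pd.DataFrame({k: pd.Series(payload[k]) for k in wanted}).astype(float)

print(df.dtypes)
print(df)
```

Using a list rather than a set for wanted keeps the column order explicit.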

dict1 = {'c': [277.775, 277.76, 277.65, 277.64, 277.5215],
         'h': [277.89, 278.06, 277.98, 277.76, 277.98],
         'l': [277.67, 277.71, 277.59, 277.42, 277.472],
         'o': [277.69, 277.795, 277.77, 277.66, 277.72],
         's': 'ok',
         't': [1587412320, 1587412380, 1587412440, 1587412500, 1587412560, 1587412620],
         'v': [0, 142752, 133100, 259539, 0]}
Given the dictionary above (the output obtained from the API response), you could do the following:
import pandas as pd
df1 = pd.DataFrame.from_dict(dict1, orient="index").T.drop(columns=["s"])
df1
The code above creates a dataframe from the dictionary with the keys as the index (orient="index"), which tolerates lists of different lengths, and then transposes it so the keys become columns. You can orient by column instead if the lists in the dictionary all have the same length. drop(columns=[...]) removes whichever columns you do not need.
Output:
         c       h        l        o            t       v
0  277.775  277.89   277.67   277.69   1587412320       0
1   277.76  278.06   277.71  277.795   1587412380  142752
2   277.65  277.98   277.59   277.77  1.58741e+09  133100
3   277.64  277.76   277.42   277.66  1.58741e+09  259539
4  277.522  277.98  277.472   277.72  1.58741e+09       0
5      NaN     NaN      NaN      NaN  1.58741e+09     NaN
If you do not want rows that contain NaN, you can append dropna() to the code as below:
df1 = pd.DataFrame.from_dict(dict1, orient="index").T.drop(columns=["s"]).dropna()
This way you have the flexibility to handle NaN and drop the columns not required.
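The question also asks for float cell values, and the mixed number formatting in the output above suggests the transposed frame may hold generic objects rather than floats. A minimal sketch of one way to handle that, assuming every list-valued key is numeric: keep only the list values (so 's' does not need to be named), then cast with astype(float). The name numeric is introduced here for illustration only.

```python
import pandas as pd

# Same sample data as in the answer above.
dict1 = {
    "c": [277.775, 277.76, 277.65, 277.64, 277.5215],
    "h": [277.89, 278.06, 277.98, 277.76, 277.98],
    "l": [277.67, 277.71, 277.59, 277.42, 277.472],
    "o": [277.69, 277.795, 277.77, 277.66, 277.72],
    "s": "ok",
    "t": [1587412320, 1587412380, 1587412440, 1587412500, 1587412560, 1587412620],
    "v": [0, 142752, 133100, 259539, 0],
}

# Keep only the list-valued entries; this drops scalar fields such as 's'.
numeric = {k: v for k, v in dict1.items() if isinstance(v, list)}

df1 = (
    pd.DataFrame.from_dict(numeric, orient="index")  # one row per key, NaN-padded
    .T                                               # keys become columns
    .dropna()                                        # drop the incomplete last row
    .astype(float)                                   # make every column float64
)

print(df1.dtypes)
print(df1)
```

Note that dropna() here discards the sixth timestamp, since it has no matching values in the other columns.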