My code:
import pandas
from pandas import json_normalize
d = [{"cityId": 111, "regionId": 111, 'data':[{'code': 'ABC', 'date': '2023-11-11 02:00', 'value': 300}, {'code': 'ABC', 'date': '2023-11-12 02:00', 'value': 300}]},
{"cityId": 211, "regionId": 211, 'data':[{'code': 'XYZ', 'date': '2023-11-11 02:00', 'value': 300}, {'code': 'XYZ', 'date': '2023-11-12 02:00', 'value': 300}]}]
df = pandas.DataFrame(data=d)
new_df = df.explode('data')['data']
new_df = json_normalize(new_df)
My current output:
cityId regionId data
0 111 111 [{'code': 'ABC', 'date': '2023-11-11 02:00', '...
1 211 211 [{'code': 'XYZ', 'date': '2023-11-11 02:00', '...
code date value
0 ABC 2023-11-11 02:00 300
1 ABC 2023-11-12 02:00 300
2 XYZ 2023-11-11 02:00 300
3 XYZ 2023-11-12 02:00 300
My desired output:
code date value cityId regionId
0 ABC 2023-11-11 02:00 300 111 111
1 ABC 2023-11-12 02:00 300 111 111
2 XYZ 2023-11-11 02:00 300 211 211
3 XYZ 2023-11-12 02:00 300 211 211
I suppose I should use join or merge, but when I tried those I ended up with duplicated columns. I have done this with a loop, but I was asked to make my code shorter.
Since you are normalizing anyway, json_normalize can do the whole job in one call: pass "data" as the record path and list the two missing columns in the meta parameter:
import pandas as pd
df = pd.json_normalize(d, "data", meta=["cityId", "regionId"])
Output :
print(df)
code date value cityId regionId
0 ABC 2023-11-11 02:00 300 111 111
1 ABC 2023-11-12 02:00 300 111 111
2 XYZ 2023-11-11 02:00 300 211 211
3 XYZ 2023-11-12 02:00 300 211 211
[4 rows x 5 columns]
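If you would rather keep the explode step from the question, the join can also be made to work. This is only a sketch, and it assumes the duplicated columns or rows came from the repeated index labels that explode leaves behind by default; the names exploded, flat and result are just illustrative. Resetting the index with ignore_index=True (pandas 1.1 or later) gives both frames the same 0..n-1 index, so join lines them up row by row:

```python
import pandas as pd

d = [
    {"cityId": 111, "regionId": 111, "data": [
        {"code": "ABC", "date": "2023-11-11 02:00", "value": 300},
        {"code": "ABC", "date": "2023-11-12 02:00", "value": 300},
    ]},
    {"cityId": 211, "regionId": 211, "data": [
        {"code": "XYZ", "date": "2023-11-11 02:00", "value": 300},
        {"code": "XYZ", "date": "2023-11-12 02:00", "value": 300},
    ]},
]

df = pd.DataFrame(d)

# One row per dict in "data"; ignore_index=True replaces the repeated
# labels (0, 0, 1, 1) with a fresh 0..3 index.
exploded = df.explode("data", ignore_index=True)

# Turn the dicts into columns; the result also has a 0..3 index.
flat = pd.json_normalize(exploded["data"].tolist())

# Both frames share the same index, so join aligns them row by row.
result = flat.join(exploded[["cityId", "regionId"]])
print(result)
```

The meta version above is shorter and is probably what you want; this variant is mainly useful when the DataFrame already exists and you no longer have the original list of dicts.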