Pandas handling Dictionary inside Dataframe

My code:

import pandas
from pandas import json_normalize

d = [{"cityId": 111, "regionId": 111, 'data':[{'code': 'ABC', 'date': '2023-11-11 02:00', 'value': 300}, {'code': 'ABC', 'date': '2023-11-12 02:00', 'value': 300}]},
     {"cityId": 211, "regionId": 211, 'data':[{'code': 'XYZ', 'date': '2023-11-11 02:00', 'value': 300}, {'code': 'XYZ', 'date': '2023-11-12 02:00', 'value': 300}]}]
df = pandas.DataFrame(data=d)
new_df = df.explode('data')['data']
new_df = json_normalize(new_df)

My current output:

   cityId  regionId                                               data
0     111       111  [{'code': 'ABC', 'date': '2023-11-11 02:00', '...
1     211       211  [{'code': 'XYZ', 'date': '2023-11-11 02:00', '...
  code              date  value
0  ABC  2023-11-11 02:00    300
1  ABC  2023-11-12 02:00    300
2  XYZ  2023-11-11 02:00    300
3  XYZ  2023-11-12 02:00    300      

My desired output:

  code              date  value cityId  regionId
0  ABC  2023-11-11 02:00    300  111       111
1  ABC  2023-11-12 02:00    300  111       111
2  XYZ  2023-11-11 02:00    300  211       211
3  XYZ  2023-11-12 02:00    300  211       211

I suppose I should use join or merge, but when I tried those I ended up with multiplied columns. I have already done this with a loop, but I have been asked to make my code shorter.

KurczakChrupiacy2 asked Dec 20 '25 14:12


1 Answer

Since you are already using json_normalize, you can call it directly on d, give "data" as the record path, and pass the meta parameter to carry over the two missing columns:

import pandas as pd

df = pd.json_normalize(d, "data", meta=["cityId", "regionId"])

Output:

print(df)

  code              date  value cityId regionId
0  ABC  2023-11-11 02:00    300    111      111
1  ABC  2023-11-12 02:00    300    111      111
2  XYZ  2023-11-11 02:00    300    211      211
3  XYZ  2023-11-12 02:00    300    211      211

[4 rows x 5 columns]
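
For anyone who wants to run this without the question's setup, here is a minimal self-contained sketch of the same call. The sample records are copied from the question; the names `meta_cols` and `flat` are just illustrative. One assumption worth flagging: in the pandas versions I have tried, the meta columns seem to come back with object dtype rather than integers, so the sketch casts them explicitly. If your version already returns integers, the cast is harmless.

```python
import pandas as pd

# Same sample records as in the question.
d = [
    {"cityId": 111, "regionId": 111, "data": [
        {"code": "ABC", "date": "2023-11-11 02:00", "value": 300},
        {"code": "ABC", "date": "2023-11-12 02:00", "value": 300},
    ]},
    {"cityId": 211, "regionId": 211, "data": [
        {"code": "XYZ", "date": "2023-11-11 02:00", "value": 300},
        {"code": "XYZ", "date": "2023-11-12 02:00", "value": 300},
    ]},
]

# Illustrative name: the parent-level keys to repeat on every nested record.
meta_cols = ["cityId", "regionId"]

# record_path points at the nested list; meta copies the parent keys onto each row.
flat = pd.json_normalize(d, record_path="data", meta=meta_cols)

# Assumption: meta columns may arrive as object dtype, so cast them to integers.
flat = flat.astype({"cityId": "int64", "regionId": "int64"})

print(flat)
print(flat.dtypes)
```

Passing `record_path` and `meta` in one call avoids the explode-then-join step entirely, so there is no index to realign afterwards.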
Timeless answered Dec 23 '25 04:12