I have a DataFrame in which one column holds dictionaries. I want to unpack that column into multiple columns (i.e. code and amount become separate columns, given the raw column format below). The following code worked with pandas v0.22, but with v0.23 it now raises a ValueError:
pd.DataFrame.from_records(df.col_name.fillna(pd.Series([{'code':'not applicable'}], index=df.index)).values.tolist())
ValueError: Length of passed values is 1, index implies x
I searched Google and Stack Overflow for hours, and none of the previously posted solutions work anymore.
Raw column format:
dict_codes
0 {'code': 'xx', 'amount': '10.00',...
1 {'code': 'yy', 'amount': '20.00'...
2 {'code': 'bb', 'amount': '30.00'...
3 {'code': 'aa', 'amount': '40.00'...
10 {'code': 'zz', 'amount': '50.00'...
11 NaN
12 NaN
13 NaN
Does anyone have any suggestions?
Thanks
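import numpy as np
import pandas as pd

# Sample frame: three dict rows followed by three missing rows
df = pd.DataFrame(dict(
    codes=[
        {'amount': 12, 'code': 'a'},
        {'amount': 19, 'code': 'x'},
        {'amount': 37, 'code': 'm'},
        np.nan,
        np.nan,
        np.nan,
    ]
))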
df
codes
0 {'amount': 12, 'code': 'a'}
1 {'amount': 19, 'code': 'x'}
2 {'amount': 37, 'code': 'm'}
3 NaN
4 NaN
5 NaN
apply with pd.Series
Make sure to dropna first.
df.codes.dropna().apply(pd.Series)
amount code
0 12 a
1 19 x
2 37 m
df.drop(columns='codes').assign(**df.codes.dropna().apply(pd.Series))
amount code
0 12.0 a
1 19.0 x
2 37.0 m
3 NaN NaN
4 NaN NaN
5 NaN NaN
tolist and from_records
Same idea, but skip the apply.
pd.DataFrame.from_records(df.codes.dropna().tolist())
amount code
0 12 a
1 19 x
2 37 m
df.drop(columns='codes').assign(**pd.DataFrame.from_records(df.codes.dropna().tolist()))
amount code
0 12.0 a
1 19.0 x
2 37.0 m
3 NaN NaN
4 NaN NaN
5 NaN NaN
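One caveat with the from_records route: it builds a fresh 0..n-1 index, so the assign step only lines up when the non-null rows carry labels 0 through n-1, as they do in this example. Below is a minimal sketch, assuming an index with gaps like the one in the question, that carries the original index labels over explicitly. The sample frame and the names valid, expanded, and result are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: a NaN in the middle and a gap in the index (2 -> 10).
df = pd.DataFrame(
    {
        "codes": [
            {"amount": 12, "code": "a"},
            {"amount": 19, "code": "x"},
            np.nan,
            {"amount": 37, "code": "m"},
        ]
    },
    index=[0, 1, 2, 10],
)

# Keep only the rows that actually hold a dict.
valid = df["codes"].dropna()

# Build the expanded frame and reuse the surviving index labels.
expanded = pd.DataFrame(valid.tolist(), index=valid.index)

# join aligns on the index, so rows without a dict end up as NaN.
result = df.drop(columns="codes").join(expanded)
print(result)
```

Passing index=valid.index is what keeps label 10 attached to its own dict. Without it, the expanded rows would be relabelled 0, 1, 2 and could land on the wrong rows.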
Setup
codes
0 {'amount': 12, 'code': 10}
1 {'amount': 3, 'code': 3}
apply with pd.Series
df.codes.apply(pd.Series)
amount code
0 12 10
1 3 3
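The code that builds this frame is not shown above. Here is a minimal reconstruction, assuming it was created from a plain list of dicts with no missing values. The constructor call is a guess; only the printed frame comes from the answer.

```python
import pandas as pd

# Assumed setup: two rows, each holding a dict with the same keys.
df = pd.DataFrame(
    {"codes": [{"amount": 12, "code": 10}, {"amount": 3, "code": 3}]}
)

# Each dict becomes a Series, so apply returns one column per key.
expanded = df["codes"].apply(pd.Series)
print(expanded)
```

Because there are no NaN rows here, no dropna is needed before the apply.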