I have a Pandas DataFrame where one column is a Series of dicts, like this:
colA colB colC
0 7 7 {'foo': 185, 'bar': 182, 'baz': 148}
1 2 8 {'foo': 117, 'bar': 103, 'baz': 155}
2 5 10 {'foo': 165, 'bar': 184, 'baz': 170}
3 3 2 {'foo': 121, 'bar': 151, 'baz': 187}
4 5 5 {'foo': 137, 'bar': 199, 'baz': 108}
I want the foo, bar and baz key-value pairs from the dicts to be columns in my dataframe, such that I end up with this:
colA colB foo bar baz
0 7 7 185 182 148
1 2 8 117 103 155
2 5 10 165 184 170
3 3 2 121 151 187
4 5 5 137 199 108
How do I do that?
df = df.drop('colC', axis=1).join(pd.DataFrame(df.colC.values.tolist()))
We start by importing Pandas and defining the DataFrame to work with:
import pandas as pd

df = pd.DataFrame(
    {
        'colA': {0: 7, 1: 2, 2: 5, 3: 3, 4: 5},
        'colB': {0: 7, 1: 8, 2: 10, 3: 2, 4: 5},
        'colC': {
            0: {'foo': 185, 'bar': 182, 'baz': 148},
            1: {'foo': 117, 'bar': 103, 'baz': 155},
            2: {'foo': 165, 'bar': 184, 'baz': 170},
            3: {'foo': 121, 'bar': 151, 'baz': 187},
            4: {'foo': 137, 'bar': 199, 'baz': 108},
        },
    }
)
The column colC is a pd.Series of dicts, and we can turn it into a pd.DataFrame in which each dict becomes one row and each key becomes a column:
pd.DataFrame(df.colC.values.tolist())
# df.colC.apply(pd.Series)  # this also works, but it is slow
which gives the pd.DataFrame:
foo bar baz
0 185 182 148
1 117 103 155
2 165 184 170
3 121 151 187
4 137 199 108
So all we need to do is:
1. Turn colC into a pd.DataFrame
2. Remove the original colC from df
3. Join the converted colC with df
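As a rough sketch, the three steps can be written out one at a time before collapsing them into a single expression. The names expanded, rest and result are only illustrative, and the frame is cut down to two rows to keep it short:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "colA": [7, 2],
        "colB": [7, 8],
        "colC": [
            {"foo": 185, "bar": 182, "baz": 148},
            {"foo": 117, "bar": 103, "baz": 155},
        ],
    }
)

# Step 1: turn the column of dicts into its own DataFrame.
expanded = pd.DataFrame(df["colC"].tolist())

# Step 2: remove the original dict column from df.
rest = df.drop(columns="colC")

# Step 3: join the two frames on their shared (default) index.
result = rest.join(expanded)
print(result)
```

Keeping the intermediate frames around makes it easy to inspect each step when something does not line up.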
That can be done in a one-liner:
df = df.drop('colC', axis=1).join(pd.DataFrame(df.colC.values.tolist()))
With the contents of df now being the pd.DataFrame:
colA colB foo bar baz
0 7 7 185 182 148
1 2 8 117 103 155
2 5 10 165 184 170
3 3 2 121 151 187
4 5 5 137 199 108
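One caveat worth sketching: join aligns rows by index label, and a DataFrame built from a plain list gets a fresh 0..n-1 index. The example in this answer uses the default index, so that matches. If your df has a different index (the labels 10 and 20 below are a made-up example), passing index=df.index to the constructor should keep the rows aligned:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "colA": [7, 2],
        "colC": [{"foo": 185, "bar": 182}, {"foo": 117, "bar": 103}],
    },
    index=[10, 20],  # hypothetical non-default index
)

# Reuse df's index so the expanded rows carry the same labels as df.
expanded = pd.DataFrame(df["colC"].tolist(), index=df.index)
result = df.drop(columns="colC").join(expanded)

# Without index=..., the new frame is labelled 0 and 1, which do not
# match 10 and 20, so the join fills the new columns with NaN.
misaligned = df.drop(columns="colC").join(pd.DataFrame(df["colC"].tolist()))

print(result)
print(misaligned)
```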