I have a Pandas dataframe that looks like the below:
   codes
1  [71020]
2  [77085]
3  [36415]
4  [99213, 99287]
5  [99233, 99233, 99233]
I'm trying to split the lists in df['codes'] into columns, like the below:
   code_1  code_2  code_3
1   71020
2   77085
3   36415
4   99213   99287
5   99233   99233   99233
where columns that don't have a value (because the list was not that long) are filled with blanks or NaNs or something.
I've seen answers like this one and others similar to it, and while they work on lists of equal length, they all throw errors when I try to use them on lists of unequal length. Is there a good way to do this?
In Pandas, the apply() method can also be used to split one column's values into multiple columns. DataFrame.apply() runs a function along the columns or rows of a DataFrame, and inside that function we can split each string value into multiple values.
Pandas also provides the str.split() method, which splits strings around a given separator/delimiter. The result can be kept as a Series of lists, or it can be used to create a multi-column DataFrame from a single delimited string.
For example, to split a string column such as "Name", select the column and call str.split on it with the expand=True option, which returns one column per piece.
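As a rough illustration of the str.split description above, here is a minimal sketch. The "Name" column, its values, and the output column names are made up for the example. Note that str.split operates on strings, so it does not apply directly to a column that already holds lists, like the one in the question.

```python
import pandas as pd

# Hypothetical data: the "Name" column and its values are invented for illustration.
people = pd.DataFrame({"Name": ["Ada Lovelace", "Alan Turing", "Cher"]})

# expand=True returns a DataFrame with one column per piece;
# rows with fewer pieces are padded with missing values.
parts = people["Name"].str.split(" ", expand=True)
parts.columns = ["first", "last"]  # hypothetical column names
print(parts)
```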
Try:
pd.DataFrame(df.codes.values.tolist()).add_prefix('code_')

   code_0   code_1   code_2
0   71020      NaN      NaN
1   77085      NaN      NaN
2   36415      NaN      NaN
3   99213  99287.0      NaN
4   99233  99233.0  99233.0
Include the index
pd.DataFrame(df.codes.values.tolist(), df.index).add_prefix('code_')

   code_0   code_1   code_2
1   71020      NaN      NaN
2   77085      NaN      NaN
3   36415      NaN      NaN
4   99213  99287.0      NaN
5   99233  99233.0  99233.0
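For anyone who wants to try this end to end, the following is a self-contained sketch that rebuilds the frame from the question and applies the same constructor-based approach. The variable name wide is just a placeholder chosen for this example.

```python
import pandas as pd

# Rebuild the example frame from the question (index 1..5, lists of unequal length).
df = pd.DataFrame(
    {"codes": [[71020], [77085], [36415], [99213, 99287], [99233, 99233, 99233]]},
    index=[1, 2, 3, 4, 5],
)

# The DataFrame constructor pads the shorter lists with NaN.
# Passing df.index keeps the original row labels.
wide = pd.DataFrame(df["codes"].tolist(), index=df.index).add_prefix("code_")
print(wide)
```

Because NaN is a float, any column that contains a missing value is stored as float64, which is why values such as 99287.0 appear in the output.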
We can nail down all the formatting with this:
f = lambda x: 'code_{}'.format(x + 1)
pd.DataFrame(
    df.codes.values.tolist(),
    df.index, dtype=object
).fillna('').rename(columns=f)

  code_1 code_2 code_3
1  71020
2  77085
3  36415
4  99213  99287
5  99233  99233  99233
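If blank strings are not required and the goal is only to avoid the float formatting, one possible alternative (not part of the answer above) is the nullable "Int64" dtype, assuming a pandas version that supports it (0.24 or later). A minimal sketch, with wide as a placeholder name:

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {"codes": [[71020], [77085], [36415], [99213, 99287], [99233, 99233, 99233]]},
    index=[1, 2, 3, 4, 5],
)

# Build the padded frame, then cast to the nullable integer dtype so that
# the codes stay integers and the gaps become <NA> instead of NaN.
wide = pd.DataFrame(df["codes"].tolist(), index=df.index).astype("Int64")

# Rename the columns 0, 1, 2 to code_1, code_2, code_3.
wide.columns = ["code_{}".format(i + 1) for i in range(wide.shape[1])]
print(wide)
```

The trade-off is that the missing cells print as <NA> rather than as empty strings, but the columns remain numeric.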
Another solution:
In [95]: df.codes.apply(pd.Series).add_prefix('code_')
Out[95]:
    code_0   code_1   code_2
1  71020.0      NaN      NaN
2  77085.0      NaN      NaN
3  36415.0      NaN      NaN
4  99213.0  99287.0      NaN
5  99233.0  99233.0  99233.0