I've seen a few variations on the theme of exploding a column/series into multiple columns of a Pandas dataframe, but I've been trying to do something and not really succeeding with the existing approaches.
Given a DataFrame like so:
    key       val id 2   foo   oranges 2   bar   bananas 2   baz    apples 3   foo    grapes 3   bar     kiwis   I want to convert the items in the key series into columns, with the val values serving as the values, like so:
        foo        bar        baz id 2   oranges    bananas     apples 3    grapes      kiwis        NaN   I feel like this is something that should be relatively straightforward, but I've been bashing my head against this for a few hours now with increasing levels of convolution, and no success.
We can use str. split() to split one column to multiple columns by specifying expand=True option. We can use str. extract() to exract multiple columns using regex expression in which multiple capturing groups are defined.
You can use the loc and iloc functions to access columns in a Pandas DataFrame. Let's see how. If we wanted to access a certain column in our DataFrame, for example the Grades column, we could simply use the loc function and specify the name of the column in order to retrieve it.
Pandas DataFrame: assign() functionThe assign() function is used to assign new columns to a DataFrame. Returns a new object with all original columns in addition to new ones. Existing columns that are re-assigned will be overwritten. The column names are keywords.
There are a few ways:
using .pivot_table:
>>> df.pivot_table(values='val', index=df.index, columns='key', aggfunc='first') key      bar     baz      foo id                            2    bananas  apples  oranges 3      kiwis     NaN   grapes   using .pivot:
>>> df.pivot(index=df.index, columns='key')['val'] key      bar     baz      foo id                            2    bananas  apples  oranges 3      kiwis     NaN   grapes   using .groupby followed by .unstack:
>>> df.reset_index().groupby(['id', 'key'])['val'].aggregate('first').unstack() key      bar     baz      foo id                            2    bananas  apples  oranges 3      kiwis     NaN   grapes 
                        You could use set_index and unstack
In [1923]: df.set_index([df.index, 'key'])['val'].unstack() Out[1923]: key      bar     baz      foo id 2    bananas  apples  oranges 3      kiwis    None   grapes   Or, a simplified groupby
In [1926]: df.groupby([df.index, 'key'])['val'].first().unstack() Out[1926]: key      bar     baz      foo id 2    bananas  apples  oranges 3      kiwis    None   grapes 
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With