Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas column values to columns?

Tags:

python

pandas

I've seen a few variations on the theme of exploding a column/series into multiple columns of a Pandas dataframe, but I've been trying to do something and not really succeeding with the existing approaches.

Given a DataFrame like so:

    key       val id 2   foo   oranges 2   bar   bananas 2   baz    apples 3   foo    grapes 3   bar     kiwis 

I want to convert the items in the key series into columns, with the val values serving as the values, like so:

        foo        bar        baz id 2   oranges    bananas     apples 3    grapes      kiwis        NaN 

I feel like this is something that should be relatively straightforward, but I've been bashing my head against this for a few hours now with increasing levels of convolution, and no success.

like image 381
Idan Gazit Avatar asked Oct 08 '14 11:10

Idan Gazit


People also ask

How do I convert one column to multiple columns in pandas?

We can use str. split() to split one column to multiple columns by specifying expand=True option. We can use str. extract() to exract multiple columns using regex expression in which multiple capturing groups are defined.

How get values from column in pandas?

You can use the loc and iloc functions to access columns in a Pandas DataFrame. Let's see how. If we wanted to access a certain column in our DataFrame, for example the Grades column, we could simply use the loc function and specify the name of the column in order to retrieve it.

How do I assign a value to a column in pandas DataFrame?

Pandas DataFrame: assign() functionThe assign() function is used to assign new columns to a DataFrame. Returns a new object with all original columns in addition to new ones. Existing columns that are re-assigned will be overwritten. The column names are keywords.


2 Answers

There are a few ways:

using .pivot_table:

>>> df.pivot_table(values='val', index=df.index, columns='key', aggfunc='first') key      bar     baz      foo id                            2    bananas  apples  oranges 3      kiwis     NaN   grapes 

using .pivot:

>>> df.pivot(index=df.index, columns='key')['val'] key      bar     baz      foo id                            2    bananas  apples  oranges 3      kiwis     NaN   grapes 

using .groupby followed by .unstack:

>>> df.reset_index().groupby(['id', 'key'])['val'].aggregate('first').unstack() key      bar     baz      foo id                            2    bananas  apples  oranges 3      kiwis     NaN   grapes 
like image 114
behzad.nouri Avatar answered Sep 20 '22 03:09

behzad.nouri


You could use set_index and unstack

In [1923]: df.set_index([df.index, 'key'])['val'].unstack() Out[1923]: key      bar     baz      foo id 2    bananas  apples  oranges 3      kiwis    None   grapes 

Or, a simplified groupby

In [1926]: df.groupby([df.index, 'key'])['val'].first().unstack() Out[1926]: key      bar     baz      foo id 2    bananas  apples  oranges 3      kiwis    None   grapes 
like image 34
Zero Avatar answered Sep 19 '22 03:09

Zero