Pandas: Unstacking One Column of a DataFrame

Tags: python, pandas

I want to unstack one column in my pandas DataFrame. The DataFrame is indexed by 'Date', and I want to unstack the 'Country' column so that each country becomes its own column. The current DataFrame looks like this:

             Country   Product      Flow Unit  Quantity  
Date                                                         
2002-01-31   FINLAND  KEROSENE  TOTEXPSB  KBD    3.8129     
2002-01-31    TURKEY  KEROSENE  TOTEXPSB  KBD    0.2542     
2002-01-31  AUSTRALI  KEROSENE  TOTEXPSB  KBD   12.2787     
2002-01-31    CANADA  KEROSENE  TOTEXPSB  KBD    5.1161     
2002-01-31        UK  KEROSENE  TOTEXPSB  KBD   12.2013     

When I use df.pivot I get the following error: "ReshapeError: Index contains duplicate entries, cannot reshape". This makes sense, since every country reports on the same dates, so each date appears several times in the index. What I would like is to unstack the 'Country' column so that only one row per Date is shown for each month.

The DataFrame headers would look like this, with Date still as the index:

Date        FINLAND TURKEY  AUSTRALI  CANADA Flow      Unit

2002-01-31  3.8129  0.2542  12.2787   5.1161 TOTEXPSB   KBD

I have worked on this for a while and I'm not getting anywhere, so any direction or insight would be great.

Also, note that you are only seeing the head of the DataFrame; there are years of data in this format.

Thanks,

Douglas

asked Feb 04 '14 13:02 by user3055920



1 Answer

If you can drop Product, Unit, and Flow then it should be as easy as

df.reset_index().pivot(columns='Country', index='Date', values='Quantity')

to give

Country     AUSTRALI  CANADA  FINLAND  TURKEY       UK
Date
2002-01-31   12.2787  5.1161   3.8129  0.2542  12.2013
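
If you would rather keep Flow and Unit, as in the layout the question asks for, one possible approach is to move those columns into the index next to Date and unstack only 'Country'. The following is a minimal sketch, not a tested solution for the full dataset: the frame is rebuilt by hand from the five rows shown in the question, and the names `keys`, `wide`, and `wide_keep` are made up for illustration. It assumes each combination of Date, Product, Flow, Unit, and Country occurs only once; if that does not hold, unstack will raise the same duplicate-entries error.

```python
import pandas as pd

# Rebuild the sample from the question (only the five rows shown there).
df = pd.DataFrame(
    {
        "Country": ["FINLAND", "TURKEY", "AUSTRALI", "CANADA", "UK"],
        "Product": ["KEROSENE"] * 5,
        "Flow": ["TOTEXPSB"] * 5,
        "Unit": ["KBD"] * 5,
        "Quantity": [3.8129, 0.2542, 12.2787, 5.1161, 12.2013],
    },
    index=pd.DatetimeIndex(["2002-01-31"] * 5, name="Date"),
)

# Variant from the answer: keep only Quantity, one column per country.
wide = df.reset_index().pivot(index="Date", columns="Country", values="Quantity")

# Variant that keeps the descriptive columns (assumption: these columns,
# together with Date and Country, identify each row uniquely).
keys = ["Product", "Flow", "Unit"]
wide_keep = (
    df.set_index(keys + ["Country"], append=True)["Quantity"]  # index: Date + keys + Country
      .unstack("Country")                                      # countries become columns
      .reset_index(level=keys)                                 # keys back to columns, Date stays the index
)

print(wide)
print(wide_keep)
```

Here `wide_keep` has one row per Date and key combination, with Product, Flow, and Unit as ordinary columns followed by one column per country. Drop "Product" from `keys` only if it is not needed to tell rows apart.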
answered Oct 05 '22 13:10 by jmz