
How to pivot a dataframe in Pandas? [duplicate]

I have a table in CSV format that looks like this. I would like to transpose the table so that the values in the Indicator column become the new columns.

Indicator       Country         Year    Value
1               Angola          2005    6
2               Angola          2005    13
3               Angola          2005    10
4               Angola          2005    11
5               Angola          2005    5
1               Angola          2006    3
2               Angola          2006    2
3               Angola          2006    7
4               Angola          2006    3
5               Angola          2006    6

I would like the end result to look like this:

Country    Year     1     2     3     4     5
Angola     2005     6     13    10    11    5
Angola     2006     3     2     7     3     6

I have tried using a pandas data frame with not much success.

print(df.pivot(columns = 'Country', 'Year', 'Indicator', values = 'Value')) 

Any thoughts on how to accomplish this?

bjurstrs asked Feb 05 '15 05:02



2 Answers

You can use pivot_table:

pd.pivot_table(df, values = 'Value', index=['Country','Year'], columns = 'Indicator').reset_index() 

This outputs:

Indicator  Country  Year   1   2   3   4   5
0          Angola   2005   6  13  10  11   5
1          Angola   2006   3   2   7   3   6
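
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question by hand and applies the same pivot_table call. The names df and wide are only illustrative, and the last step (clearing the column-axis name) is an optional extra that is not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question (long format: one row per indicator).
df = pd.DataFrame({
    "Indicator": [1, 2, 3, 4, 5] * 2,
    "Country": ["Angola"] * 10,
    "Year": [2005] * 5 + [2006] * 5,
    "Value": [6, 13, 10, 11, 5, 3, 2, 7, 3, 6],
})

# Rows are identified by (Country, Year); each Indicator value becomes a column.
wide = pd.pivot_table(
    df, values="Value", index=["Country", "Year"], columns="Indicator"
).reset_index()

# Optional: drop the leftover "Indicator" label on the column axis.
wide.columns.name = None
print(wide)
```

Note that pivot_table aggregates with the mean by default, so if the real data had several rows for the same Country, Year and Indicator, they would be averaged rather than rejected.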
JAB answered Oct 05 '22 07:10


This is a guess: it's not a ".csv" file, but a Pandas DataFrame imported from a '.csv'.

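To pivot this table, pass three arguments to the pandas pivot method. For example, if df is your DataFrame:

# (Country, Year) labels each output row; a list-valued index assumes pandas 1.1 or later.
table = df.pivot(index=['Country', 'Year'], columns='Indicator', values='Value')
print(table)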

This should give the desired output.
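
One thing worth checking with pivot is which columns go where, because pivot does no aggregation and needs every (index, columns) pair to be unique. The sketch below, using the question's sample data rebuilt by hand, shows a combination that satisfies that rule and one that does not. The variable names are illustrative, and passing a list as index is assumed to need pandas 1.1 or later.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Indicator": [1, 2, 3, 4, 5] * 2,
    "Country": ["Angola"] * 10,
    "Year": [2005] * 5 + [2006] * 5,
    "Value": [6, 13, 10, 11, 5, 3, 2, 7, 3, 6],
})

# Each (Country, Year, Indicator) combination occurs once, so this reshapes cleanly.
table = df.pivot(index=["Country", "Year"], columns="Indicator", values="Value")
print(table.reset_index())

# With Country as rows and Year as columns, every pair repeats five times
# (once per indicator), so pivot raises instead of choosing a value.
try:
    df.pivot(index="Country", columns="Year", values="Value")
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```

If the data can contain genuine duplicates, pivot_table is the safer choice because it aggregates them instead of raising.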

Jason Sprong answered Oct 05 '22 06:10