
Unpivot Pandas Data

I currently have a DataFrame laid out as:

            Jan Feb Mar Apr ...
    2001    1   12  12  19
    2002    9   ...
    2003    ...

and I would like to "unpivot" the data to look like:

    Date        Value
    Jan 2001    1
    Feb 2001    1
    Mar 2001    12
    ...
    Jan 2002    9

What is the best way to accomplish this using pandas/NumPy?

Alex Rothberg asked Aug 15 '13 18:08



People also ask

How do I Unpivot a pandas DataFrame?

In pandas, you can use the melt() function to unpivot a DataFrame, converting it from a wide format to a long format. The basic syntax is: df_unpivot = pd.melt(df, id_vars='col1', value_vars=['col2', 'col3', ...])
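As a minimal sketch of the melt() call described above, the snippet below unpivots a small wide table. The frame and its column names (year, Jan, Feb, month, value) are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical wide table: one row per year, one column per month.
wide = pd.DataFrame({"year": [2001, 2002], "Jan": [3, 2], "Feb": [4, 7]})

# id_vars is kept as an identifier column; each value_vars column is
# stacked into rows, with its label stored in the var_name column.
long_df = pd.melt(
    wide,
    id_vars="year",
    value_vars=["Jan", "Feb"],
    var_name="month",
    value_name="value",
)
print(long_df)
```

Each (year, month) pair becomes one row, so a frame with 2 rows and 2 month columns yields 4 rows.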

How do you Unmelt data in Python?

You can use the pivot() function to unmelt a DataFrame and recover the original layout. Pass the column that was used as 'id_vars' as the 'index' argument, and the name of the 'variable' column as the 'columns' argument. The unmelted DataFrame holds the same values as the original DataFrame.
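A small round trip may make this concrete. The sketch below melts a hypothetical frame and then pivots it back; the column names are illustrative assumptions. Note that pivot() sorts the new column labels and turns the id column into the index, so the result matches the original values but not necessarily its exact layout.

```python
import pandas as pd

# Hypothetical wide table, melted into long form first.
wide = pd.DataFrame({"year": [2001, 2002], "Jan": [3, 2], "Feb": [4, 7]})
long_df = wide.melt(id_vars="year", var_name="month", value_name="value")

# index= matches melt's id_vars; columns= is the melted label column.
restored = long_df.pivot(index="year", columns="month", values="value")

# pivot sorts the column labels (Feb before Jan), so restore the
# original column order before comparing.
restored = restored[["Jan", "Feb"]]
print(restored)
```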

What does pandas melt do?

pd.melt lets you 'unpivot' data from a 'wide format' into a 'long format'. For example, it can take 'wide format' economic data in which each column represents a year and turn it into 'long format' data in which each row represents a single data point.

What does it mean to Unpivot data?

An unpivot transformation is one way to transform data from a short/wide to a tall/skinny format. When the data types of source columns differ, the varying data is converted to a common data type so the source data can be part of one single column in the new data set.


1 Answer

Call df.unstack(): it returns a MultiIndexed Series with the month as the first index level and the year as the second. If you want those levels as columns, call reset_index() afterwards.

    >>> df
          Jan  Feb
    2001    3    4
    2002    2    7
    >>> df.unstack()
    Jan  2001    3
         2002    2
    Feb  2001    4
         2002    7
    >>> df = df.unstack().reset_index(name='value')
    >>> df
      level_0  level_1  value
    0     Jan     2001      3
    1     Jan     2002      2
    2     Feb     2001      4
    3     Feb     2002      7
    >>> df.rename(columns={'level_0': 'month', 'level_1': 'year'}, inplace=True)
    >>> df
      month  year  value
    0   Jan  2001      3
    1   Jan  2002      2
    2   Feb  2001      4
    3   Feb  2002      7
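The question asks for a single Date column ("Jan 2001") with rows ordered year by year, which differs slightly from the month-first output above. One possible variation, offered as a sketch and not as the answerer's method, swaps unstack() for stack() so the rows come out in year order and then joins month and year into one string. The toy frame matches the answer's example; the intermediate column names are assumptions.

```python
import pandas as pd

# Same toy frame as in the answer: years as the index, months as columns.
df = pd.DataFrame({"Jan": [3, 2], "Feb": [4, 7]}, index=[2001, 2002])

# stack() moves the month columns into an inner index level, so the
# rows are ordered by year first, then by month in column order.
long_df = df.stack().reset_index(name="Value")
long_df.columns = ["year", "month", "Value"]

# Combine month and year into the "Jan 2001" style label from the question.
long_df["Date"] = long_df["month"] + " " + long_df["year"].astype(str)
long_df = long_df[["Date", "Value"]]
print(long_df)
```

The Date values here are plain strings; converting them to real datetimes would be a separate step.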
Viktor Kerkez answered Sep 24 '22 10:09
