How to pivot multilabel table in pandas

I am trying to import data on the price changes of different items. The data is stored in MySQL. I have imported it into a DataFrame df in a stacked (long) format similar to the following:

 ID    Type      Date      Price1   Price2
0001    A     2001-09-20    30       301
0002    A     2001-09-21    31       278
0003    A     2001-09-22    28       299
0004    B     2001-09-18    18       159
0005    B     2001-09-20    21       157
0006    B     2001-09-21    21       162
0007    C     2001-09-19    58       326
0008    C     2001-09-20    61       410
0009    C     2001-09-21    67       383

To perform time series analysis, I want to convert it to a wide format similar to:

                 A               B               C
            Price1  Price2  Price1  Price2  Price1  Price2
Date
2001-09-18    NULL    NULL      18     159    NULL    NULL
2001-09-19    NULL    NULL    NULL    NULL      58     326
2001-09-20      30     301      21     157      61     410
2001-09-21      31     278      21     162      67     383
2001-09-22      28     299    NULL    NULL    NULL    NULL

I have looked at this question, but neither of the suggested approaches achieves what I want. The pandas documentation on pivot doesn't seem to cover this case either.

krismath asked Feb 25 '18 07:02


1 Answer

You can reshape with pivot, or with set_index followed by unstack. Either way, swaplevel and sort_index are then needed to get the expected MultiIndex in the columns, with Type as the top level:

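# Option 1: pivot (index and columns must be passed as keywords in pandas 2.0+)
df1 = (df.drop('ID', axis=1)
         .pivot(index='Date', columns='Type')
         .swaplevel(0, 1, axis=1)
         .sort_index(axis=1))

# Option 2: set_index + unstack, which moves Type from the row index into the columns
df1 = (df.drop('ID', axis=1)
         .set_index(['Date','Type'])
         .unstack()
         .swaplevel(0, 1, axis=1)
         .sort_index(axis=1))

# Option 3: as option 2, but selecting the price columns instead of dropping ID
df1 = (df.set_index(['Date','Type'])[['Price1','Price2']]
         .unstack()
         .swaplevel(0, 1, axis=1)
         .sort_index(axis=1))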

print(df1)
Type            A             B             C       
           Price1 Price2 Price1 Price2 Price1 Price2
Date                                                
2001-09-18    NaN    NaN   18.0  159.0    NaN    NaN
2001-09-19    NaN    NaN    NaN    NaN   58.0  326.0
2001-09-20   30.0  301.0   21.0  157.0   61.0  410.0
2001-09-21   31.0  278.0   21.0  162.0   67.0  383.0
2001-09-22   28.0  299.0    NaN    NaN    NaN    NaN
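
For anyone who wants to try this without a database, here is a minimal self-contained sketch. It is not part of the original data pipeline: the DataFrame is rebuilt by hand from the sample rows in the question (the real data comes from MySQL), and parsing Date with pd.to_datetime is an added assumption so that the result gets a DatetimeIndex suitable for time series work. The names type_a and price1 are hypothetical and only illustrate how to slice the resulting MultiIndex columns.

```python
import pandas as pd

# Sample rows copied by hand from the question (assumption: stands in for the MySQL import).
df = pd.DataFrame({
    "ID": ["0001", "0002", "0003", "0004", "0005", "0006", "0007", "0008", "0009"],
    "Type": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "Date": ["2001-09-20", "2001-09-21", "2001-09-22",
             "2001-09-18", "2001-09-20", "2001-09-21",
             "2001-09-19", "2001-09-20", "2001-09-21"],
    "Price1": [30, 31, 28, 18, 21, 21, 58, 61, 67],
    "Price2": [301, 278, 299, 159, 157, 162, 326, 410, 383],
})

# Assumption: convert the date strings so the pivoted frame has a DatetimeIndex.
df["Date"] = pd.to_datetime(df["Date"])

# Pivot to wide format, then put Type on the top column level and sort the columns.
df1 = (df.drop(columns="ID")
         .pivot(index="Date", columns="Type")
         .swaplevel(0, 1, axis=1)
         .sort_index(axis=1))

# Hypothetical follow-up selections on the MultiIndex columns:
type_a = df1["A"]                            # both price columns for Type A
price1 = df1.xs("Price1", axis=1, level=1)   # Price1 for every Type

print(df1)
```

Missing (Date, Type) combinations come out as NaN, which is why the integer prices are shown as floats in the output. Note that pivot raises a ValueError if the same (Date, Type) pair occurs more than once; in that case pivot_table with an explicit aggfunc is the usual alternative.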
jezrael answered Sep 22 '22 12:09