Multi-index pivoting in Pandas

Consider the following dataframe:

         item_id  hour    when        date      quantity
110   0YrKNYeEoa     1  before  2015-01-26        247286
111   0UMNiXI7op     1  before  2015-01-26        602001
112   0QBtIMN3AH     1  before  2015-01-26        981630
113   0GuKXLiWyV     1  after   2015-01-26       2203913
114   0SoFbjvXTs     1  after   2015-01-26        660183
115   0UkT257SXj     1  before  2015-01-26        689332
116   0RPjXnkiGx     1  after   2015-01-26        283090
117   0FhJ9RGsLT     1  before  2015-01-26       2024256
118   0FhGJ4MFlg     1  before  2015-01-26         74524
119   0FQhHZRXhB     1  before  2015-01-26             0
120   0FsSdJQlTB     1  before  2015-01-26             0
121   0FrrAzTFHE     1  before  2015-01-26             0
122   0FfkgBdMHi     1  before  2015-01-26             0
123   0FOnJNexRn     1  before  2015-01-26             0
124   0FcWhIdBds     1  before  2015-01-26             0
125   0F2lr0cL9t     1  before  2015-01-26       1787659

I would like to pivot it to get the table arranged as:

Index                     before           after
(item_id, hour, date)   quantityB      quantityA

When I try with:

df.pivot(index=['item_id', 'hour', 'date'], columns='when', values='quantity')

I get:

ValueError: Wrong number of items passed 8143, placement implies 3

Why?

Amelio Vazquez-Reina asked Jan 29 '15 22:01


1 Answer

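If I understand the question correctly, you want pandas.pivot_table(...). At the time of writing, DataFrame.pivot does not accept a list of column names for index, which is why the call in the question raises the ValueError; pivot_table does accept one. Note that pivot_table aggregates duplicate entries, using the mean by default. You can use it like so: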

table = pd.pivot_table(df, index=['item_id', 'hour', 'date'], columns='when', values='quantity')

which with a sample data frame of

  item_id  hour    when        date  quantity
0       a     1  before  2015-01-26        25
1       b     1  before  2015-01-26        14
2       a     1   after  2015-01-26         4
3       d     1  before  2015-01-26        43
4       b     1   after  2015-01-26        30
5       d     1   after  2015-01-26        12

produces

when                     after  before
item_id hour date                     
a       1    2015-01-26      4      25
b       1    2015-01-26     30      14
d       1    2015-01-26     12      43
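
For anyone who wants to reproduce this, here is a minimal, self-contained sketch built from the sample data above. The `keys` and `unstacked` names and the choice of `aggfunc="sum"` are my own assumptions for illustration; the question does not say how duplicate rows should be combined, and the default for pivot_table would be the mean. The second approach (set_index followed by unstack) is an alternative that does no aggregation at all.

```python
import pandas as pd

# Sample data copied from the frame shown above.
df = pd.DataFrame({
    "item_id": ["a", "b", "a", "d", "b", "d"],
    "hour": [1, 1, 1, 1, 1, 1],
    "when": ["before", "before", "after", "before", "after", "after"],
    "date": ["2015-01-26"] * 6,
    "quantity": [25, 14, 4, 43, 30, 12],
})

# Columns that should form the row MultiIndex (hypothetical helper name).
keys = ["item_id", "hour", "date"]

# pivot_table accepts a list for `index` and aggregates duplicates.
# aggfunc="sum" is an assumed choice; omit it to get the default mean.
table = pd.pivot_table(
    df, index=keys, columns="when", values="quantity", aggfunc="sum"
)

# Alternative without aggregation: move everything into the index, then
# pivot the "when" level into columns. This raises a ValueError if any
# (item_id, hour, date, when) combination appears more than once.
unstacked = df.set_index(keys + ["when"])["quantity"].unstack("when")

print(table)
print(unstacked)
```

If the real data can contain repeated (item_id, hour, date, when) combinations, the pivot_table version is the safer of the two, as long as the aggregation function matches what you intend.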

alacy answered Sep 25 '22 12:09