
pandas pivot table using index data of dataframe

I want to create a pivot table from a pandas DataFrame using DataFrame.pivot() that includes not only the DataFrame's columns but also the data in its index. I couldn't find any docs that show how to do that. Any tips?

user3276418 Avatar asked Feb 08 '14 13:02




1 Answer

Use reset_index to make the index a column:

In [45]: df = pd.DataFrame({'x': [1, 2, 2, 3, 1, 3], 'y': [0, 1, 2, 3, 4, 4]}, index=np.arange(6)*10)

In [46]: df
Out[46]: 
    x  y
0   1  0
10  2  1
20  2  2
30  3  3
40  1  4
50  3  4

In [47]: df.reset_index()
Out[47]: 
   index  x  y
0      0  1  0
1     10  2  1
2     20  2  2
3     30  3  3
4     40  1  4
5     50  3  4

After reset_index, the old index is an ordinary column named index, so pivot can use it as the values:

In [48]: df.reset_index().pivot(index='y', columns='x')
Out[48]: 
   index        
x      1   2   3
y               
0      0 NaN NaN
1    NaN  10 NaN
2    NaN  20 NaN
3    NaN NaN  30
4     40 NaN  50    
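
As a self-contained variation on the session above, here is a minimal sketch that names the index before resetting it and passes values= explicitly, so the result has flat column labels rather than a two-level column header. The name "idx" and the variables flat and wide are hypothetical, chosen only for this example; the data is the same as above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"x": [1, 2, 2, 3, 1, 3], "y": [0, 1, 2, 3, 4, 4]},
    index=np.arange(6) * 10,
)

# Give the index a name so reset_index() creates a column called "idx"
# instead of the default "index" ("idx" is a hypothetical name).
flat = df.rename_axis("idx").reset_index()

# With values= given explicitly, the columns are just the x values
# rather than a MultiIndex of (value column, x).
wide = flat.pivot(index="y", columns="x", values="idx")
print(wide)
```

Cells with no matching (y, x) pair come out as NaN, as in the output above. Note that pivot raises a ValueError if any (y, x) pair occurs more than once.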
unutbu Avatar answered Oct 08 '22 22:10