pandas pivot table value as column or index

Tags:

pandas

How can I use the same column as used in 'values' for 'column' or 'index'?

For example:

pd.pivot_table(data, values='Survived', index=['Survived', 'Sex', 'Pclass'],
               aggfunc=len, margins=True)

Here values and index both use the same column, Survived. When I try to run the above I get

ValueError: Grouper for 'Survived' not 1-dimensional

However, if instead of values='Survived' I use another column, the pivot_table works fine.

893

asked Mar 14 '16 21:03

tadalendas

1 Answer

One issue I'm seeing is that you haven't set the columns argument when calling pivot_table (which tells pandas what values to use as the column headers for the pivot_table output).

A pivot table operation is actually a succession of groupby -> aggregate -> unstack. Say you have this DataFrame:

    survived sex pclass  other
0      False   f      a     29
1       True   f      b      6
2       True   f      b     22
3      False   m      b     55
4      False   f      a     59
..       ...  ..    ...    ...
95     False   f      a     66
96     False   f      c     42
97      True   m      c     93
98      True   m      c     59
99     False   f      b     93

You can pivot this table using pivot_table:

pd.pivot_table(df, index='sex', columns='pclass', values='other', aggfunc='sum')

pclass     a    b     c
sex                    
f       1000  840   306
m        728  851  1247

Or you can get the same result using groupby and unstack:

df.groupby(['sex', 'pclass'])['other'].sum().unstack()

pclass     a    b     c
sex                    
f       1000  840   306
m        728  851  1247
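To check the equivalence on something small, here is a minimal sketch. The six-row DataFrame is made up for illustration (it is not the 100-row table above); only the column names follow the example.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the example above.
df = pd.DataFrame({
    "sex":    ["f", "f", "m", "m", "f", "m"],
    "pclass": ["a", "b", "a", "b", "a", "b"],
    "other":  [10, 20, 30, 40, 50, 60],
})

# Route 1: pivot_table does the grouping, aggregation and reshaping in one call.
pivoted = pd.pivot_table(df, index="sex", columns="pclass",
                         values="other", aggfunc="sum")

# Route 2: the same steps spelled out by hand.
grouped = df.groupby(["sex", "pclass"])["other"].sum().unstack()

print(pivoted)
print(grouped)
# Compare the cell values of the two results.
print((pivoted.to_numpy() == grouped.to_numpy()).all())
```

Both routes group the rows by sex and pclass, sum other within each group, and move pclass from the index into the column headers.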

The point of this short story is that pivot tables are actually groupby operations. In your case, you're trying to group by ['Survived', 'Sex', 'Pclass'] and then aggregate 'Survived' again using len. That doesn't make much sense, since 'Survived' is already part of the output table's index (which is why pivot_table gives you an error).

You can, if you really want to make this work, use groupby instead:

df.groupby(['survived', 'sex', 'pclass'])['survived'].apply(len).unstack()

However, I think you actually want to achieve something else, not sure what though.
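If the goal is simply to count rows per (Survived, Sex, Pclass) combination, one possible sketch is below. The data is a made-up stand-in for the Titanic table in the question, and the helper column name n is an assumption introduced here, not something from the original post. The second option relies on the question's own observation that pivot_table works when values is a different column.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data in the question.
data = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Sex":      ["f", "f", "m", "m", "f", "m"],
    "Pclass":   [1, 1, 2, 2, 1, 2],
})

keys = ["Survived", "Sex", "Pclass"]

# Option 1: count rows per group directly; no values column is needed.
counts = data.groupby(keys).size()

# Option 2: aggregate a helper column of ones (the name "n" is made up here),
# so that no column appears in both `values` and `index`.
table = pd.pivot_table(data.assign(n=1), values="n", index=keys,
                       aggfunc="sum", margins=True)

print(counts)
print(table)
```

With margins=True, the second option should also append an "All" row holding the total row count, which seems to be what the original call was after.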

160

answered Nov 15 '22 11:11

hbot
