Create a contingency table in pandas with counts and percentages

Tags: python, pandas, pivot-table, crosstab

Is there a better way to create a contingency table in pandas, using pd.crosstab() or pd.pivot_table(), that shows both counts and percentages?

Current solution

import pandas as pd

cat = ['A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'B']
target = [True, False, False, False, True, True, False, True, True, True]

df = pd.DataFrame({'cat': cat, 'target': target})

using crosstab

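# Counts per category; margins=True adds the 'All' totals
totals = pd.crosstab(df['cat'], df['target'], margins=True).reset_index()
# Row-wise proportions: each row divided by its own sum
percentages = pd.crosstab(df['cat'], df['target']).apply(
    lambda row: row / row.sum(), axis=1).reset_index()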

and a merge

summaryTable=pd.merge(totals,percentages,on="cat")
# crosstab sorts the column labels, so both halves arrive in the order False, True
summaryTable.columns = ['cat', '#False', '#True', 'All',
                        'percentFalse', 'percentTrue']

Output:

+---+-----+--------+-------+-----+--------------+-------------+
|   | cat | #False | #True | All | percentFalse | percentTrue |
+---+-----+--------+-------+-----+--------------+-------------+
| 0 | A   |      2 |     2 |   4 | 0.500000     | 0.500000    |
| 1 | B   |      2 |     4 |   6 | 0.333333     | 0.666667    |
+---+-----+--------+-------+-----+--------------+-------------+

681

asked Mar 16 '16 18:03

iboboboru

1 Answer

You can do the following:

In [131]: s = df.groupby('cat').agg({'target': ['sum', 'count']}).reset_index(level=0)

In [132]: s.columns
Out[132]:
MultiIndex(levels=[['target', 'cat'], ['sum', 'count', '']],
           labels=[[1, 0, 0], [2, 0, 1]])

Let's flatten the column names:

In [133]: s.columns = [col[1] if col[1] else col[0] for col in s.columns.tolist()]

In [134]: s
Out[134]:
  cat  sum  count
0   A  2.0      4
1   B  4.0      6
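
On newer pandas versions (0.25 and later, as far as I know), named aggregation may let you skip the MultiIndex flattening step entirely. A minimal sketch, assuming the same df as in the question; the names summary, n_true, n, pct_true and pct_false are illustrative and not part of the original answer:

```python
import pandas as pd

cat = ['A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'B']
target = [True, False, False, False, True, True, False, True, True, True]
df = pd.DataFrame({'cat': cat, 'target': target})

# Named aggregation produces flat column names directly,
# so there is no MultiIndex to clean up afterwards.
summary = (
    df.groupby('cat')
      .agg(n_true=('target', 'sum'),      # True counts as 1 when summed
           n=('target', 'count'),         # rows per category
           pct_true=('target', 'mean'))   # mean of booleans = share of True
      .reset_index()
)
summary['pct_false'] = 1 - summary['pct_true']

print(summary)
```

The mean of a boolean column is the fraction of True values, which is why pct_true needs no explicit division here.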

In [135]: s['pctTrue'] = s['sum']/s['count']

In [136]: s['pctFalse'] = 1 - s.pctTrue

In [137]: s
Out[137]:
  cat  sum  count   pctTrue  pctFalse
0   A  2.0      4  0.500000  0.500000
1   B  4.0      6  0.666667  0.333333
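
If you would rather stay with pd.crosstab(), its normalize='index' argument can compute the row-wise proportions without a lambda, and a join on the shared index can stand in for the merge. This is only a sketch under the assumption that both False and True occur in the data (the hard-coded column labels rely on that); the names counts, pcts and summary are my own:

```python
import pandas as pd

cat = ['A', 'B', 'B', 'A', 'B', 'B', 'A', 'A', 'B', 'B']
target = [True, False, False, False, True, True, False, True, True, True]
df = pd.DataFrame({'cat': cat, 'target': target})

# Raw counts; crosstab sorts the column labels, so the order is False, True.
counts = pd.crosstab(df['cat'], df['target'])
counts.columns = ['#False', '#True']
counts['All'] = counts.sum(axis=1)

# normalize='index' divides every row by its own total.
pcts = pd.crosstab(df['cat'], df['target'], normalize='index')
pcts.columns = ['pctFalse', 'pctTrue']

# Both frames are indexed by 'cat', so a plain index join lines them up.
summary = counts.join(pcts).reset_index()

print(summary)
```

Computing the 'All' column by hand instead of passing margins=True avoids the extra 'All' row, which would otherwise have no match in the percentage table.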

120

answered Oct 26 '22 23:10

MaxU - stop WAR against UA
