Given a dataframe that looks like this:
                 A   B
    2005-09-06   5  -2
    2005-09-07  -1   3
    2005-09-08   4   5
    2005-09-09  -8   2
    2005-09-10  -2  -5
    2005-09-11  -7   9
    2005-09-12   2   8
    2005-09-13   6  -5
    2005-09-14   6  -5
Is there a pythonic way to create a 2x2 matrix like this:
       1  0
    1  a  b
    0  c  d
Where:
a = number of obs where the corresponding elements of columns A and B are both positive.
b = number of obs where the element of column A is positive and the corresponding element of column B is negative.
c = number of obs where the element of column A is negative and the corresponding element of column B is positive.
d = number of obs where the corresponding elements of columns A and B are both negative.
For this example the output would be:
       1  0
    1  2  3
    0  3  1
Thanks
pandas.crosstab(index, columns), where index holds the values to group by in the rows of the contingency table and columns holds the values to group by in its columns.
Probably easiest to just use the pandas function crosstab. Borrowing from Dyno Fu above:
    import pandas as pd
    from io import StringIO

    # Sample data from the question, one whitespace-separated row per date
    table = """dt A B
    2005-09-06 5 -2
    2005-09-07 -1 3
    2005-09-08 4 5
    2005-09-09 -8 2
    2005-09-10 -2 -5
    2005-09-11 -7 9
    2005-09-12 2 8
    2005-09-13 6 -5
    2005-09-14 6 -5
    """

    sio = StringIO(table)
    df = pd.read_table(sio, sep=r"\s+", parse_dates=['dt'])
    df = df.set_index("dt")

    # Cross-tabulate the sign of A (rows) against the sign of B (columns)
    pd.crosstab(df.A > 0, df.B > 0)
Output:
    B      False  True
    A
    False      1      3
    True       3      2

    [2 rows x 2 columns]
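If you want the exact layout from the question (1/0 labels, with the positive case first), one option is to cast the boolean masks to int and reorder the result. This is only a minimal sketch: the frame is rebuilt inline from the question's data, the names a_pos and b_pos are just illustrative, and zeros (none in this sample) would be counted as not positive.

```python
import pandas as pd

# Rebuild the question's data inline so the example stands alone
df = pd.DataFrame(
    {
        "A": [5, -1, 4, -8, -2, -7, 2, 6, 6],
        "B": [-2, 3, 5, 2, -5, 9, 8, -5, -5],
    },
    index=pd.date_range("2005-09-06", periods=9),
)

# 1 where the value is positive, 0 otherwise
a_pos = (df["A"] > 0).astype(int)
b_pos = (df["B"] > 0).astype(int)

# crosstab sorts labels ascending (0, 1); reindex puts 1 first on both axes
# and fills a missing category with a zero count
tab = pd.crosstab(a_pos, b_pos).reindex(index=[1, 0], columns=[1, 0], fill_value=0)
print(tab)
```

The reindex call is used instead of .loc so that a sample with no positive (or no negative) values still gives a 2x2 table.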
Also the table is usable if you want to do a Fisher exact test with scipy.stats etc:
    from scipy.stats import fisher_exact

    tab = pd.crosstab(df.A > 0, df.B > 0)
    fisher_exact(tab)
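For what it's worth, fisher_exact gives back two values, the odds ratio and the p-value, so you can unpack them directly. A small sketch, with the counts typed in by hand from the crosstab output above (the variable names are mine, not part of either library):

```python
import numpy as np
from scipy.stats import fisher_exact

# Counts copied from the crosstab: rows are A <= 0 / A > 0,
# columns are B <= 0 / B > 0
counts = np.array([[1, 3],
                   [3, 2]])

# Two-sided test by default; the result unpacks into (odds ratio, p-value)
odds_ratio, p_value = fisher_exact(counts)
print(odds_ratio, p_value)
```

The odds ratio here is the plain sample ratio (1*2)/(3*3); with only nine observations the test has very little power, so treat the p-value accordingly.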