Given a dataframe that looks like this:
                 A   B
    2005-09-06   5  -2
    2005-09-07  -1   3
    2005-09-08   4   5
    2005-09-09  -8   2
    2005-09-10  -2  -5
    2005-09-11  -7   9
    2005-09-12   2   8
    2005-09-13   6  -5
    2005-09-14   6  -5

Is there a pythonic way to create a 2x2 matrix like this:
       1  0
    1  a  b
    0  c  d

Where:
a = number of obs where the corresponding elements of column A and B are both positive.
b = number of obs where the element of column A is positive and the corresponding element of column B is negative.
c = number of obs where the element of column A is negative and the corresponding element of column B is positive.
d = number of obs where the corresponding elements of column A and B are both negative.
For this example the output would be:
       1  0
    1  2  3
    0  3  1

Thanks
pandas.crosstab(index, columns) where:
index: the values (for example a Series or array) to group by in the rows of the contingency table.
columns: the values to group by in the columns of the contingency table.
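As a minimal sketch of that call, assuming a small made-up dataframe (the column names var1 and var2 and their values are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical toy data, not taken from the question
toy = pd.DataFrame({
    "var1": ["x", "x", "y", "y", "y"],
    "var2": ["p", "q", "p", "p", "q"],
})

# Rows come from the first argument, columns from the second;
# each cell counts how often that pair of values occurs together.
table = pd.crosstab(toy["var1"], toy["var2"])
print(table)
```

Each argument is a Series here, so the row and column labels of the result are taken from the Series names.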
Creating a basic contingency table. To create a contingency table of the data in the var1 column cross-classified with the data in the var2 column, choose the Stat > Tables > Contingency > With Data menu option. Select var1 as the Row variable, choose var2 as the Column variable, and click Compute!.
Probably easiest to just use the pandas function crosstab. Borrowing from Dyno Fu above:
    import pandas as pd
    from io import StringIO  # Python 3; on Python 2 this was "from StringIO import StringIO"

    table = """dt A B
    2005-09-06 5 -2
    2005-09-07 -1 3
    2005-09-08 4 5
    2005-09-09 -8 2
    2005-09-10 -2 -5
    2005-09-11 -7 9
    2005-09-12 2 8
    2005-09-13 6 -5
    2005-09-14 6 -5
    """

    # Parse the whitespace-separated text and use the dates as the index
    sio = StringIO(table)
    df = pd.read_table(sio, sep=r"\s+", parse_dates=['dt'])
    df.set_index("dt", inplace=True)

    # Cross-tabulate the sign of A against the sign of B
    pd.crosstab(df.A > 0, df.B > 0)

Output:
    B      False  True
    A
    False      1      3
    True       3      2

    [2 rows x 2 columns]

The table can also be used directly if you want to run a Fisher exact test with scipy.stats:
    from scipy.stats import fisher_exact

    tab = pd.crosstab(df.A > 0, df.B > 0)
    fisher_exact(tab)
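crosstab sorts its labels, so False comes before True, whereas the question asks for the 1 row and column first. One possible way to get that layout, sketched here under the assumption that a value of exactly zero should count as "not positive" (the question does not say), is to convert the booleans to 0/1 and reorder with reindex:

```python
import pandas as pd

# Same data as in the question, rebuilt inline so the sketch is self-contained
df = pd.DataFrame(
    {"A": [5, -1, 4, -8, -2, -7, 2, 6, 6],
     "B": [-2, 3, 5, 2, -5, 9, 8, -5, -5]},
    index=pd.date_range("2005-09-06", periods=9),
)

# 1 where the value is positive, 0 otherwise (assumption: zero counts as 0)
pos_a = (df["A"] > 0).astype(int)
pos_b = (df["B"] > 0).astype(int)

tab = pd.crosstab(pos_a, pos_b)

# Put the "1" row and column first; fill_value=0 keeps a cell at zero
# if one of the categories never occurs in the data.
tab = tab.reindex(index=[1, 0], columns=[1, 0], fill_value=0)
print(tab)
```

The reordered table holds the same counts as the crosstab output above, only with rows and columns flipped to match the a, b, c, d layout from the question.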