How to subtract one dataframe from another?

Tags: pandas

First, let me set the stage.

I start with a pandas dataframe klmn, that looks like this:

In [15]: klmn
Out[15]: 
    K  L         M   N
0   0  a -1.374201  35
1   0  b  1.415697  29
2   0  a  0.233841  18
3   0  b  1.550599  30
4   0  a -0.178370  63
5   0  b -1.235956  42
6   0  a  0.088046   2
7   0  b  0.074238  84
8   1  a  0.469924  44
9   1  b  1.231064  68
10  2  a -0.979462  73
11  2  b  0.322454  97

Next I split klmn into two dataframes, klmn0 and klmn1, according to the value in the 'K' column:

In [16]: k0 = klmn.groupby(klmn['K'] == 0)
In [17]: klmn0, klmn1 = [klmn.iloc[k0.indices[tf]] for tf in (True, False)]  # .ix was removed in pandas 1.0; groupby indices are positional
In [18]: klmn0, klmn1
Out[18]: 
(   K  L         M   N
0  0  a -1.374201  35
1  0  b  1.415697  29
2  0  a  0.233841  18
3  0  b  1.550599  30
4  0  a -0.178370  63
5  0  b -1.235956  42
6  0  a  0.088046   2
7  0  b  0.074238  84,
     K  L         M   N
8   1  a  0.469924  44
9   1  b  1.231064  68
10  2  a -0.979462  73
11  2  b  0.322454  97)
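
The same split can also be sketched with a boolean mask instead of a groupby. This is only an illustrative alternative, assuming a recent pandas; the name is_k0 is made up for the example, and the frame is rebuilt from the values printed above so that the snippet runs on its own.

```python
import pandas as pd

# Rebuild klmn from the values shown in the question.
klmn = pd.DataFrame({
    "K": [0] * 8 + [1, 1, 2, 2],
    "L": ["a", "b"] * 6,
    "M": [-1.374201, 1.415697, 0.233841, 1.550599, -0.178370, -1.235956,
          0.088046, 0.074238, 0.469924, 1.231064, -0.979462, 0.322454],
    "N": [35, 29, 18, 30, 63, 42, 2, 84, 44, 68, 73, 97],
})

# Boolean mask: True where K == 0 (is_k0 is a hypothetical helper name).
is_k0 = klmn["K"] == 0

klmn0 = klmn[is_k0]    # rows with K == 0
klmn1 = klmn[~is_k0]   # all remaining rows
```

Both pieces keep their original row labels (0 to 7 and 8 to 11), which matches the output shown above.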

Finally, I compute the mean of the M column in klmn0, grouped by the value in the L column:

In [19]: m0 = klmn0.groupby('L')['M'].mean(); m0
Out[19]: 
L
a   -0.307671
b    0.451144
Name: M

Now, my question is, how can I subtract m0 from the M column of the klmn1 sub-dataframe, respecting the value in the L column? (By this I mean that m0['a'] gets subtracted from the M column of each row in klmn1 that has 'a' in the L column, and likewise for m0['b'].)

One could imagine doing this in a way that replaces the values in the M column of klmn1 with the new values (after subtracting the value from m0). Alternatively, one could imagine doing this in a way that leaves klmn1 unchanged, and instead produces a new dataframe klmn11 with an updated M column. I'm interested in both approaches.

kjo asked Feb 18 '13 22:02


1 Answer

If you set the index of klmn1 to the L column, pandas will automatically align on that index when you subtract a Series such as m0 from the M column:

In [1]: klmn1.set_index('L')['M'] - m0
Out[1]:
L
a    0.777595
a   -0.671791
b    0.779920
b   -0.128690
Name: M
Zelazny7 answered Sep 19 '22 16:09
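
The question asks for both a new dataframe and an in-place update, and the index-aligned result above does not carry klmn1's original row labels. One possible way to cover both cases, offered as a sketch rather than as part of the answer above, is to look up each row's group mean with Series.map and subtract it. The name offsets is hypothetical, and klmn1 and m0 are rebuilt here from the values printed in the question so the snippet is self-contained.

```python
import pandas as pd

# klmn1 and m0 rebuilt from the values printed in the question.
klmn1 = pd.DataFrame(
    {
        "K": [1, 1, 2, 2],
        "L": ["a", "b", "a", "b"],
        "M": [0.469924, 1.231064, -0.979462, 0.322454],
        "N": [44, 68, 73, 97],
    },
    index=[8, 9, 10, 11],
)
m0 = pd.Series({"a": -0.307671, "b": 0.451144}, name="M")
m0.index.name = "L"

# For every row, fetch the mean that belongs to its L value.
# The result shares klmn1's index, so the subtraction lines up row by row.
offsets = klmn1["L"].map(m0)

# Approach 1: leave klmn1 untouched and build a new dataframe.
klmn11 = klmn1.assign(M=klmn1["M"] - offsets)

# Approach 2: overwrite the M column of klmn1 itself.
klmn1["M"] = klmn1["M"] - offsets
```

If klmn1 was obtained by slicing klmn, taking klmn1 = klmn1.copy() before the in-place assignment avoids pandas' SettingWithCopyWarning.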