pandas subtract on multiindex level matching

Tags:

I have a pandas dataframe with two multiindex levels

df
Out[202]: 
                     A         B         C         D
first second                                        
1     1      -0.080810  0.865259 -0.371148  0.346480
      2      -0.026636  1.259460 -1.109295 -0.871204
      3       0.372008 -1.778272  0.727838  0.620727
      4       0.918075  0.564741  2.027432 -1.614162
      5      -0.373527 -0.186027  0.225399  0.722733
2     1       0.344241  0.170596 -0.050763  2.692102
      2      -1.665413  0.357033 -0.691327 -0.983103
      3       1.277470 -1.841702  0.582107 -0.454315
      4       2.374108 -0.557879  0.797296  0.803622
      5      -1.001092  0.131217  0.131378 -0.636299

I also have a second dataframe consisting of two rows out of the original dataframe, corresponding to the multiindex level 'second == 1'

to_subtract = df.query('second == 1')  
to_subtract
Out[200]: 
                     A         B         C         D
first second                                        
1     1      -0.080810  0.865259 -0.371148  0.346480
2     1       0.344241  0.170596 -0.050763  2.692102

I want to subtract 'df' by 'to_subtract' on the first level, but for all values that correspond to the first level of the multiindex. Pandas knows how to subtract based on matching ALL levels of the multiindex, as shown below

df.sub(to_subtract)
Out[201]: 
                A    B    C    D
first second                    
1     1       0.0  0.0  0.0  0.0
      2       NaN  NaN  NaN  NaN
      3       NaN  NaN  NaN  NaN
      4       NaN  NaN  NaN  NaN
      5       NaN  NaN  NaN  NaN
2     1       0.0  0.0  0.0  0.0
      2       NaN  NaN  NaN  NaN
      3       NaN  NaN  NaN  NaN
      4       NaN  NaN  NaN  NaN
      5       NaN  NaN  NaN  NaN

Question is, how do I subtract all values from df[first===1] by to_subtract[first==1], and all values from df[first==2] by to_subtract[first==2]. I assume I can do a for loop and loop through everything, but I would rather avoid that if I can, since these dataframes may get large in the future.

Thanks in advance

830

asked Apr 02 '18 07:04

Matt P.

1 Answers

I think need remove second level of MultiIndex and then add parameter level=0 for align by first levels in DataFrame.sub:

to_subtract = df.query('second == 1').reset_index(level=1, drop=True)
#same as
#to_subtract = df.xs(1, level=1)
print (to_subtract)
              A         B         C         D
first                                        
1     -0.080810  0.865259 -0.371148  0.346480
2      0.344241  0.170596 -0.050763  2.692102

df1 = df.sub(to_subtract, level=0)
print (df1)
                     A         B         C         D
first second                                        
1     1       0.000000  0.000000  0.000000  0.000000
      2       0.054174  0.394201 -0.738147 -1.217684
      3       0.452818 -2.643531  1.098986  0.274247
      4       0.998885 -0.300518  2.398580 -1.960642
      5      -0.292717 -1.051286  0.596547  0.376253
2     1       0.000000  0.000000  0.000000  0.000000
      2      -2.009654  0.186437 -0.640564 -3.675205
      3       0.933229 -2.012298  0.632870 -3.146417
      4       2.029867 -0.728475  0.848059 -1.888480
      5      -1.345333 -0.039379  0.182141 -3.328401

181

answered Sep 30 '22 04:09

jezrael

Related questions
                            
                                How to set path to chromedriver in heroku chromedriver buildpack
                            
                                Obtaining summary from logistic regression(Python)
                            
                                How to run the Python program in the background in Ubuntu server
                            
                                How to execute aws glue scripts using python 2.7 from local machine?
                            
                                How can Celery workflows include dynamically generated groups?
                            
                                IntelliJ IDEA says "unresolved reference" on everything (Python, virtual environment)
                            
                                OAuth2 authorization for API in Python
                            
                                Deep multi-task learning with missing labels
                            
                                What is actually num_unit in LSTM cell circuit?
                            
                                flask run vs. python
                            
                                Creating a tree without duplicates
                            
                                How to close requests.Session()?
                            
                                How to get the default session from a tf.estimator?
                            
                                Reconstructing new data using sklearn NMF components Vs inverse_transform does not match
                            
                                *Vectorized* way to find indices of minimums for each column (excluding all already found indices)
                            
                                AssertionError: Gaps in blk ref_locs when unstack() dataframe
                            
                                Python heatmap: Change colour map and make asymmetrical
                            
                                SQLAlchemy encrypt a column without automatically decrypting upon retrieval
                            
                                How to publish custom (non-tensorflow) models using tensorflow-serving?
                            
                                all() vs for loop with break performance

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

pandas subtract on multiindex level matching

Tags:

python

pandas

multi-index

subtraction

Matt P.

People also ask

1 Answers

jezrael

Recent Activity

Donate For Us