Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Subtract one Pandas Series from another without creating NaN values for missing data

When you subtract one series from another, the result contains NaNs if there is no record in the second series to subtract.

In [1]: import pandas as pd

In [2]: a = pd.Series({1: 100, 2: 102, 3: 103, 4: 104})

In [3]: a
Out[3]:
1    100
2    102
3    103
4    104
dtype: int64

In [4]: b = pd.Series({1: 5, 3: 7})

In [5]: a - b
Out[5]:
1    95.0
2     NaN
3    96.0
4     NaN
dtype: float64

Is there an easier way to do the subtraction with the assumption that missing data is equivalent to zero? Other than by manually inserting zeros like this:

In [15]: b_dash = pd.Series(b, index=a.index).fillna(0.0)

In [16]: a - b_dash
Out[16]:
1     95.0
2    102.0
3     96.0
4    104.0
dtype: float64
like image 574
Bill Avatar asked Dec 23 '22 10:12

Bill


2 Answers

Using sub

a.sub(b,fill_value=0)
Out[213]: 
1     95.0
2    102.0
3     96.0
4    104.0
dtype: float64
like image 176
BENY Avatar answered Dec 26 '22 00:12

BENY


Alternatively, you can perform the subtraction and then fillna with your base Series (in this case a):

(a-b).fillna(a)

# 1     95.0
# 2    102.0
# 3     96.0
# 4    104.0

Or, you can use pd.Series.update (which will modify a in-place where there's a match with the indexes from a-b):

a.update(a-b)

# 1     95
# 2    102
# 3     96
# 4    104
like image 23
cmaher Avatar answered Dec 26 '22 00:12

cmaher