Pandas DataFrame.add() -- ignore missing columns

Question

I have the following two DataFrames:

>>> history
              above below
asn   country
12345 US          5     4
      MX          6     3
54321 MX          4     5
>>> current
              above below
asn   country
12345 MX          1     0
54321 MX          0     1
      US          1     0

I keep a running count of the "above" and "below" values in the history DataFrame like so:

>>> history = history.add(current, fill_value=0)
>>> history
               above  below
asn   country              
12345 MX         7.0    3.0
      US         5.0    4.0
54321 MX         4.0    6.0
      US         1.0    0.0

This works as long as there are no extra columns in the current DataFrame. However, when I add an extra column:

>>> current
              above below cruft
asn   country
12345 MX          1     0   999
54321 MX          0     1   999
      US          1     0   999

I get the following:

>>> history = history.add(current, fill_value=0)
>>> history
               above  below cruft
asn   country              
12345 MX         7.0    3.0 999.0
      US         5.0    4.0   NaN
54321 MX         4.0    6.0 999.0
      US         1.0    0.0 999.0

I want this extra column to be ignored, since it's not present in both DataFrames. The desired output is just:

>>> history
               above  below
asn   country              
12345 MX         7.0    3.0
      US         5.0    4.0
54321 MX         4.0    6.0
      US         1.0    0.0

MaxU - stop WAR against UA · Accepted Answer

In [27]: history.add(current, fill_value=0)[history.columns]
Out[27]:
               above  below
asn   country
12345 MX         7.0    3.0
      US         5.0    4.0
54321 MX         4.0    6.0
      US         1.0    0.0

BENY · Answer

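Another way: concatenate the two frames keeping only their shared columns, then sum the rows that share the same index. (`sum(level=...)` was removed in pandas 2.0, so this uses `groupby(level=...)` instead.)

pd.concat([history, current], join='inner', axis=0).groupby(level=[0, 1]).sum()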
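A runnable sketch of the concat idea, assuming the frames from the question (rebuilt by hand here; `combined` and `totals` are illustrative names). It groups on the index levels with `groupby(level=...).sum()`, which is a substitution on my part: `DataFrame.sum(level=...)` no longer exists in pandas 2.0 and later. `join="inner"` is what drops the extra column, since only columns present in both frames survive the concat.

```python
import pandas as pd

# Frames rebuilt from the question's printed output.
history = pd.DataFrame(
    {"above": [5, 6, 4], "below": [4, 3, 5]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "US"), (12345, "MX"), (54321, "MX")],
        names=["asn", "country"],
    ),
)
current = pd.DataFrame(
    {"above": [1, 0, 1], "below": [0, 1, 0], "cruft": [999, 999, 999]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "MX"), (54321, "MX"), (54321, "US")],
        names=["asn", "country"],
    ),
)

# Stack the rows, keeping only columns common to both frames.
combined = pd.concat([history, current], join="inner", axis=0)

# Sum rows that share the same (asn, country) pair.
totals = combined.groupby(level=["asn", "country"]).sum()
print(totals)
```

One difference from the `add` approach: the values are never filled with `fill_value`, so integer columns stay integers instead of becoming floats.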

TYZ · Answer

You can first specify a list of columns you need in your final output:

cols_to_return = ["above", "below"]
history = history[cols_to_return].add(current[cols_to_return], fill_value=0)

Specifying the columns beforehand helps you keep track of what you are doing and makes future issues easier to debug.
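If you would rather not hard-code the list, one possible variation (not part of the answer above) is to compute the shared columns with `Index.intersection`. The sketch below assumes the question's frames, rebuilt by hand; `shared` and `updated` are illustrative names.

```python
import pandas as pd

# Frames rebuilt from the question's printed output.
history = pd.DataFrame(
    {"above": [5, 6, 4], "below": [4, 3, 5]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "US"), (12345, "MX"), (54321, "MX")],
        names=["asn", "country"],
    ),
)
current = pd.DataFrame(
    {"above": [1, 0, 1], "below": [0, 1, 0], "cruft": [999, 999, 999]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "MX"), (54321, "MX"), (54321, "US")],
        names=["asn", "country"],
    ),
)

# Columns present in both frames; "cruft" is excluded.
shared = history.columns.intersection(current.columns)

# Restrict both sides before adding, so no extra column is ever created.
updated = history[shared].add(current[shared], fill_value=0)
print(updated)
```

Note that this also drops any column that exists only in `history`, which may or may not be what you want.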

Tags: python, pandas, dataframe

stevendesu
