
How to deal with "divide by zero" with pandas dataframes when manipulating columns? [duplicate]

I'm working with hundreds of pandas dataframes. A typical dataframe is as follows:

import pandas as pd
import numpy as np
data = 'filename.csv'
df = pd.read_csv(data)  # pd.DataFrame() does not accept a file path
df

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569 -0.494929  1.071804    bar  False
....

In some operations I divide the values of one column by those of another, e.g.

df['one']/df['two'] 

However, sometimes the denominator is zero, or both values are zero:

df['one'] = 0
df['two'] = 0

Naturally, this outputs the error:

ZeroDivisionError: division by zero

I would prefer for 0/0 to actually mean "there's nothing here", as this is often what such a zero means in a dataframe.

(a) How would I code this so that "divide by zero" yields 0?

(b) How would I code this to "pass" if a divide by zero is encountered?

ShanZhengYang asked Aug 11 '16 02:08



1 Answer

It would probably be more useful to use a dataframe that actually has zero in the denominator (see the last row of column two).

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569  0.000000  1.071804    bar  False

>>> df.one / df.two
a   -1.658442
b    0.761639
c   -0.936904
d    0.099237
e   -0.114159
f        -inf  # <<< Note division by zero
dtype: float64
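
To see why the output above contains -inf rather than an exception, here is a minimal sketch using two made-up Series (s_num and s_den are hypothetical names, not from the question). It assumes float columns, where division follows floating-point rules instead of raising; plain Python scalars behave differently.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: three zero denominators and one normal case.
s_num = pd.Series([1.0, -1.0, 0.0, 2.0])
s_den = pd.Series([0.0, 0.0, 0.0, 4.0])

# Element-wise float division does not raise:
#   positive / 0 -> inf, negative / 0 -> -inf, 0 / 0 -> NaN
ratio = s_num / s_den
print(ratio.tolist())

# Plain Python scalar division, by contrast, raises ZeroDivisionError.
try:
    1 / 0
except ZeroDivisionError as exc:
    print("scalar division raised:", exc)
```

Note that 0/0 yields NaN rather than inf, so a check based only on np.isinf would miss that case.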

When the denominator is zero, pandas does not raise an error: you get inf or -inf in the result (or NaN when the numerator is also zero). One way to convert these values is as follows:

df['result'] = df.one.div(df.two)

df.loc[~np.isfinite(df['result']), 'result'] = np.nan  # Or = 0 per part a) of question.
# or df.loc[np.isinf(df['result']), ...

>>> df
        one       two     three   four   five    result
a  0.469112 -0.282863 -1.509059    bar   True -1.658442
b  0.932424  1.224234  7.823421    bar  False  0.761639
c -1.135632  1.212112 -0.173215    bar  False -0.936904
d  0.232424  2.342112  0.982342  unbar   True  0.099237
e  0.119209 -1.044236 -0.861849    bar   True -0.114159
f -2.104569  0.000000  1.071804    bar  False       NaN
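
If the same conversion is needed across many dataframes, it may be convenient to wrap it in a small helper. The following is only a sketch: safe_divide and its fill_value parameter are hypothetical names introduced here, and row g (a 0/0 case) is made-up data added for illustration.

```python
import numpy as np
import pandas as pd

def safe_divide(numerator, denominator, fill_value=np.nan):
    """Divide two Series element-wise; replace non-finite results with fill_value."""
    result = numerator.div(denominator)
    # inf and -inf (x/0) as well as NaN (0/0) all count as non-finite.
    return result.where(np.isfinite(result), fill_value)

# Rows a and f come from the example above; row g is a hypothetical 0/0 case.
df = pd.DataFrame(
    {"one": [0.469112, -2.104569, 0.0], "two": [-0.282863, 0.0, 0.0]},
    index=["a", "f", "g"],
)

df["as_nan"] = safe_divide(df["one"], df["two"])        # part (b): leave a gap
df["as_zero"] = safe_divide(df["one"], df["two"], 0.0)  # part (a): treat as 0
print(df)
```

One caveat with this design: a NaN that was already present in either input column is also non-finite after division, so it gets replaced by fill_value too.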
Alexander answered Oct 16 '22 01:10