Dropping infinite values from dataframes in pandas?

The simplest way is to first replace() the infs with NaN:

df.replace([np.inf, -np.inf], np.nan, inplace=True)

and then use dropna(). When chaining the two calls, leave out inplace=True, since an in-place replace() returns None:

df.replace([np.inf, -np.inf], np.nan) \
    .dropna(subset=["col1", "col2"], how="all")
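As a minimal, self-contained sketch of the replace-then-dropna approach (the column names col1, col2, col3 and all values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: col1/col2 are the columns checked; col3 is not in the subset.
df = pd.DataFrame({
    "col1": [1.0, np.inf, np.inf, 4.0],
    "col2": [1.0, 2.0, -np.inf, np.nan],
    "col3": [np.inf, 0.0, 0.0, 0.0],
})

# Without inplace=True, replace() returns a new frame, so dropna() can be chained.
cleaned = df.replace([np.inf, -np.inf], np.nan).dropna(
    subset=["col1", "col2"], how="all"
)

# how="all" drops a row only when every column in the subset is missing.
print(cleaned.index.tolist())
```

Note that replace() here acts on the whole frame, so the inf in col3 also becomes NaN in the result even though col3 is not part of the subset. The original df is left unchanged.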

For example:

In [11]: df = pd.DataFrame([1, 2, np.inf, -np.inf])

In [12]: df.replace([np.inf, -np.inf], np.nan)
Out[12]:
     0
0  1.0
1  2.0
2  NaN
3  NaN

The same method would work for a Series.
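A short sketch of the Series case, assuming a small made-up Series s:

```python
import numpy as np
import pandas as pd

# Hypothetical Series containing both +inf and -inf.
s = pd.Series([1.0, np.inf, 3.0, -np.inf])

# Same two steps as for a DataFrame: turn infs into NaN, then drop the NaNs.
cleaned = s.replace([np.inf, -np.inf], np.nan).dropna()

print(cleaned.tolist())
```

The surviving entries keep their original index labels, so call reset_index(drop=True) afterwards if a contiguous index is needed.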

With pd.option_context, dropna() can treat inf as missing without permanently setting use_inf_as_na. For example:

with pd.option_context('mode.use_inf_as_na', True):
    df = df.dropna(subset=['col1', 'col2'], how='all')

Of course, pandas can also be set to treat inf as NaN permanently with

pd.set_option('use_inf_as_na', True)

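For older versions, replace use_inf_as_na with use_inf_as_null. Note that the use_inf_as_na option is deprecated as of pandas 2.1, so on recent versions the replace() approach is the safer choice.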

Use np.isfinite (fast and simple). Note that this also drops rows containing NaN, since NaN is not finite:

df = df[np.isfinite(df).all(axis=1)]

This answer is based on DougR's answer to another question. Here is some example code:

import pandas as pd
import numpy as np
df=pd.DataFrame([1,2,3,np.nan,4,np.inf,5,-np.inf,6])
print('Input:\n',df,sep='')
df = df[np.isfinite(df).all(axis=1)]  # keep only rows where every value is finite
print('\nDropped:\n',df,sep='')

Result:

Input:
    0
0  1.0000
1  2.0000
2  3.0000
3     NaN
4  4.0000
5     inf
6  5.0000
7    -inf
8  6.0000

Dropped:
     0
0  1.0
1  2.0
2  3.0
4  4.0
6  5.0
8  6.0
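If the NaN rows should be kept and only the infinite ones dropped, one possible variation (not part of the answer above, just a sketch on the same data) is to test with np.isinf instead of np.isfinite:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([1, 2, 3, np.nan, 4, np.inf, 5, -np.inf, 6])

# np.isinf is True only for +inf and -inf, so NaN entries are not flagged.
has_inf = np.isinf(df).any(axis=1)
kept = df[~has_inf]

print(kept.index.tolist())
```

Here rows 5 and 7 are removed, while row 3 (the NaN) stays in the result.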

Here is another method using .loc to replace inf with nan on a Series:

s.loc[(~np.isfinite(s)) & s.notnull()] = np.nan
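A runnable sketch of that one-liner, assuming a small made-up Series s:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with finite values, infs, and an existing NaN.
s = pd.Series([1.0, np.inf, np.nan, -np.inf, 5.0])

# "Not finite AND not null" selects exactly the +inf / -inf entries;
# the existing NaN is excluded from the mask because notnull() is False for it.
mask = (~np.isfinite(s)) & s.notnull()
s.loc[mask] = np.nan

print(s.isna().tolist())
```

After the assignment, s.dropna() leaves only the finite values.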

So, in response to the original question:

df = pd.DataFrame(np.ones((3, 3)), columns=list('ABC'))

for i in range(3): 
    df.iat[i, i] = np.inf

df
          A         B         C
0       inf  1.000000  1.000000
1  1.000000       inf  1.000000
2  1.000000  1.000000       inf

df.sum()
A    inf
B    inf
C    inf
dtype: float64

df.apply(lambda s: s[np.isfinite(s)].dropna()).sum()
A    2
B    2
C    2
dtype: float64
