Python pandas: how to remove nan and -inf values

People also ask

How do I remove NaN values from a DataFrame in Python?

The dropna() method drops rows that contain NaN (Not a Number) or None values from a pandas DataFrame. By default it returns a copy of the DataFrame with those rows removed. To modify the existing DataFrame instead, pass inplace=True.
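A minimal sketch of the difference between the default (copy) behaviour and inplace=True. The sample data and the name `cleaned` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; both np.nan and None count as missing.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", None]})

# Default: a new DataFrame is returned and df itself is left untouched.
cleaned = df.dropna()
print(len(df), len(cleaned))

# inplace=True: df is modified directly and the call returns None.
df.dropna(inplace=True)
print(len(df))
```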

How do I remove NaN values from a series?

The Series.dropna() method removes missing values from a Series. It does not modify the original Series; it returns a new Series with the NaN values removed.
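As a small illustration (the values are invented), the original Series keeps its NaN while the returned one does not:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
s = pd.Series([1.0, np.nan, 3.0])

# dropna() returns a new Series; the original index labels are kept.
result = s.dropna()

print(s.isna().sum())       # the original still holds its NaN
print(result.index.tolist())
```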

How do I remove missing values from a data set in Python?

DataFrame.dropna() removes missing values. Its axis parameter determines whether rows or columns that contain missing values are removed: 0 or 'index' drops rows that contain missing values; 1 or 'columns' drops columns that contain missing values.
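A short sketch of the two axis settings on an invented DataFrame, where only column "a" has a missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "a" has one NaN, column "b" is complete.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# axis=0 (or "index"): drop every row that has a missing value.
rows_dropped = df.dropna(axis=0)

# axis=1 (or "columns"): drop every column that has a missing value.
cols_dropped = df.dropna(axis="columns")

print(rows_dropped.shape, cols_dropped.shape)
```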

Use pd.DataFrame.isin to flag the unwanted values, then pd.DataFrame.any with axis=1 to find the rows that contain at least one of them. Finally, use the negated boolean mask to slice the dataframe.

df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)]

             time    X    Y  X_t0     X_tp0   X_t1     X_tp1   X_t2     X_tp2
4        0.037389    3   10     3  0.333333    2.0  0.500000    1.0  1.000000
5        0.037393    4   10     4  0.250000    3.0  0.333333    2.0  0.500000
1030308  9.962213  256  268   256  0.000000  256.0  0.003906  255.0  0.003922
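The same approach as a self-contained sketch. The column names and values here are invented and are not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only the first row is free of NaN and +/-inf.
df = pd.DataFrame({
    "x": [1.0, np.nan, np.inf, 4.0],
    "y": [1.0, 2.0, 3.0, -np.inf],
})

# True for each row that contains at least one unwanted value.
bad_rows = df.isin([np.nan, np.inf, -np.inf]).any(axis=1)

# Keep only the rows that were not flagged.
clean = df[~bad_rows]
print(clean)
```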

You can replace inf and -inf with NaN, and then select non-null rows.

df[df.replace([np.inf, -np.inf], np.nan).notnull().all(axis=1)]  # .astype(np.float64) ?

df.replace([np.inf, -np.inf], np.nan).dropna(axis=1)  # axis=1 drops columns; use axis=0 to drop rows

Check the dtypes of your columns via df.info() to make sure they are all as expected (e.g. np.float32/64).
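A sketch of such a check, assuming a hypothetical column that was read in as text. Converting with pd.to_numeric and errors="coerce" is one possible fix rather than something the answer prescribes; unparseable entries become NaN.

```python
import pandas as pd

# Hypothetical column that holds numbers stored as strings.
df = pd.DataFrame({"x": ["1.5", "2.5", "oops"]})

# Inspect the column types before relying on numeric checks.
print(df.dtypes)

# Convert to a float column; values that cannot be parsed turn into NaN.
df["x"] = pd.to_numeric(df["x"], errors="coerce")
print(df.dtypes)
```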

df = df.replace([np.inf, -np.inf], np.nan)  # replace() returns a new DataFrame, so assign the result

df.dropna(inplace=True)

Instead of dropping rows that contain nulls or infinite numbers, it is more succinct to reverse the logic and keep only the rows where all cells are finite. NumPy's isfinite function tests this, and .all(axis=1) returns True only if every cell in the row is finite. This assumes all columns are numeric.

df = df[np.isfinite(df).all(axis=1)]
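np.isfinite raises a TypeError on string columns, so for a mixed DataFrame one option is to build the mask from the numeric columns only. A sketch with invented data, where select_dtypes is the assumed way of picking those columns:

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type data: "name" is text, "x" and "y" are floats.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "x": [1.0, np.inf, 3.0],
    "y": [np.nan, 2.0, 3.0],
})

# Restrict the finiteness test to the numeric columns.
numeric = df.select_dtypes(include="number")

# True only for rows whose numeric cells are all finite (no NaN, no inf).
finite_rows = np.isfinite(numeric).all(axis=1)

clean = df[finite_rows]
print(clean)
```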

I prefer to set the pandas option so that inf values are treated as NaN (note that this option is deprecated as of pandas 2.1):

s1 = pd.Series([0, 1, 2])
s2 = pd.Series([2, 1, 0])
s1/s2
# Outputs:
# 0.0
# 1.0
# inf
# dtype: float64

pd.set_option('mode.use_inf_as_na', True)
s1/s2
# Outputs:
# 0.0
# 1.0
# NaN
# dtype: float64

Note that you can also apply the option temporarily with a context manager:

with pd.option_context('mode.use_inf_as_na', True):
    print(s1/s2)
# Outputs:
# 0.0
# 1.0
# NaN
# dtype: float64
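Without relying on the option at all, the same result can be obtained by replacing the infinities explicitly. A minimal sketch reusing the two Series from above; the name `ratio` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([0, 1, 2])
s2 = pd.Series([2, 1, 0])

# 2 / 0 yields inf; convert +/-inf to NaN so missing-value tools see it.
ratio = (s1 / s2).replace([np.inf, -np.inf], np.nan)

print(ratio)
print(ratio.dropna())
```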


Tags:

python

python-3.x

pandas

dataframe

numpy
