By using dropna() method you can drop rows with NaN (Not a Number) and None values from pandas DataFrame. Note that by default it returns the copy of the DataFrame after removing rows. If you wanted to remove from the existing DataFrame, you should use inplace=True .
In the pandas series constructor, the method called dropna() is used to remove missing values from a series object. And it does not update the original series object with removed NaN values instead of updating the original series object, it will return another series object with updated values.
DataFrame-dropna() function The dropna() function is used to remove missing values. Determine if rows or columns which contain missing values are removed. 0, or 'index' : Drop rows which contain missing values. 1, or 'columns' : Drop columns which contain missing value.
Use pd.DataFrame.isin
and check for rows that have any with pd.DataFrame.any
. Finally, use the boolean array to slice the dataframe.
df[~df.isin([np.nan, np.inf, -np.inf]).any(1)]
time X Y X_t0 X_tp0 X_t1 X_tp1 X_t2 X_tp2
4 0.037389 3 10 3 0.333333 2.0 0.500000 1.0 1.000000
5 0.037393 4 10 4 0.250000 3.0 0.333333 2.0 0.500000
1030308 9.962213 256 268 256 0.000000 256.0 0.003906 255.0 0.003922
You can replace inf
and -inf
with NaN
, and then select non-null rows.
df[df.replace([np.inf, -np.inf], np.nan).notnull().all(axis=1)] # .astype(np.float64) ?
or
df.replace([np.inf, -np.inf], np.nan).dropna(axis=1)
Check the type of your columns returns to make sure they are all as expected (e.g. np.float32/64) via df.info()
.
df.replace([np.inf, -np.inf], np.nan)
df.dropna(inplace=True)
Instead of dropping rows which contain any nulls and infinite numbers, it is more succinct to the reverse the logic of that and instead return the rows where all cells are finite numbers. The numpy isfinite function does this and the '.all(1)' will only return a TRUE if all cells in row are finite.
df = df[np.isfinite(df).all(1)]
I prefer to set the options so that inf values are calculated to nan;
s1 = pd.Series([0, 1, 2])
s2 = pd.Series([2, 1, 0])
s1/s2
# Outputs:
# 0.0
# 1.0
# inf
# dtype: float64
pd.set_option('mode.use_inf_as_na', True)
s1/s2
# Outputs:
# 0.0
# 1.0
# NaN
# dtype: float64
Note you can also use context;
with pd.option_context('mode.use_inf_as_na', True):
print(s1/s2)
# Outputs:
# 0.0
# 1.0
# NaN
# dtype: float64
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With