I have found examples of how to remove a column when all of its values are NaN, or when the number of NaNs passes a threshold, but I have not been able to find a solution to my particular problem: dropping a column if its last row is NaN. The reason is that I'm using time series data in which the collection of data doesn't all start at the same time. That is fine, but if I used one of the previous solutions it would remove 95% of the dataset. I do not, however, want columns whose most recent row is NaN, as that means the series is defunct.
A B C
nan t x
1 2 3
x y z
4 nan 6
Returns
A C
nan x
1 3
x z
4 6
Alternatively, you can also use axis=1 as a param to remove columns with NaN, for example df.dropna(axis=1). Use dropna(axis=0) to drop rows with NaN values from a pandas DataFrame.
By using the dropna() method you can drop rows with NaN (Not a Number) and None values from a pandas DataFrame. Note that by default it returns a copy of the DataFrame with the rows removed. If you want to modify the existing DataFrame instead, use inplace=True.
Pandas DataFrame dropna() function parameters. how: possible values are {'any', 'all'}, default 'any'. If 'any', drop the row/column if any of its values is null. If 'all', drop the row/column only if all of its values are missing. thresh: an int giving the minimum number of non-null values a row/column must have to be kept.
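To make the difference between how and thresh concrete, here is a small sketch applied to columns (axis=1). The DataFrame and its column names are made up for illustration and are not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: A has one NaN, B is entirely NaN, C is complete.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, np.nan, np.nan],
    "C": [1.0, 2.0, 3.0],
})

# how="any": drop a column if it contains at least one NaN.
any_dropped = df.dropna(axis=1, how="any")

# how="all": drop a column only if every value in it is NaN.
all_dropped = df.dropna(axis=1, how="all")

# thresh=2: keep only columns with at least 2 non-NaN values.
thresh_kept = df.dropna(axis=1, thresh=2)

print(list(any_dropped.columns))
print(list(all_dropped.columns))
print(list(thresh_kept.columns))
```

Note that how and thresh are passed in separate calls here; recent pandas versions do not accept both in the same call.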
For mean, use the mean() function. Calculate the mean for the column with NaN and use the fillna() to fill the NaN values with the mean.
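A minimal sketch of that mean-fill step, assuming a single numeric column; the column name price is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric column with two missing values.
df = pd.DataFrame({"price": [10.0, np.nan, 20.0, np.nan]})

# Series.mean() skips NaN by default, so this averages 10.0 and 20.0.
col_mean = df["price"].mean()

# Replace each NaN in the column with that mean.
df["price"] = df["price"].fillna(col_mean)

print(df["price"].tolist())
```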
Setting how='all' drops the row or column only if all of its values are NaN; how='any' drops a row if any column has NaN. After dropping rows, the index will no longer be sequential. reset_index(drop=True) discards the old index and renumbers the rows from 0.
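The effect on the index can be seen in a short sketch like the following, which uses a made-up two-column DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only row 1 is NaN in every column.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0, np.nan],
    "B": [1.0, np.nan, np.nan, 4.0],
})

# Drop rows where all values are NaN; the surviving rows keep their old labels.
cleaned = df.dropna(how="all")

# Renumber the rows from 0 and discard the old labels.
renumbered = cleaned.reset_index(drop=True)

print(list(cleaned.index))
print(list(renumbered.index))
```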
1. Drop all rows that have any NaN (missing) values.
2. Drop only if the entire row has NaN (missing) values.
3. Drop only if a row has more than 2 NaN (missing) values.
4. Drop NaN (missing) in a specific column.
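One possible way to express these four cases with dropna() is sketched below. The data is invented for illustration, and case 3 assumes that "more than 2 NaN" translates to a thresh of (number of columns - 2) non-NaN values.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the rows contain 0, 4, 3 and 1 NaN values respectively.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [1.0, np.nan, np.nan, np.nan],
    "C": [1.0, np.nan, np.nan, 4.0],
    "D": [1.0, np.nan, 3.0, 4.0],
})

# 1. Drop rows that have any NaN.
case1 = df.dropna()

# 2. Drop rows only if every value is NaN.
case2 = df.dropna(how="all")

# 3. Drop rows with more than 2 NaN, i.e. keep rows that have
#    at least (number of columns - 2) non-NaN values.
case3 = df.dropna(thresh=len(df.columns) - 2)

# 4. Drop rows that have NaN in one specific column (here "A").
case4 = df.dropna(subset=["A"])

for name, result in [("1", case1), ("2", case2), ("3", case3), ("4", case4)]:
    print(name, list(result.index))
```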
Here are 2 ways to drop columns with NaN values in Pandas DataFrame: (1) Drop any column that contains at least one NaN: df = df.dropna(axis='columns')
No, you have to set how='all' since OP asked to remove a row if both columns are NaN. Your solution will also remove rows where only one of the two columns contains NaNs.
You can also do something like this, which keeps only the columns whose last row is not NaN:
df.loc[:, ~df.iloc[-1].isna()]
A C
0 NaN x
1 1 3
2 x z
3 4 6
Try with dropna
df = df.dropna(axis=1, subset=[df.index[-1]], how='any')
Out[8]:
A C
0 NaN x
1 1 3
2 x z
3 4 6
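For anyone who wants to try both answers end to end, here is a self-contained sketch that rebuilds the example frame from the question and applies each approach. The variable names by_mask and by_dropna are just for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example from the question; column B is NaN in the last row.
df = pd.DataFrame({
    "A": [np.nan, 1, "x", 4],
    "B": ["t", 2, "y", np.nan],
    "C": ["x", 3, "z", 6],
})

# Approach 1: boolean mask over the columns, built from the last row.
by_mask = df.loc[:, df.iloc[-1].notna()]

# Approach 2: dropna along the columns, checking only the last row label.
by_dropna = df.dropna(axis=1, subset=[df.index[-1]])

print(list(by_mask.columns))
print(list(by_dropna.columns))
```

Both approaches only look at the last row, so columns that start late (NaN at the top, like A) are kept. For time series data this assumes the frame is already sorted so that the last row is the most recent observation.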