I realize that dropping NaNs from a dataframe is as easy as df.dropna, but for some reason that isn't working on mine and I'm not sure why.
Here is my original dataframe:
fish_frame1:
                      0    1    2         3    4       5    6          7
0               #0915-8  NaN  NaN       NaN  NaN     NaN  NaN        NaN
1                   NaN  NaN  NaN  LIVE WGT  NaN  AMOUNT  NaN      TOTAL
2               GBW COD  NaN  NaN     2,280  NaN   $0.60  NaN  $1,368.00
3               POLLOCK  NaN  NaN     1,611  NaN   $0.01  NaN     $16.11
4                 WHAKE  NaN  NaN       441  NaN   $0.70  NaN    $308.70
5           GBE HADDOCK  NaN  NaN     2,788  NaN   $0.01  NaN     $27.88
6           GBW HADDOCK  NaN  NaN    16,667  NaN   $0.01  NaN    $166.67
7               REDFISH  NaN  NaN       932  NaN   $0.01  NaN      $9.32
8    GB WINTER FLOUNDER  NaN  NaN       145  NaN   $0.25  NaN     $36.25
9   GOM WINTER FLOUNDER  NaN  NaN    25,070  NaN   $0.35  NaN  $8,774.50
10        GB YELLOWTAIL  NaN  NaN        26  NaN   $1.75  NaN     $45.50
The code that follows is an attempt to drop all NaNs as well as any columns with more than 3 NaNs (either one, or both, should work I think):
fish_frame.dropna()
fish_frame.dropna(thresh=len(fish_frame) - 3, axis=1)
This produces:
fish_frame1 after dropna:
                      0    1    2         3    4       5    6          7
0               #0915-8  NaN  NaN       NaN  NaN     NaN  NaN        NaN
1                   NaN  NaN  NaN  LIVE WGT  NaN  AMOUNT  NaN      TOTAL
2               GBW COD  NaN  NaN     2,280  NaN   $0.60  NaN  $1,368.00
3               POLLOCK  NaN  NaN     1,611  NaN   $0.01  NaN     $16.11
4                 WHAKE  NaN  NaN       441  NaN   $0.70  NaN    $308.70
5           GBE HADDOCK  NaN  NaN     2,788  NaN   $0.01  NaN     $27.88
6           GBW HADDOCK  NaN  NaN    16,667  NaN   $0.01  NaN    $166.67
7               REDFISH  NaN  NaN       932  NaN   $0.01  NaN      $9.32
8    GB WINTER FLOUNDER  NaN  NaN       145  NaN   $0.25  NaN     $36.25
9   GOM WINTER FLOUNDER  NaN  NaN    25,070  NaN   $0.35  NaN  $8,774.50
10        GB YELLOWTAIL  NaN  NaN        26  NaN   $1.75  NaN     $45.50
I'm a novice with Pandas so I'm not sure if this isn't working because I'm doing something wrong or I'm misunderstanding something or misusing a function. Any help is appreciated thanks.
Use dropna(axis=0) to drop rows with NaN values from a pandas DataFrame. axis=0 is the default, so a bare dropna() does the same thing.
To drop columns that contain NA values instead, pass axis='columns' (or axis=1) to DataFrame.dropna(). With the default how='any', it removes every column in which one or more values are missing.
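As a rough illustration of the column case, here is a minimal sketch on a made-up three-column frame (the names df, a, b and c are placeholders, not from the question). It contrasts the default how='any' with how='all':

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "a" is complete, "b" has one NaN, "c" is entirely NaN.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [4.0, np.nan, 6.0],
    "c": [np.nan, np.nan, np.nan],
})

# Default how="any": drop every column that contains at least one NaN.
any_dropped = df.dropna(axis="columns")
print(list(any_dropped.columns))   # ['a']

# how="all": drop only the columns that are NaN in every row.
all_dropped = df.dropna(axis="columns", how="all")
print(list(all_dropped.columns))   # ['a', 'b']
```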
The dropna() method drops rows with NaN (Not a Number) and None values from a pandas DataFrame. Note that by default it returns a copy of the DataFrame with those rows removed and leaves the original unchanged. To change the existing DataFrame, either assign the result back (df = df.dropna()) or pass inplace=True.
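This is probably what bites most people: calling dropna() without keeping the result changes nothing. A small sketch, using a made-up two-column frame (df, x and y are placeholder names), shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NaN and one None.
df = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": ["a", "b", None]})

# The return value is discarded here, so df is left untouched.
df.dropna()
print(len(df))        # 3

# Option 1: bind the returned copy to a name.
cleaned = df.dropna()
print(len(cleaned))   # 1

# Option 2: modify df itself (the call then returns None).
df.dropna(inplace=True)
print(len(df))        # 1
```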
From the dropna docstring:
df.dropna(axis=1, how='all')
     A    B  D
0  NaN  2.0  0
1  3.0  4.0  1
2  NaN  NaN  5
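For the thresh call in the question, a sketch on a cut-down stand-in for the frame may help. The variable names raw, all_rows_gone and kept are made up for illustration, and the sketch assumes the missing cells are real NaN values rather than the string 'NaN'. thresh is the minimum number of non-NaN values a column needs in order to be kept, so len(raw) - 3 keeps columns with at most 3 NaNs:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few rows and columns of the question's frame.
raw = pd.DataFrame([
    ["#0915-8", np.nan, np.nan, np.nan],
    [np.nan, np.nan, "LIVE WGT", "TOTAL"],
    ["GBW COD", np.nan, "2,280", "$1,368.00"],
    ["POLLOCK", np.nan, "1,611", "$16.11"],
    ["WHAKE", np.nan, "441", "$308.70"],
])

# Every row has at least one NaN, so a plain dropna() removes all of them.
all_rows_gone = raw.dropna()
print(len(all_rows_gone))    # 0

# Keep only columns with at least len(raw) - 3 non-NaN values,
# and store the result instead of discarding it.
kept = raw.dropna(thresh=len(raw) - 3, axis=1)
print(list(kept.columns))    # [0, 2, 3]
```

On a frame like this, the thresh version is likely the one you want, since dropping every row that has any NaN leaves nothing behind.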