thresh in dropna for DataFrame in pandas in python

Tags:

pandas

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(15).reshape(5, 3))
df1.iloc[:4, 1] = np.nan
df1.iloc[:2, 2] = np.nan
df1.dropna(thresh=1, axis=1)

It seems that no column containing NaN values has been dropped:


    0     1     2
0   0   NaN   NaN
1   3   NaN   NaN
2   6   NaN   8.0
3   9   NaN  11.0
4  12  13.0  14.0

If I run


df1.dropna(thresh=2, axis=1)

why does it give the following?


    0     2
0   0   NaN
1   3   NaN
2   6   8.0
3   9  11.0
4  12  14.0

I just don't understand what thresh is doing here. If a column has more than one NaN value, shouldn't the column be dropped?


asked Jul 29 '18 22:07

1 Answer

thresh=N requires that a column have at least N non-NaN values to survive (with axis=1; with the default axis=0 the same rule applies to rows). In the first example, every column has at least one non-NaN value, so all of them survive. In the second example, column 2 has three non-NaN values and survives, but column 1 has only one and is dropped.
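To make the rule concrete, here is a minimal sketch that counts the non-NaN values per column and compares a manual filter against dropna. The helper keep_columns is hypothetical (it is not part of pandas), and the frame is built with a float dtype, which is an assumption made here so that assigning NaN does not rely on integer upcasting.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question. Float dtype is an assumption made here
# so that assigning NaN does not depend on pandas upcasting integer columns.
df1 = pd.DataFrame(np.arange(15, dtype=float).reshape(5, 3))
df1.iloc[:4, 1] = np.nan
df1.iloc[:2, 2] = np.nan

# Number of non-NaN values in each column: 5, 1 and 3.
non_nan_counts = df1.notna().sum()
print(non_nan_counts.tolist())


def keep_columns(df, n):
    """Hypothetical helper: keep columns with at least n non-NaN values."""
    return df.loc[:, df.notna().sum() >= n]


# For the thresholds used in the question, the manual filter selects
# the same columns as dropna(thresh=n, axis=1).
for n in (1, 2):
    assert keep_columns(df1, n).equals(df1.dropna(thresh=n, axis=1))
```

Reading thresh as a minimum count of valid values, rather than a maximum count of NaNs, is what explains both outputs in the question.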

Try setting thresh to 4 to get a better sense of what's happening.
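As a rough illustration of that suggestion, the sketch below sweeps thresh from 1 to 5 on the same data and records which columns survive. The float dtype and the name surviving are assumptions introduced for this example only.

```python
import numpy as np
import pandas as pd

# Same data as in the question; float dtype is an assumption so that
# NaN can be assigned without relying on integer upcasting.
df1 = pd.DataFrame(np.arange(15, dtype=float).reshape(5, 3))
df1.iloc[:4, 1] = np.nan
df1.iloc[:2, 2] = np.nan

# Columns 0, 1 and 2 hold 5, 1 and 3 non-NaN values respectively,
# so each column disappears once thresh exceeds its own count.
surviving = {
    n: df1.dropna(thresh=n, axis=1).columns.tolist()
    for n in range(1, 6)
}

for n, cols in surviving.items():
    print(f"thresh={n}: columns kept = {cols}")
```

With thresh=4, only column 0 remains, because it is the only column with at least four non-NaN values.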


answered Nov 15 '22 15:11

DYZ
