Python Pandas: get rows of a DataFrame where a column is not null

Tags: python, pandas, dataframe

I'm filtering my DataFrame, dropping the rows in which the value of a specific column is None.

df = df[df['my_col'].isnull() == False]

Works fine, but PyCharm tells me:

PEP8: comparison to False should be 'if cond is False:' or 'if not cond:'

But how should I apply this to my use case? Neither 'not ...' nor '... is False' worked. My current solution is:

df = df[df['my_col'].notnull()]

957

asked Apr 05 '18 13:04

Matthias

1 Answer

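Python has the logical operators not, and, or (the latter two short-circuit). These have a very specific meaning in Python and cannot be overloaded: not must return a bool, and a and/or b always returns either a or b, or raises an error. That is why not df['my_col'].isnull() fails: pandas refuses to reduce a whole Series to a single True or False and raises a ValueError.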
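A minimal sketch of why the two attempts from the question fail. The sample DataFrame is made up for illustration; only the column name my_col comes from the question.

```python
import pandas as pd

# Hypothetical sample data; 'my_col' is the column name used in the question.
df = pd.DataFrame({"my_col": [1.0, None, 3.0]})
mask = df["my_col"].isnull()

# `not` asks the Series for a single truth value, which pandas refuses to give.
try:
    not mask
    not_error = None
except ValueError as exc:
    not_error = str(exc)

# `is` tests object identity: a Series is never the singleton False,
# so the result is one plain Python bool, not an element-wise mask.
identity_result = mask is False

print(not_error)
print(identity_result)  # False
```

Indexing with that single bool, as in df[False], is then treated as a column lookup rather than a row filter, which is presumably why the 'is False' variant did not work either.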

However, Python also has overloadable operators that can play the boolean role: ~ (not), & (and), | (or) and ^ (xor).

You may recognise these as the int bitwise operators, but NumPy (and therefore pandas) uses them for element-wise boolean operations on arrays and Series.

For example

import numpy as np

b = np.array([True, False, True]) & np.array([True, False, False])
# b --> [True False False]
b = ~b
# b --> [False True True]
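The same operators work on pandas Series. The following is a small sketch with made-up values, showing two conditions combined into one mask. Note the parentheses around each comparison: & binds more tightly than > and <.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
s = pd.Series([1, 5, 10, 15])

# Parentheses are required because & has higher precedence than > and <.
in_range = (s > 2) & (s < 12)
outside = ~in_range

print(in_range.tolist())  # [False, True, True, False]
print(outside.tolist())   # [True, False, False, True]
```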

Hence what you want is

df = df[~df['my_col'].isnull()]

I agree with PEP 8: don't use == False.
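As a quick check, here is a sketch comparing the inverted mask with the notnull() version from the question, and with dropna(subset=...), which is another pandas way to express the same filter. The sample data and the extra column named other are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data; only 'my_col' comes from the question.
df = pd.DataFrame({"my_col": [1.0, None, 3.0], "other": ["a", "b", "c"]})

inverted = df[~df["my_col"].isnull()]    # the form suggested above
not_null = df[df["my_col"].notnull()]    # the asker's current solution
dropped = df.dropna(subset=["my_col"])   # drop rows with a missing 'my_col'

print(inverted.equals(not_null))  # True
print(inverted.equals(dropped))   # True
print(inverted.index.tolist())    # [0, 2]
```

All three select the same rows here, so the notnull() version from the question is just as valid, and it also avoids the PEP 8 warning.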

166

answered Nov 07 '22 12:11

FHTMitchell
