Pandas analogue of SQL's "NOT IN" operator

Tags:

python

pandas

Surprisingly, I can't find an analogue of SQL's "NOT IN" operator for pandas DataFrames.

import pandas as pd

A = pd.DataFrame({'a': [6, 8, 3, 9, 5],
                  'b': ['II', 'I', 'I', 'III', 'II']})

B = pd.DataFrame({'c': [1, 2, 3, 4, 5]})

I want all rows from A whose value in column a does not appear in B's column c. Something like:

A = A[ A.a not in B.c]

976

asked Apr 06 '17 13:04

Ladenkov Vladislav

1 Answer

I think you are really close: use isin and negate the resulting boolean mask with ~. You also don't need to convert to a list, because isin accepts the Series B.c directly:

print(~A.a.isin(B.c))
0     True
1     True
2    False
3     True
4    False
Name: a, dtype: bool

A = A[~A.a.isin(B.c)]
print(A)
   a    b
0  6   II
1  8    I
3  9  III
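
For reference, here is the same idea as a small self-contained sketch. The helper name rows_not_in is only an illustration, not part of pandas; it wraps the isin and ~ combination from above. The last two lines hint at why the original attempt cannot work: Python's in operator on a Series tests the index labels rather than the values.

```python
import pandas as pd

A = pd.DataFrame({'a': [6, 8, 3, 9, 5],
                  'b': ['II', 'I', 'I', 'III', 'II']})
B = pd.DataFrame({'c': [1, 2, 3, 4, 5]})


def rows_not_in(df, column, values):
    """Hypothetical helper: rows of df whose `column` value is absent from `values`."""
    mask = df[column].isin(values)  # True where the value IS present in `values`
    return df[~mask]                # ~ flips the mask, which gives "NOT IN"


result = rows_not_in(A, 'a', B['c'])
print(result)

# `in` on a Series looks at the index labels (0..4 here), not the values:
print(5 in B['c'])         # False, because 5 is not an index label
print(5 in B['c'].values)  # True, because 5 is one of the values
```

The original index labels (0, 1, 3) are kept in the result; calling reset_index(drop=True) on it renumbers the rows if that is preferred.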

175

answered Sep 20 '22 12:09

jezrael
