
Print the value of a column in a dataframe that is not contained in another dataframe

I have two dataframes:

import pandas as pd

df1 = pd.DataFrame({'System':['b0001','b0002']})
df2 = pd.DataFrame({'System':['b0001']})

I want to print the value in column System of df1 that is NOT contained in column System of df2. The output should only be:

b0002

My current code is:

for i in df1.index:
    if df1.System[i] not in df2.System:
        print (df1.System[i])

But the output is:

b0001 
b0002

I can't figure out why it still prints b0001. I've tried with isin and the output is the same.

Any help will be appreciated.

Yusef Jacobs asked Jan 05 '23 01:01


2 Answers

A pandas way of doing this is to use isin as follows:

df1[~df1.System.isin(df2.System)]

Output:

  System
1  b0002
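
If only the bare values are wanted, as in the question, the filtered column can be printed item by item. The following is a minimal sketch using the same two frames; the names mask and missing are illustrative and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'System': ['b0001', 'b0002']})
df2 = pd.DataFrame({'System': ['b0001']})

# Boolean mask: True where a df1.System value is NOT among the df2.System values
mask = ~df1['System'].isin(df2['System'])

# Keep only the unmatched values and convert them to a plain Python list
missing = df1.loc[mask, 'System'].tolist()

# Print each value on its own line, without the index or column header
for value in missing:
    print(value)
```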

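However, to make your loop work, you are missing .values. The in operator on a Series checks the index labels (0, 1, ...), not the values, so 'b0001' is never found in df2.System: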

for i in df1.index:
    if df1.System[i] not in df2.System.values:
        print (df1.System[i])

Output:

b0002
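
The difference between membership in a Series and membership in its values can be checked directly. This is a small sketch, assuming a Series with the default integer index; the variable names are illustrative.

```python
import pandas as pd

# A Series with the default integer index: label 0 -> value 'b0001'
s = pd.Series(['b0001'], name='System')

# `in` on a Series tests the index labels
label_hit = 0 in s                 # 0 is an index label
value_as_label = 'b0001' in s      # 'b0001' is a value, not a label

# `in` on the underlying array tests the values
value_hit = bool('b0001' in s.values)

print(label_hit, value_as_label, value_hit)
```

This is why the original loop printed both rows: neither 'b0001' nor 'b0002' is an index label of df2.System.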
Scott Boston answered Jan 14 '23 02:01


Using numpy's setdiff1d:

import numpy as np
np.setdiff1d(df1.System.values, df2.System.values)

array(['b0002'], dtype=object)
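
One thing to keep in mind is that np.setdiff1d returns sorted, unique values, so duplicates and the original row order are not preserved. A small sketch with made-up extra rows (the values 'b0003' and the repeated 'b0002' are assumptions added for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data with a duplicate and unsorted order
df1 = pd.DataFrame({'System': ['b0003', 'b0002', 'b0002', 'b0001']})
df2 = pd.DataFrame({'System': ['b0001']})

# Values of df1.System that do not occur in df2.System, sorted and de-duplicated
diff = np.setdiff1d(df1['System'].to_numpy(), df2['System'].to_numpy())

for value in diff:
    print(value)
```

If duplicates or the original order matter, the isin approach keeps the rows as they appear in df1.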
piRSquared answered Jan 14 '23 02:01