hsp.loc[hsp['Len_old'] == hsp['Len_new']]
I tried this code and it works.
But then I tried these three:
hsp.loc[hsp['Type_old'] == hsp['Type_new']]
hsp.loc[hsp['Type_old'] != hsp['Type_new']]
hsp.loc[hsp['Len_old'] != hsp['Len_new']]
They are not working.
My data table hsp is like
id Type_old Type_new Len_old Len_new
1 Num Num 15 15
2 Num Char 12 12
3 Char Num 10 8
4 Num Num 4 5
5 Char Char 9 10
Is there a better approach to select rows where two columns are not equal?
Use the complement operator ~
hsp.loc[~(hsp['Type_old'] == hsp['Type_new'])]
which gives:
id Type_old Type_new Len_old Len_new
1 2 Num Char 12 12
2 3 Char Num 10 8
When dealing with Boolean operations, the complement operator is a handy way to swap True and False.
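As a minimal sketch of the idea, assuming the table from the question is rebuilt by hand (the variable names same_type and mismatched are only illustrative), the Boolean mask can be built once and then inverted with ~:

```python
import pandas as pd

# The question's table, rebuilt by hand for illustration.
hsp = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "Type_old": ["Num", "Num", "Char", "Num", "Char"],
    "Type_new": ["Num", "Char", "Num", "Num", "Char"],
    "Len_old": [15, 12, 10, 4, 9],
    "Len_new": [15, 12, 8, 5, 10],
})

# Element-wise comparison gives a Boolean Series (the mask).
same_type = hsp["Type_old"] == hsp["Type_new"]

# ~ flips every True/False in the mask, so this keeps the mismatched rows.
mismatched = hsp.loc[~same_type]
print(mismatched)
```

Note that the parentheses in ~(a == b) matter when the comparison is written inline, because ~ binds more tightly than ==.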
Ways to be confused by == versus != when comparing pd.Series
As expected
df[['Len_old', 'Len_new']].assign(NE=df.Len_old != df.Len_new)
Len_old Len_new NE
0 15 15 False
1 12 12 False
2 10 8 True
3 4 5 True
4 9 10 True
But what if one of the columns held strings?
df[['Len_old', 'Len_new']].assign(NE=df.Len_old.astype(str) != df.Len_new)
Len_old Len_new NE
0 15 15 True
1 12 12 True
2 10 8 True
3 4 5 True
4 9 10 True
Make sure both are the same types.
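One possible way to do that, sketched here under the assumption that Len_old was read in as strings while Len_new is numeric (the question does not show the actual dtypes), is to check the dtypes and convert both columns with pd.to_numeric before comparing:

```python
import pandas as pd

# Assumed reproduction of the problem: Len_old holds strings, Len_new integers.
hsp = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "Len_old": ["15", "12", "10", "4", "9"],
    "Len_new": [15, 12, 8, 5, 10],
})

# Inspect the dtypes first; a mismatch here explains surprising results.
print(hsp.dtypes)

# Bring both columns to a numeric dtype, then compare.
len_old = pd.to_numeric(hsp["Len_old"])
len_new = pd.to_numeric(hsp["Len_new"])

different = hsp.loc[len_old != len_new]
print(different)  # rows with id 3, 4 and 5
```

If the columns are meant to be text instead, casting both with .astype(str) would be the equivalent step.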
Your code, as piRSquared said, had an issue with types.
Besides that, you could use the comparison methods, in this case pd.Series.ne.
Using your data:
hsp.loc[hsp['Type_old'].ne(hsp['Type_new'])]
But again, as piRSquared mentioned, because of dtypes it didn't work.
Also, be careful with NaN/None values in your data, for example:
hsp.loc[(hsp['Type_old'].ne(hsp['Type_new'])) & (hsp['Type_old'].notna())]
Note that .ne also takes another argument, fill_value, which fills in missing data before the comparison.
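To illustrate how missing values interact with these options, here is a small sketch on made-up numeric data (the Series old and new and the placeholder value -1 are assumptions for the example, not part of the question):

```python
import numpy as np
import pandas as pd

# Made-up data with gaps: row 1 is missing on one side, row 3 on both.
old = pd.Series([15.0, np.nan, 10.0, np.nan])
new = pd.Series([15.0, 12.0, 8.0, np.nan])

# Plain .ne: NaN is "not equal" to everything, including another NaN.
plain = old.ne(new)

# Only flag rows where both values are present and actually differ.
both_present = old.ne(new) & old.notna() & new.notna()

# fill_value substitutes -1 where exactly one side is missing.
filled = old.ne(new, fill_value=-1)

print(pd.DataFrame({"old": old, "new": new, "plain": plain,
                    "both_present": both_present, "filled": filled}))
```

Which variant is appropriate depends on whether a missing value should count as a difference in your data.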
In addition, you could use the compare method to show the differences between two Series (or DataFrames):
hsp.Len_old.compare(hsp.Len_new)
And it might return (if columns were of the same dtype):
self other
2 10.0 8.0
3 4.0 5.0
4 9.0 10.0
But if one column is cast to a different dtype:
hsp.Len_old.compare(hsp.Len_new.astype('str')) # string type new column
It will return all rows:
self other
0 15 15
1 12 12
2 10 8
3 4 5
4 9 10