I do not see this in the SQL comparison documentation for Pandas. What would be the equivalent of this SQL in Pandas?
select a.var1, a.var2, b.var1, b.var2
from tablea a, tableb b
where a.var1=b.var1
and a.var2=b.var2
and a.var3 <> b.var3
I have the merge code as follows:
df = pd.merge(a, b, on=['VAR1','VAR2'], how='inner')
How do I incorporate the 'not equal' portion?
and a.var3 <> b.var3
The SQL Not Equal comparison operator (!=) is used to compare two expressions. For example, 15 !=
This main difference can mean that the two tools are separate; however, you can also perform several of the same functions in each tool. For example, you can create new features from existing columns in pandas, perhaps more easily and faster than in SQL.
Pandasql is a Python library that allows manipulation of a Pandas DataFrame using SQL. Under the hood, Pandasql creates an SQLite table from the Pandas DataFrame of interest and allows users to query the SQLite table using SQL.
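To illustrate the idea of running SQL against DataFrames through SQLite, here is a minimal sketch that uses only pandas and the standard-library sqlite3 module. It is not Pandasql's actual implementation, just the same general approach done by hand. The sample data and the output column aliases (VAR3_a, VAR3_b) are made up for illustration; the table and column names follow the question.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data; only the column names come from the question.
a = pd.DataFrame({
    "VAR1": [1, 1, 2, 3],
    "VAR2": ["x", "y", "x", "z"],
    "VAR3": [10, 20, 30, 40],
})
b = pd.DataFrame({
    "VAR1": [1, 1, 2, 4],
    "VAR2": ["x", "y", "x", "z"],
    "VAR3": [10, 99, 31, 40],
})

# Copy both frames into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
a.to_sql("tablea", conn, index=False)
b.to_sql("tableb", conn, index=False)

# The query from the question, with VAR3 from each side added so the
# inequality is visible in the output.
sql = """
select a.VAR1, a.VAR2, a.VAR3 as VAR3_a, b.VAR3 as VAR3_b
from tablea a, tableb b
where a.VAR1 = b.VAR1
  and a.VAR2 = b.VAR2
  and a.VAR3 <> b.VAR3
"""
sql_result = pd.read_sql_query(sql, conn)
conn.close()

print(sql_result)
```

This keeps the original SQL untouched, at the cost of copying the data into SQLite first, so for large frames a native pandas merge is likely the better fit.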
You can query the resulting frame:
a.merge(b, on=['VAR1','VAR2']).query('VAR3_x != VAR3_y')
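Here is a small runnable sketch of that approach. The sample data is made up for illustration; the column names follow the question. After the merge, the non-key column VAR3 appears twice, and pandas adds the default suffixes _x (left frame) and _y (right frame), which is where VAR3_x and VAR3_y come from.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
a = pd.DataFrame({
    "VAR1": [1, 1, 2, 3],
    "VAR2": ["x", "y", "x", "z"],
    "VAR3": [10, 20, 30, 40],
})
b = pd.DataFrame({
    "VAR1": [1, 1, 2, 4],
    "VAR2": ["x", "y", "x", "z"],
    "VAR3": [10, 99, 31, 40],
})

# Equality conditions go into the merge keys (inner join is the default).
merged = a.merge(b, on=["VAR1", "VAR2"], how="inner")

# The "not equal" condition is applied as a filter on the merged frame.
result = merged.query("VAR3_x != VAR3_y")

# Equivalent filter written as a boolean mask instead of a query string.
same_result = merged[merged["VAR3_x"] != merged["VAR3_y"]]

print(result)
```

One difference from SQL to keep in mind: in pandas, NaN != NaN evaluates to True, so rows where VAR3 is missing on both sides are kept by this filter, whereas SQL's NULL <> NULL would drop them.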