
.isin() with a column from a dataframe

How can I query a table using isin() with another dataframe? For example there is this dataframe, df1:

| id      | rank |
|---------|------|
| SE34SER | 1    |
| SEF3445 | 2    |
| 5W4G4F  | 3    |

I want to query a table and keep only the rows where a column's value is in df1.id. I tried this:

t = (
    spark.table('mytable')
    .where(sf.col('id').isin(df1.id))
    .select('*')
).show()

However, it raises an error:

AttributeError: 'NoneType' object has no attribute 'id'

cs_guy asked Feb 01 '26 09:02

1 Answer

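Unfortunately, you can't pass another dataframe's column to the isin() method. You can collect the values of that column into a Python list and pass the list to isin(), but this is not a good approach when the dataframe is large, because every value has to be brought back to the driver.

Note also that the AttributeError itself means df1 is None at that point. That typically happens when a variable is assigned the result of .show(), which prints the rows and returns None.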

Instead, you can do an inner join between the two dataframes:

df2 = spark.table('mytable')
# Joining on the column name keeps a single 'id' column in the result.
result = df2.join(df1.select('id'), on='id', how='inner')
Mohana B C answered Feb 03 '26 23:02
