I have a dataframe with two columns (one string and one array of strings):
root
|-- user: string (nullable = true)
|-- users: array (nullable = true)
| |-- element: string (containsNull = true)
How can I filter the dataframe so that the result contains only the rows where user is in users?
Quick and simple:
import org.apache.spark.sql.functions.expr
df.where(expr("array_contains(users, user)"))
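For anyone who wants to try this end to end, here is a minimal sketch, assuming a local SparkSession. The sample rows and the object name ArrayContainsExample are made up for illustration; only the column names user and users come from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Hypothetical driver object; not part of the original question.
object ArrayContainsExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is acceptable for testing.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("array-contains-example")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data matching the schema in the question:
    // user: string, users: array<string>
    val df = Seq(
      ("alice", Seq("alice", "bob")),  // user appears in users
      ("carol", Seq("dave", "erin")),  // user does not appear in users
      ("frank", Seq.empty[String])     // empty array, so no match
    ).toDF("user", "users")

    // Keep only the rows whose user value occurs in that row's users array.
    val matched = df.where(expr("array_contains(users, user)"))

    matched.show(truncate = false)

    spark.stop()
  }
}
```

With this sample data only the alice row should remain. Writing the predicate as a SQL expression lets both arguments be column references, so the comparison is done per row.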