What is the PySpark equivalent of the SQL LIKE operator? For example, I would like to do:
SELECT * FROM table WHERE column LIKE "*somestring*";
I am looking for something simple like this (but it is not working):
df.select('column').where(col('column').like("*s*")).show()
In Spark and PySpark, the like() function behaves like the SQL LIKE operator: it matches strings against wildcard patterns (% for any sequence of characters, _ for a single character) and is used to filter rows. You can apply it in single or multiple filter conditions, use it to derive a new column, or use it inside when().
when() is a SQL function that returns a Column, and otherwise() is a method of Column; if otherwise() is omitted, non-matching rows get None/NULL. Together they mirror the SQL expression CASE WHEN cond1 THEN result WHEN cond2 THEN result ... ELSE result END.
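As a quick sketch of that combination (the DataFrame and column names here are invented for illustration), like() can drive a when()/otherwise() expression that derives a new column:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, when

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample data
df = spark.createDataFrame([("rose",), ("tulip",), ("fern",)], ["name"])

# like() inside when(); otherwise() supplies the value for
# non-matching rows (None/NULL if it were omitted)
df.withColumn("has_s", when(col("name").like("%s%"), "yes").otherwise("no")).show()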
You can use the where and col functions to do this. where filters rows based on a condition (here, whether a column is like '%string%'); col('col_name') refers to the column, and like is the operator:
df.where(col('col1').like("%string%")).show()
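For a self-contained run (the data below is made up for illustration), note that % is the LIKE wildcard for any sequence of characters; the * from the question has no special meaning in LIKE:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

# Hypothetical data: col1 holds arbitrary strings
df = spark.createDataFrame([("some string here",), ("no match",)], ["col1"])

# Keeps only the rows whose col1 contains the substring "string"
df.where(col("col1").like("%string%")).show()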
From Spark 2.0.0 onwards, the following also works:
df.select('column').where("column like '%s%'").show()
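If all you need is a plain substring match, Column.contains() and rlike() are alternatives to like(); a small sketch, assuming the same df with a column named 'column':

from pyspark.sql.functions import col

df.where(col("column").contains("s")).show()  # substring match, no wildcards needed
df.where(col("column").rlike("s")).show()     # regular-expression match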