 

Pyspark dataframe LIKE operator

What is the PySpark equivalent of the SQL LIKE operator? For example, I would like to do:

SELECT * FROM table WHERE column LIKE "*somestring*"; 

I'm looking for something simple like this (but it's not working):

df.select('column').where(col('column').like("*s*")).show()
asked Oct 24 '16 by Babu

People also ask

How do you use the LIKE operator in PySpark?

In Spark and PySpark, the like() function is similar to the SQL LIKE operator: it matches rows against wildcard characters (percent `%` for any sequence of characters, underscore `_` for a single character) to filter the rows. You can use it to filter DataFrame rows by single or multiple conditions, to derive a new column, or inside when().
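Note that SQL LIKE wildcards differ from the shell-style `*`/`?` globs the question tried. A minimal pure-Python sketch of the wildcard semantics, translating a LIKE pattern into an equivalent regular expression (the helper name `sql_like_to_regex` is made up for illustration):

```python
import re

def sql_like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into an anchored regex:
    '%' matches any sequence of characters, '_' matches exactly one."""
    out = []
    for ch in pattern:
        if ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            out.append(re.escape(ch))  # treat everything else literally
    return "^" + "".join(out) + "$"

# "%some%" matches any string containing "some", e.g. "awesome"
print(bool(re.fullmatch(sql_like_to_regex("%some%"), "awesome")))  # True
```

This is why `"*s*"` silently matches nothing: `*` is just a literal character to LIKE, so the pattern only matches strings that literally contain asterisks.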

How do you write if condition in PySpark?

PySpark when/otherwise: when() is a SQL function that returns a Column, and otherwise() is a method of Column; if otherwise() is not called, unmatched rows receive a None/NULL value. PySpark SQL CASE WHEN is the equivalent SQL expression, with the usage: CASE WHEN cond1 THEN result WHEN cond2 THEN result ... ELSE result END.


2 Answers

You can use the where and col functions to do this. where filters rows based on a condition (here, whether a column matches '%string%'); col('col_name') refers to the column, and like applies the pattern:

from pyspark.sql.functions import col

df.where(col('col1').like("%string%")).show()
answered Sep 26 '22 by braj


From Spark 2.0.0 onwards, the following also works fine:

df.select('column').where("column like '%s%'").show()

answered Sep 26 '22 by desaiankitb