
Spark DataFrame: filter column values starting with a string

I need to filter a DataFrame on the condition that a column's value starts with a predefined string.

I am trying the following:

    val domainConfigJSON = sqlContext.read
      .jdbc(url, "CONFIG", prop)
      .select("DID", "CONF", "KEY")
      .filter("key like 'config.*'")

and I get this exception:

Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'KEY = 'config.*'' at line 1

Using Spark 1.6.1.
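Two things appear to go wrong here. The fragment quoted in the error suggests the predicate was pushed down to MariaDB roughly as `KEY = 'config.*'`, which fails to parse because `KEY` is a reserved word in MySQL/MariaDB and would need backtick quoting. And even with quoting, `.*` is a regex wildcard; the SQL `LIKE` wildcard is `%`, so a prefix match is `` `KEY` LIKE 'config.%' ``. A minimal sketch of the `LIKE` semantics, using stdlib sqlite3 as a stand-in for MariaDB (the table and values are hypothetical):

```python
import sqlite3

# Hypothetical stand-in for the CONFIG table; sqlite3 is used here only to
# demonstrate SQL LIKE semantics ('%' behaves the same way in MariaDB).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE config ("KEY" TEXT)')
conn.executemany('INSERT INTO config VALUES (?)',
                 [("config.timeout",), ("config.retries",), ("other.flag",)])

# '.*' is a regex wildcard, not a SQL one, so LIKE 'config.*' would match
# only literal "config." followed by "*"-ish text. The SQL prefix wildcard
# is '%', hence LIKE 'config.%'.
rows = conn.execute(
    "SELECT \"KEY\" FROM config WHERE \"KEY\" LIKE 'config.%'"
).fetchall()
print(rows)  # [('config.timeout',), ('config.retries',)]
```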
Anush asked Aug 07 '17 17:08


People also ask

How do I find a string in a column in PySpark?

You can use the contains() function in Spark and PySpark to match rows whose column values contain a literal string.
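Per row, contains() is a literal (non-regex) substring test, the same check Python's `in` operator performs on a string. A plain-Python sketch of that per-row semantics (the column name and sample values are hypothetical):

```python
# What df.filter(col("KEY").contains("config")) keeps, expressed per row:
# a literal substring test, i.e. Python's `in` on each value.
values = ["config.timeout", "app.config.url", "retries"]
matches = [v for v in values if "config" in v]
print(matches)  # ['config.timeout', 'app.config.url']
```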

How do you get the first letter of a string in PySpark?

The first N characters of a column in PySpark are obtained using the substr() function.
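Note that substr() is 1-based: col("name").substr(1, 1) takes the first character. Python slicing expresses the same per-row extraction (the sample values are hypothetical):

```python
# col("KEY").substr(1, 1) in Spark is 1-based; per row it is the same as
# Python's s[:1], i.e. the first character of each value.
values = ["config.timeout", "other.flag"]
first_chars = [v[:1] for v in values]
print(first_chars)  # ['c', 'o']
```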

How do you check if a column contains a particular value in PySpark?

In Spark and PySpark, the contains() function matches column values that contain a literal string (it matches on part of the string); it is mostly used to filter rows of a DataFrame.


1 Answer

You can use the startsWith method of the Column class (PySpark spells it startswith):

    myDataFrame.filter(col("columnName").startsWith("PREFIX"))
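Applied to the question, the predicate would be col("KEY").startsWith("config."), so no SQL string has to be hand-written. Per row this is a literal prefix test, identical to Python's str.startswith; a plain-Python sketch with hypothetical key values:

```python
# Per-row semantics of col("KEY").startsWith("config."):
# a literal prefix test, identical to Python's str.startswith.
keys = ["config.timeout", "config.retries", "other.flag"]
config_keys = [k for k in keys if k.startswith("config.")]
print(config_keys)  # ['config.timeout', 'config.retries']
```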
jeanr answered Nov 02 '22 04:11