I need to filter a DataFrame on the condition that a column value must start with a predefined string.
I am trying the following:
val domainConfigJSON = sqlContext.read
  .jdbc(url, "CONFIG", prop)
  .select("DID", "CONF", "KEY")
  .filter("key like 'config.*'")
And I get this exception:
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'KEY = 'config.*'' at line 1
I am using Spark 1.6.1.
In Spark and PySpark, the contains() function on a column matches rows whose value contains a literal string (a match on part of the string); it is mostly used to filter rows of a DataFrame.
The first N characters of a column can be obtained with the substr() function.
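As a rough illustration of contains() and substr(), here is a minimal Scala sketch against the Spark 1.6 style API. The sample rows, the object name ContainsSubstrSketch, and the PREFIX column are made up for the example; only the column names DID, CONF and KEY come from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.col

object ContainsSubstrSketch {
  // Returns (keys containing "cache", first 6 characters of every key).
  def run(): (Seq[String], Seq[String]) = {
    val conf = new SparkConf().setAppName("contains-substr-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val sqlContext = new SQLContext(sc)
      import sqlContext.implicits._

      // Hypothetical rows standing in for the CONFIG table.
      val config = Seq(
        (1, "{}", "config.db.url"),
        (2, "{}", "config.cache.ttl"),
        (3, "{}", "other.setting")
      ).toDF("DID", "CONF", "KEY")

      // contains() keeps rows whose KEY has the literal substring anywhere.
      val withCache = config.filter(col("KEY").contains("cache"))

      // substr() is 1-based: position 1, length 6 gives the first 6 characters.
      val prefixes = config.withColumn("PREFIX", col("KEY").substr(1, 6))

      val containing = withCache.select("KEY").collect().map(_.getString(0)).sorted.toSeq
      val firstSix = prefixes.select("PREFIX").collect().map(_.getString(0)).sorted.toSeq
      (containing, firstSix)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (containing, firstSix) = run()
    println(containing.mkString(", "))
    println(firstSix.mkString(", "))
  }
}
```

Note that contains() matches anywhere in the value, so it is not a substitute for a prefix test when the string may also appear in the middle of a key.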
You can use the startsWith function of the Column class (in PySpark the method is spelled startswith):
import org.apache.spark.sql.functions.col
myDataFrame.filter(col("columnName").startsWith("PREFIX"))
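Applied to the question, a minimal sketch might look like the following. It builds a small in-memory DataFrame instead of reading over JDBC, so the rows and the object name PrefixFilterSketch are assumptions for illustration. It also shows the LIKE form with % as the wildcard, since * is not a wildcard in SQL LIKE patterns.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.col

object PrefixFilterSketch {
  // Returns the KEY values kept by startsWith and by like, respectively.
  def run(): (Seq[String], Seq[String]) = {
    val conf = new SparkConf().setAppName("prefix-filter-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val sqlContext = new SQLContext(sc)
      import sqlContext.implicits._

      // Hypothetical rows standing in for the CONFIG table from the question.
      val config = Seq(
        (1, "{}", "config.db.url"),
        (2, "{}", "config.cache.ttl"),
        (3, "{}", "other.setting")
      ).toDF("DID", "CONF", "KEY")

      // Prefix test through the Column API.
      val byStartsWith = config.filter(col("KEY").startsWith("config."))

      // Equivalent LIKE pattern: % matches any sequence of characters.
      val byLike = config.filter(col("KEY").like("config.%"))

      def keys(df: org.apache.spark.sql.DataFrame): Seq[String] =
        df.select("KEY").collect().map(_.getString(0)).sorted.toSeq

      (keys(byStartsWith), keys(byLike))
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (byStartsWith, byLike) = run()
    println(byStartsWith.mkString(", "))
    println(byLike.mkString(", "))
  }
}
```

Both filters keep the two keys that begin with "config." and drop "other.setting". With a JDBC source the filter may be pushed down to the database, so behaviour there can also depend on how the database treats the column name.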