 

Possible to filter Spark dataframe by ISNUMERIC function?

I have a DataFrame backed by a SQL table. I want to filter this DataFrame based on whether the value of a certain column is numeric.

val df = sqlContext.sql("select * from myTable");
val filter = df.filter("ISNUMERIC('col_a')");

I want filter to be a DataFrame containing only the rows of df where the value in col_a is numeric.

My current solution doesn't work. How can I achieve this?



2 Answers

You can filter rows with a regular expression on the column value:

// keep rows where col_a consists only of digits (i.e. a non-negative integer)
df.filter(row => row.getAs[String]("col_a").matches("""\d+"""))
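
The regex above only matches non-negative integers. If col_a can also hold signed or decimal values, a parse-based check works as well; here is a minimal sketch, assuming col_a is a string column (the isNumeric helper name is just for illustration):

import scala.util.Try

// true if the string parses as a Double (handles signs and decimal points)
val isNumeric = (s: String) => Try(s.toDouble).isSuccess

val numericRows = df.filter(row => isNumeric(row.getAs[String]("col_a")))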

Hope this helps!



You can cast the field in question to DECIMAL and inspect the result:

filter("CAST(col_a AS DECIMAL) IS NOT NULL")

Optionally, you can specify precision and scale to restrict valid values to a given number of total digits and decimal places:

filter("CAST(col_a AS DECIMAL(18,8)) IS NOT NULL")