
How to get the right substring using SQL in Spark 2.0

Tags:

apache-spark

For example, if a string column holds the value "2.450", I want to get the rightmost 2 characters, "50", from that column. How can I do this using SQL in Spark 2.0.1? I am running my SQL on a view created from a DataFrame:

mydf.createOrReplaceTempView("myview");
Dean Chen asked Jan 06 '23 05:01



2 Answers

Some people suggested referring to the HQL (Hive query language) documentation, so I tried substring with a negative start position, and it works. The solution is simple; what made it hard to find is that Spark SQL has no documentation of its own for this. I don't think that is a good idea, since it makes things harder for the many people who want to use Spark SQL.

scala> val df = spark.sql("select a, substring(a,-2) as v from cdr");
df: org.apache.spark.sql.DataFrame = [a: string, v: string]

scala> df.show()
+-----------+---+
|a          |  v|
+-----------+---+
|      4.531| 31|
|      4.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.531| 31|
|      3.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.431| 31|
|      1.531| 31|
|      1.633| 33|
|      1.531| 31|
|      3.531| 31|
|      1.531| 31|
|      3.531| 31|
|      1.531| 31|
|      4.531| 31|
+-----------+---+
only showing top 20 rows
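
To spell out what the negative argument does, as far as I can tell: a positive position is a 1-based start, and a negative one counts back from the end of the string, so -2 starts two characters before the end. Here is a minimal plain-Java sketch of that rule. It is only an illustration, not Spark's own code; the class and method names are made up, and the clamping of out-of-range positions is a choice made for this sketch rather than something I have verified against Spark.

```java
// Plain-Java illustration of the negative-position rule; runs without Spark.
class NegativeSubstringDemo {

    // Hypothetical helper: pos > 0 is a 1-based start, pos < 0 counts back from the end.
    static String substringFrom(String s, int pos) {
        int start;
        if (pos > 0) {
            start = pos - 1;          // 1-based position to 0-based index
        } else if (pos < 0) {
            start = s.length() + pos; // count back from the end
        } else {
            start = 0;                // treat 0 as the first character
        }
        // Clamp into range; this is an assumption of the sketch only.
        start = Math.max(0, Math.min(start, s.length()));
        return s.substring(start);
    }

    public static void main(String[] args) {
        System.out.println(substringFrom("2.450", -2)); // prints 50
        System.out.println(substringFrom("1.633", -2)); // prints 33
    }
}
```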
Dean Chen answered Feb 20 '23 08:02



You can also use a UDF (user-defined function) to get this result.

df = sqlCtx.sql("select getChar(column_name) from myview");

The code above calls a UDF named getChar() and passes it the chosen column of the view myview.

The UDF can do whatever computation is needed and return the last two characters of each value passed to it.

You also need to register the UDF under the name getChar (for example via sqlCtx.udf().register(...)) before running the query.

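// Requires: import org.apache.spark.sql.api.java.UDF1;
// Typed as String -> String because the column in the question holds strings.
public static UDF1<String, String> getChar = new UDF1<String, String>() {
    @Override
    public String call(String input) {
        // Nulls and values of two characters or fewer are returned unchanged.
        if (input == null || input.length() <= 2) {
            return input;
        }
        // Otherwise return the last two characters.
        return input.substring(input.length() - 2);
    }
};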
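
One caveat on the input type: the question describes a string column, so a UDF for it would ideally take a String rather than a numeric type. The small plain-Java check below suggests why. It assumes the value "2.450" from the question, and the class and method names are made up for illustration. It runs without Spark.

```java
// Shows why the UDF input type matters: parsing "2.450" as a float drops the trailing zero.
class UdfInputTypeDemo {

    // Hypothetical helper holding the logic a UDF body could use.
    static String lastTwo(String s) {
        if (s == null || s.length() <= 2) {
            return s;
        }
        return s.substring(s.length() - 2);
    }

    public static void main(String[] args) {
        String raw = "2.450";
        float parsed = Float.parseFloat(raw);

        System.out.println(lastTwo(raw));                    // prints 50
        System.out.println(lastTwo(Float.toString(parsed))); // prints 45
    }
}
```

So if the value reaches the UDF as a number, the answer comes back as "45" instead of the "50" the question asks for.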
PradhanKamal answered Feb 20 '23 09:02
