For example, if a string column holds a value like "2.450", I want to get the rightmost 2 characters, "50", from that column. How can I do this with SQL in Spark 2.0.1? I am running my SQL on a view created from a DataFrame:
mydf.createOrReplaceTempView("myview");
Some people suggested referring to the HQL (Hive) documentation, so I tried substring with a negative start position, and it works. The solution is simple, but it is hard to find because Spark SQL does not seem to document its built-in functions. I don't think that is a good situation for the many people who want to use Spark SQL.
scala> val df = spark.sql("select a, substring(a,-2) as v from cdr");
df: org.apache.spark.sql.DataFrame = [a: string, v: string]
scala> df.show()
+-----------+---+
|          a|  v|
+-----------+---+
|      4.531| 31|
|      4.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.531| 31|
|      3.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.531| 31|
|      1.431| 31|
|      1.531| 31|
|      1.633| 33|
|      1.531| 31|
|      3.531| 31|
|      1.531| 31|
|      3.531| 31|
|      1.531| 31|
|      4.531| 31|
+-----------+---+
only showing top 20 rows
You can also use a UDF (user-defined function) to get this result.
df = sqlCtx.sql("select getChar(column_name) from myview");
The query above calls a UDF named "getChar" and passes it a column from the view myview (column_name is a placeholder for your actual column).
The UDF does the computation and returns the last two characters of each value it receives.
You also need to register the UDF before running the query, through sqlCtx.udf().register, which takes the function name, the UDF object and its return DataType.
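// requires: import org.apache.spark.sql.api.java.UDF1;
public static UDF1<String, String> getChar = new UDF1<String, String>() {
    @Override
    public String call(String input) {
        // Return the last two characters; null or short values pass through unchanged.
        if (input == null || input.length() <= 2) return input;
        return input.substring(input.length() - 2);
    }
};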
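The logic such a UDF needs can be tried out without Spark. What follows is only a minimal sketch in plain Java; the class name LastChars and the helper right are hypothetical names made up for this illustration and are not part of any Spark API. It assumes the column values arrive as strings, as in the question.

```java
public class LastChars {
    // Hypothetical helper: returns the last n characters of s.
    // Null input returns null; strings shorter than n are returned unchanged.
    static String right(String s, int n) {
        if (s == null) {
            return null;
        }
        if (n <= 0) {
            return "";
        }
        return s.length() <= n ? s : s.substring(s.length() - n);
    }

    public static void main(String[] args) {
        System.out.println(right("2.450", 2)); // prints 50
        System.out.println(right("4.531", 2)); // prints 31
    }
}
```

For this particular case the built-in substring(a, -2) shown in the question is probably the simpler choice; a UDF is mainly worth it when the logic goes beyond what the built-in functions offer.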