
rank() function usage in Spark SQL

I need some pointers on using rank().

I have extracted a column from a dataset and need to rank its values.

Dataset<Row> inputDSAAcolonly = inputDataset.select("Colname");
Dataset<Row> DSColAwithIndex = inputDSAAcolonly.withColumn("df1Rank", rank());

DSColAwithIndex.show();

I can sort the column and then append an index column to get a rank, but I am curious to know the syntax and usage of rank().

Binu asked Mar 06 '17 04:03


1 Answer

rank() is a window function, so a window specification needs to be supplied through over():

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.rank

val w = Window.orderBy("date") // some window spec; rank() follows this ordering

val leadDf = inputDSAAcolonly.withColumn("df1Rank", rank().over(w))

Edit: Java version of the answer, since the OP is using Java

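import static org.apache.spark.sql.functions.rank;
import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.expressions.WindowSpec;

// colName holds the name of the column to order (and therefore rank) by
WindowSpec w = Window.orderBy(colName);
Dataset<Row> leadDf = inputDSAAcolonly.withColumn("df1Rank", rank().over(w));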
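
For comparison with the "sort and append an index" approach from the question: rank() gives tied values the same rank and then skips ahead, so it is not the same as a running index. The following is only a plain-Java sketch of that numbering rule, with no Spark involved; the class RankSketch and its rank method are hypothetical names made up for illustration and are not part of any Spark API.

```java
import java.util.Arrays;

class RankSketch {
    // Hypothetical helper: the rank of each value under ascending order.
    // Ties share a rank, and the next distinct value skips ahead,
    // which mirrors the numbering rule of SQL's rank().
    static int[] rank(int[] values) {
        int[] ranks = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            // rank = 1 + number of values strictly smaller than this one
            int smaller = 0;
            for (int v : values) {
                if (v < values[i]) {
                    smaller++;
                }
            }
            ranks[i] = smaller + 1;
        }
        return ranks;
    }

    public static void main(String[] args) {
        int[] values = {30, 10, 20, 10};
        // prints [4, 1, 3, 1]: both 10s get rank 1, and rank 2 is skipped
        System.out.println(Arrays.toString(rank(values)));
    }
}
```

A plain index over the sorted values would number the two 10s as 1 and 2 instead.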
mrsrinivas answered Oct 16 '22 20:10


