I have a dataframe called df with column named employee_id. I am doing:
df.registerTempTable("d_f")
val query = """SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f"""
val result = Spark.getSqlContext().sql(query)
But getting following issue. Any help?
[1.29] failure: ``union'' expected but `(' found
SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
^
java.lang.RuntimeException: [1.29] failure: ``union'' expected but `(' found
SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
Spark 2.0+
Spark 2.0 introduces native implementation of window functions (SPARK-8641) so HiveContext
should be no longer required. Nevertheless similar errors, not related to window functions, can be still attributed to the differences between SQL parsers.
Spark <= 1.6
Window functions have been introduced in Spark 1.4.0 and require HiveContext
to work. SQLContext
won't work here.
Be sure you you use Spark >= 1.4.0 and create the HiveContext
:
import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc)
Yes It is true,
I am using spark version 1.6.0 and there you need a HiveContext to implement 'dense_rank' method.
From Spark 2.0.0 on words there will be no more 'dense_rank' method.
So for Spark 1.4,1.6 <2.0 you should apply like this.
table hive_employees having three fields :: place : String, name : String, salary : Int
val conf = new SparkConf().setAppName("denseRank test")//.setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val hqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
val result = hqlContext.sql("select empid,empname, dense_rank() over(partition by empsalary order by empname) as rank from hive_employees")
result.show()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With