 

How to use CROSS JOIN and CROSS APPLY in Spark SQL

I am very new to Spark and Scala, and I am writing Spark SQL code. I need to apply CROSS JOIN and CROSS APPLY in my logic. Here is the SQL query that I have to convert to Spark SQL.

select Table1.Column1,Table2.Column2,Table3.Column3
from Table1 CROSS JOIN Table2 CROSS APPLY Table3

I need to convert the above query to run through SQLContext in Spark SQL. Kindly help me. Thanks in advance.

Miruthan asked Dec 15 '22

2 Answers

First, set the property below in the Spark configuration:

spark.sql.crossJoin.enabled=true

then dataFrame1.join(dataFrame2) will perform a cross (Cartesian) join.

We can also use the query below to do the same:

sqlContext.sql("select * from table1 CROSS JOIN table2 CROSS JOIN table3...")
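A minimal end-to-end sketch of this approach, using the Spark 2.x SparkSession (which wraps the older SQLContext); the table names match the question, but the sample data and column contents are hypothetical placeholders. Note that Spark SQL has no CROSS APPLY keyword; a CROSS APPLY against a plain, uncorrelated table behaves like a CROSS JOIN, so both clauses become CROSS JOIN here:

```scala
import org.apache.spark.sql.SparkSession

// In Spark 2.x, SparkSession replaces SQLContext for running SQL
val spark = SparkSession.builder()
  .appName("CrossJoinExample")
  .master("local[*]")
  .config("spark.sql.crossJoin.enabled", "true")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for Table1/Table2/Table3
val table1 = Seq("a", "b").toDF("Column1")
val table2 = Seq(1, 2).toDF("Column2")
val table3 = Seq(true).toDF("Column3")

table1.createOrReplaceTempView("Table1")
table2.createOrReplaceTempView("Table2")
table3.createOrReplaceTempView("Table3")

// CROSS APPLY Table3 (no correlation) is equivalent to CROSS JOIN Table3
val result = spark.sql(
  """SELECT Table1.Column1, Table2.Column2, Table3.Column3
    |FROM Table1 CROSS JOIN Table2 CROSS JOIN Table3""".stripMargin)

result.show()  // 2 x 2 x 1 = 4 rows
```

On Spark 1.x the equivalent calls are new SQLContext(sc), registerTempTable, and sqlContext.sql.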
SanthoshPrasad answered Dec 21 '22

Set the Spark configuration:

val sparkConf = new SparkConf()
  .set("spark.sql.crossJoin.enabled", "true")

Explicit cross join in Spark 2.x using the crossJoin method:

def crossJoin(right: Dataset[_]): DataFrame

val df_new = df1.crossJoin(df2)

Note: cross joins are among the most expensive joins and should usually be avoided.

Swadeshi answered Dec 21 '22