 

How to perform update in Apache Spark SQL

I have to update a JavaSchemaRDD with some new values, subject to some WHERE conditions.

This is the SQL query which I want to convert into Spark SQL:

UPDATE t1
  SET t1.column1 = '0', t1.column2 = 1, t1.column3 = 1    
  FROM TABLE1 t1
  INNER JOIN TABLE2 t2 ON t1.id_column = t2.id_column     
  WHERE (t2.column1 = 'A') AND (t2.column2 > 0)   
asked Nov 10 '22 by Shekar Patel

1 Answer

Yes, I found the solution myself. I achieved this using Spark core only; I did not use Spark SQL for it. I have two RDDs (they can also be thought of as tables or datasets), t1 and t2. Looking at the query in my question, I am updating t1 based on one join condition and two WHERE conditions, which means I need three columns (id_column, column1 and column2) from t2. So I collected those columns into three separate collections. Then I iterate over the first RDD, t1, and during the iteration I apply the three conditions (one join and two WHERE conditions) using Java "if" statements. Based on the result of the "if" conditions, the values of the first RDD get updated.
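For illustration, here is a minimal Java sketch of that approach. The row classes T1Row and T2Row and their field names are hypothetical stand-ins for the real schema, and it assumes the two WHERE conditions can be evaluated on t2 before the join, so that only the matching id_column values need to be collected. Since RDDs are immutable, the "update" produces a new RDD rather than modifying t1 in place.

import org.apache.spark.api.java.JavaRDD;

import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

public class UpdateWithRdd {

    // Hypothetical row class for TABLE1
    public static class T1Row implements Serializable {
        public String idColumn;
        public String column1;
        public int column2;
        public int column3;

        public T1Row(String idColumn, String column1, int column2, int column3) {
            this.idColumn = idColumn;
            this.column1 = column1;
            this.column2 = column2;
            this.column3 = column3;
        }
    }

    // Hypothetical row class for TABLE2
    public static class T2Row implements Serializable {
        public String idColumn;
        public String column1;
        public int column2;

        public T2Row(String idColumn, String column1, int column2) {
            this.idColumn = idColumn;
            this.column1 = column1;
            this.column2 = column2;
        }
    }

    public static JavaRDD<T1Row> update(JavaRDD<T1Row> t1, JavaRDD<T2Row> t2) {
        // Collect the ids of the t2 rows that satisfy the WHERE conditions
        // (t2.column1 = 'A' AND t2.column2 > 0) into a local collection.
        Set<String> matchingIds = new HashSet<>(
                t2.filter(row -> "A".equals(row.column1) && row.column2 > 0)
                  .map(row -> row.idColumn)
                  .collect());

        // Iterate over t1; when the join condition (matching id_column) holds,
        // apply the SET clause, otherwise keep the row unchanged.
        return t1.map(row -> {
            if (matchingIds.contains(row.idColumn)) {
                return new T1Row(row.idColumn, "0", 1, 1);
            }
            return row;
        });
    }
}

Collecting the matching ids like this is only reasonable when that set fits comfortably in memory; for larger tables, joining the two RDDs on id_column would be the safer route.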

answered Nov 15 '22 by Shekar Patel