I'm using PySpark, and I have a Spark DataFrame whose contents I insert into a MySQL table:
url = "jdbc:mysql://hostname/myDB?user=xyz&password=pwd"
df.write.jdbc(url=url, table="myTable", mode="append")
I want to update a column (one that is not part of the primary key) by adding a specific number to its current value.
I've tried the DataFrameWriter.jdbc() function with different modes (append, overwrite).
My question is: how can I update a column value, the way ON DUPLICATE KEY UPDATE does in MySQL, while inserting the PySpark DataFrame data into the table?
A workaround is to insert the data into a staging table and then migrate it into the final table with a SQL statement executed by the driver program. Then you can use any valid SQL syntax supported by your database provider.
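Here is a minimal sketch of that workaround, assuming a staging table named myTable_staging, columns id and counter, and the mysql-connector-python package available on the driver (these names are illustrative, not from your post):

import mysql.connector

# 1. Write the DataFrame to a staging table, replacing any previous contents.
df.write.jdbc(url=url, table="myTable_staging", mode="overwrite")

# 2. From the driver, upsert the staged rows into the final table.
#    ON DUPLICATE KEY UPDATE here adds the incoming value to the existing one;
#    replace VALUES(counter) with a literal to add a fixed number instead.
conn = mysql.connector.connect(host="hostname", database="myDB",
                               user="xyz", password="pwd")
cursor = conn.cursor()
cursor.execute("""
    INSERT INTO myTable (id, counter)
    SELECT id, counter FROM myTable_staging
    ON DUPLICATE KEY UPDATE counter = counter + VALUES(counter)
""")
conn.commit()
cursor.close()
conn.close()

Because the upsert runs as a single statement on the MySQL side, existing rows are updated and new rows are inserted in one pass.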