I wonder whether I can use an UPDATE query in Spark SQL, like this:
sqlContext.sql("update users set name = '*' where name is null")
I got the error:
org.apache.spark.sql.AnalysisException:
Unsupported language features in query:update users set name = '*' where name is null
Does Spark SQL not support UPDATE queries, or am I writing the code incorrectly?
In Spark, updating a DataFrame column can be done with the withColumn() transformation. Below I explain how to update or change a DataFrame column, including how to update it based on a condition.
You can update a PySpark DataFrame column using withColumn(), select(), or sql(). Since DataFrames are distributed, immutable collections, you can't really change column values in place; instead, withColumn() (or any of these approaches) returns a new DataFrame with the updated values.
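Here is a minimal PySpark sketch of such a conditional update. The sample data and the "users"-style columns are made up for illustration; only the null-to-'*' logic comes from the question:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import when, col, lit

    spark = SparkSession.builder.appName("update-column").getOrCreate()

    # Hypothetical stand-in for the "users" table
    df = spark.createDataFrame([(1, "alice"), (2, None)], ["id", "name"])

    # withColumn() returns a NEW DataFrame; df itself is unchanged
    updated = df.withColumn(
        "name",
        when(col("name").isNull(), lit("*")).otherwise(col("name"))
    )
    updated.show()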
One possible approach to inserting or updating records in a database from a Spark DataFrame is to first write the DataFrame to a CSV file and then stream the CSV into the database row by row (streaming prevents out-of-memory errors when the CSV file is too large).
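A rough sketch of that idea, assuming df is the DataFrame you want to persist, SQLite as a stand-in database, and hypothetical paths (the ON CONFLICT upsert needs SQLite 3.24+; swap in your real database driver and table):

    import csv
    import glob
    import sqlite3

    # Step 1: export the DataFrame as CSV (one or more part files)
    df.write.mode("overwrite").csv("/tmp/users_export", header=True)

    # Step 2: stream the CSV into the database row by row,
    # so the whole file never has to fit in memory
    conn = sqlite3.connect("users.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)"
    )
    for path in glob.glob("/tmp/users_export/part-*.csv"):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                # Insert new ids, update the name for existing ones
                conn.execute(
                    "INSERT INTO users (id, name) VALUES (?, ?) "
                    "ON CONFLICT(id) DO UPDATE SET name = excluded.name",
                    (row["id"], row["name"] or None),  # "" back to NULL
                )
    conn.commit()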
Update a table. You can update data that matches a predicate in a Delta table. For example, to fix a spelling mistake in the eventType column, you can run an update like the one below.
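The linked Delta docs show this in Scala; here is a rough Python equivalent using the delta-spark DeltaTable API. The /data/events/ path and the 'clck' → 'click' values follow the docs example; your table path will differ, and spark is assumed to be an existing Delta-enabled SparkSession:

    from delta.tables import DeltaTable

    deltaTable = DeltaTable.forPath(spark, "/data/events/")

    # Rewrite the rows matching the predicate: fix the misspelled eventType.
    # The set value is a SQL expression string, hence the nested quotes.
    deltaTable.update(
        condition="eventType = 'clck'",
        set={"eventType": "'click'"}
    )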
Spark SQL now supports UPDATE, DELETE, and similar data-modification operations, provided the underlying table is stored in Delta format.
Check this out: https://docs.delta.io/0.4.0/delta-update.html#update-a-table
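With that in place, the exact UPDATE from the question works once users is a Delta table. A sketch, assuming Spark 3 with Delta Lake 0.7+ and the documented Delta session extensions; the sample data is hypothetical:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("delta-sql-update")
        .config("spark.sql.extensions",
                "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .getOrCreate()
    )

    # Save a hypothetical users table in Delta format
    df = spark.createDataFrame([(1, "alice"), (2, None)], ["id", "name"])
    df.write.format("delta").mode("overwrite").saveAsTable("users")

    # The UPDATE from the question now succeeds
    spark.sql("UPDATE users SET name = '*' WHERE name IS NULL")
    spark.table("users").show()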