I have two Spark dataframes. One of them was read from a Hive table using HiveContext:
spark_df1 = hc.sql("select * from testdb.titanic_pure_data_test")
I got the second Spark dataframe from a .csv file:
lines = sc.textFile("hdfs://HDFS-1/home/testdb/1500000_Sales_Records.csv").map(lambda line: line.split(","))
spark_df_test = lines.toDF(['Region','Country','Item_Type','Sales_Channel','Order_Priority','Order_Date','Order_ID','Ship_Date','Units_Sold','Unit_Price','Unit_Cost','Total_Revenue','Total_Cost','Total_Profit'])
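As a side note on the parsing step above: splitting on "," breaks any field that itself contains a comma inside quotes. A minimal sketch of a more tolerant line parser, using only the standard csv module, is shown below. The helper name parse_csv_line and the sample row are hypothetical and only for illustration; whether the actual file contains quoted fields is an assumption.

```python
import csv


def parse_csv_line(line):
    """Parse a single CSV line, keeping quoted fields with commas intact."""
    # csv.reader expects an iterable of lines, so wrap the line in a list
    # and take the first (and only) parsed row.
    return next(csv.reader([line]))


# Hypothetical sample row: the quoted country name contains a comma.
row = parse_csv_line('Asia,"Korea, Rep.",Snacks')
print(row)  # ['Asia', 'Korea, Rep.', 'Snacks']
```

If this fits your data, the helper could be passed to the RDD in place of the lambda, for example sc.textFile(path).map(parse_csv_line), assuming the function is defined in the driver script so Spark can ship it to the executors.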
I want to save either dataframe as a Hive table:
spark_df1.write.mode("overwrite").format("orc").saveAsTable("testdb.new_res5")
The first dataframe was saved without problems, but when I try to save the second dataframe (spark_df_test) in the same way, I get this error:
File "/home/jup-user/testdb/scripts/caching.py", line 90, in <module>
    spark_df_test.write.mode("overwrite").format("orc").saveAsTable("testdb.new_res5")
File "/data_disk/opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.4/lib/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 435, in saveAsTable
File "/data_disk/opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.4/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/data_disk/opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.4/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 51, in deco
pyspark.sql.utils.AnalysisException: 'Specifying database name or other qualifiers are not allowed for temporary tables. If the table name has dots (.) in it, please quote the table name with backticks (`).;'
The problem is that you are trying to overwrite the same Hive table with a different dataframe. This can't be done in Spark right now.
The reason is a check in Spark's code that throws an exception if the table already exists. The ideal way is to save the dataframe in a new table:
spark_df_test.write.mode("overwrite").format("orc").saveAsTable("testdb.new_res6")
Or you can use insertInto. First, save the dataframe to a temporary table:
spark_df_test.write.mode("overwrite").saveAsTable("temp_table")
Then you can overwrite the rows in your target table:
temp_table = sqlContext.table("temp_table")
# In PySpark, insertInto takes the overwrite flag as an argument
temp_table.write.insertInto("testdb.new_res5", overwrite=True)
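The two steps above can be wrapped in one small helper. This is only a sketch under stated assumptions: the function name overwrite_via_staging and its parameters are hypothetical, df is assumed to be a Spark dataframe, and sql_context is assumed to be the context that can see the target table (for example the HiveContext from the question).

```python
def overwrite_via_staging(df, sql_context, target_table, staging_table="temp_table"):
    """Write df to a staging table, then overwrite target_table's rows from it.

    Hypothetical helper: df and sql_context are expected to behave like a
    Spark DataFrame and a SQLContext/HiveContext respectively.
    """
    # Step 1: materialize the dataframe under an unqualified staging name.
    df.write.mode("overwrite").saveAsTable(staging_table)

    # Step 2: read the staging table back through the context.
    staged = sql_context.table(staging_table)

    # Step 3: replace the rows of the existing target table.
    staged.write.insertInto(target_table, overwrite=True)
    return staged
```

It would be called as overwrite_via_staging(spark_df_test, hc, "testdb.new_res5"). Keep in mind that insertInto requires the target table to exist already and matches columns by position, not by name, so the staging dataframe must have the same column order as the target table.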