Azure Databricks - Can not create the managed table The associated location already exists

Question

I have the following problem in Azure Databricks. Sometimes when I try to save a DataFrame as a managed table:

SomeData_df.write.mode('overwrite').saveAsTable("SomeData")

I get the following error:

"Can not create the managed table('SomeData'). The associated location('dbfs:/user/hive/warehouse/somedata') already exists.;"

I used to fix this problem by running a %fs rm command to remove that location, but I am now using a cluster managed by a different user and can no longer run rm on that location.

For now, the only fix I can think of is using a different table name.

What makes things even more peculiar is the fact that the table does not exist. When I run:

%sql
SELECT * FROM SomeData

I get the error:

Error in SQL statement: AnalysisException: Table or view not found: SomeData;

How can I fix it?

char · Accepted Answer

It seems a few others have run into the same issue.

A temporary workaround is to use

dbutils.fs.rm("dbfs:/user/hive/warehouse/somedata/", True)

to remove the leftover table directory before re-creating the table.
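As a rough sketch of that workaround in Python, the helper below derives the directory from the table name and removes it. The function names are hypothetical, the warehouse root is taken from the error message in the question, and the lower-casing is an assumption based on the path shown there. The `fs` argument is expected to be `dbutils.fs` in a notebook, and the caller still needs permission to delete that location.

```python
def default_table_location(table_name, warehouse_root="dbfs:/user/hive/warehouse"):
    """Build the directory a managed table in the default database would use.

    Assumption: the table name is lower-cased in the path, as in the
    error message ('SomeData' -> '.../somedata').
    """
    return f"{warehouse_root.rstrip('/')}/{table_name.lower()}"


def remove_leftover_location(fs, table_name):
    """Recursively delete the leftover directory; returns (path, result of rm)."""
    path = default_table_location(table_name)
    # Second argument is the recurse flag; in Python it must be True, not true.
    removed = fs.rm(path, True)
    return path, removed


# In a Databricks notebook (not run here):
# remove_leftover_location(dbutils.fs, "SomeData")
```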

Mike · Answer

This generally happens when a cluster is shut down while writing a table. The recommended solution from the Databricks documentation:

Set the flag spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation to true. This flag deletes the _STARTED directory and returns the process to the original state. For example, you can set it in the notebook:

%py
spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation","true")
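Putting the flag together with the write from the question might look like the sketch below. The wrapper name `overwrite_managed_table` is hypothetical, and `spark` and the DataFrame are assumed to be the notebook's own objects. Since this is a legacy setting, it may not be accepted on every runtime version, so check it against the cluster you are using.

```python
FLAG = "spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation"


def overwrite_managed_table(spark, df, table_name):
    """Set the legacy flag, then overwrite the managed table."""
    # Allow the managed table to be created over a non-empty location.
    spark.conf.set(FLAG, "true")
    # Same write as in the question.
    df.write.mode("overwrite").saveAsTable(table_name)


# In a Databricks notebook (not run here):
# overwrite_managed_table(spark, SomeData_df, "SomeData")
```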

Tags:

apache-spark

hive

databricks

azure-data-lake

azure-databricks

BuahahaXD
