
Databricks drop a delta table?

How can I drop a Delta table in Databricks? I can't find anything about it in the docs. Is the only solution to delete the files inside the 'delta' folder with the magic command or dbutils?

%fs rm -r delta/mytable

EDIT:

For clarification, here is a very basic example.

Example:

# Create a DataFrame
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

cSchema = StructType([
    StructField("items", StringType()),
    StructField("number", IntegerType()),
])

test_list = [['furniture', 1], ['games', 3]]

df = spark.createDataFrame(test_list, schema=cSchema)

Then I save it as a Delta table:

df.write.format("delta").mode("overwrite").save("/delta/test_table")

Then, if I try to delete it, DROP TABLE or a similar statement doesn't work:

%SQL
DROP TABLE 'delta.test_table'

Other variants, such as drop table 'delta/test_table', don't work either.

Joanteixi asked Nov 22 '19 09:11


3 Answers

If you want to completely remove the table then a dbutils command is the way to go:

dbutils.fs.rm('/delta/test_table',recurse=True)

From my understanding, the Delta table you've saved is sitting in blob storage. Dropping the connected database table removes it from the database, but not from storage.
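As a rough sketch, the removal above could be wrapped in a small helper. The name remove_delta_path and the guard against deleting the storage root are hypothetical additions for illustration, not part of dbutils; the only real call used is dbutils.fs.rm with recurse=True, and dbutils is assumed to be the object a Databricks notebook provides.

```python
def remove_delta_path(dbutils, path):
    """Recursively delete the directory that holds a path-based Delta table.

    `dbutils` is assumed to be the Databricks notebook utility object.
    The helper name and the root guard are illustrative assumptions.
    """
    # Refuse an empty path or the root, so a typo cannot wipe everything.
    if path.strip("/") == "":
        raise ValueError("refusing to remove the storage root")
    # Deletes the data files and the _delta_log folder under `path`.
    return dbutils.fs.rm(path, recurse=True)
```

In a notebook this would be called as remove_delta_path(dbutils, "/delta/test_table"). Nothing is left to drop afterwards, because the table in the question was saved by path and never registered in a database.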

Papa_Helix answered Sep 20 '22 14:09


You can do that with a SQL command:

%sql
DROP TABLE IF EXISTS <database>.<table>
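
The same statement can be issued from Python. This is only a sketch: the helper names quote_identifier, drop_table_statement and drop_table are made up for illustration, and spark is assumed to be the notebook's SparkSession. Identifiers are quoted with backticks here; the single quotes used in the question turn the name into a string literal, which is one reason that attempt does not parse.

```python
def quote_identifier(name):
    """Wrap one identifier part in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def drop_table_statement(database, table):
    """Build a DROP TABLE IF EXISTS statement for <database>.<table>."""
    return (
        "DROP TABLE IF EXISTS "
        f"{quote_identifier(database)}.{quote_identifier(table)}"
    )


def drop_table(spark, database, table):
    """Run the statement; `spark` is assumed to be a SparkSession."""
    spark.sql(drop_table_statement(database, table))
```

For example, drop_table(spark, "default", "test_table") runs the statement against a table registered as default.test_table. This only helps once the table exists in a database; a table that was only saved to a path has no entry to drop.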

Preeti Joshi answered Sep 19 '22 14:09


In Databricks, tables come in two types: managed and unmanaged.

1. Managed: Databricks manages both the data and the metadata, and stores both in DBFS in your account.

2. Unmanaged: Databricks manages only the metadata; the data itself is not managed by Databricks.

So if you run DROP TABLE on a managed table, it drops the table and deletes the data as well. On an unmanaged table, DROP TABLE only removes the metadata (the pointer to the table location) and leaves the data in place, so you need to delete the data separately with rm commands.
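
A minimal sketch of that two-step cleanup, under some assumptions: the helper name drop_table_and_data is hypothetical, spark and dbutils are the objects a Databricks notebook provides, and the table type and location are read from the "Type" and "Location" rows that DESCRIBE TABLE EXTENDED returns for a registered table. Treat it as an illustration rather than a definitive implementation.

```python
def drop_table_and_data(spark, dbutils, table_name):
    """Drop a registered table; for an unmanaged one, also remove its files.

    Assumes `table_name` is a trusted, already-qualified name such as
    "default.test_table". Returns the table type that was found.
    """
    # DESCRIBE TABLE EXTENDED lists the columns first, then detail rows
    # such as "Type" and "Location" in the col_name/data_type columns.
    rows = spark.sql(f"DESCRIBE TABLE EXTENDED {table_name}").collect()
    info = {row["col_name"]: row["data_type"] for row in rows}
    table_type = info.get("Type")
    location = info.get("Location")

    # For a managed table this removes the metadata and the data.
    spark.sql(f"DROP TABLE IF EXISTS {table_name}")

    # For an unmanaged (EXTERNAL) table the files are still there.
    if table_type == "EXTERNAL" and location:
        dbutils.fs.rm(location, recurse=True)
    return table_type
```

The metadata is read before the drop because the location can no longer be looked up once the table entry is gone.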

For more info: https://docs.databricks.com/data/tables.html

Chandra Mouli Gupta answered Sep 20 '22 14:09