
How to drop a column from a Databricks Delta table?

I have recently started exploring Databricks and ran into a situation where I need to drop a column from a Delta table. When I worked with PostgreSQL, it was as easy as

ALTER TABLE main.metrics_table 
DROP COLUMN metric_1;

I looked through the Databricks documentation on DELETE, but it only covers deleting the rows that match a predicate.

I've also found docs on DROP DATABASE, DROP FUNCTION and DROP TABLE, but absolutely nothing on how to delete a column from a Delta table. What am I missing here? Is there a standard way to drop a column from a Delta table?

asked Jan 31 '19 09:01 by samba



1 Answer

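Use the code below. It keeps the original table as a backup with an _OLD suffix, then rewrites the table without the column:

# Remove any backup left over from a previous run.
spark.sql("DROP TABLE IF EXISTS <DB Name>.<Table Name>_OLD")

# Keep the original table as a backup.
spark.sql("ALTER TABLE <DB Name>.<Table Name> RENAME TO <DB Name>.<Table Name>_OLD")

# Read from the backup, since the original name no longer exists after the rename.
df = spark.sql("SELECT * FROM <DB Name>.<Table Name>_OLD")

df1 = df.drop("<Column Name>")

# Write the result back under the original name, replacing the schema.
df1.write.format("delta").mode("overwrite").option("overwriteSchema", "true").saveAsTable("<DB Name>.<Table Name>")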
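
For repeated use, the same rename-and-rewrite steps can be wrapped in a small helper. This is only a sketch: the function name `drop_column_by_rewrite` and its parameters are hypothetical, and `spark` is assumed to be an existing SparkSession such as the one a Databricks notebook provides. It reads from the renamed backup, so the DataFrame never refers to a table name that has gone away.

```python
def drop_column_by_rewrite(spark, db, table, column):
    """Rebuild db.table without `column`, keeping the original as db.table_OLD.

    Hypothetical helper; `spark` is assumed to be an active SparkSession.
    Returns the name of the backup table.
    """
    full_name = f"{db}.{table}"
    backup_name = f"{db}.{table}_OLD"

    # Remove any backup left over from a previous run.
    spark.sql(f"DROP TABLE IF EXISTS {backup_name}")

    # Keep the original data under the backup name.
    spark.sql(f"ALTER TABLE {full_name} RENAME TO {backup_name}")

    # Read from the backup and drop the unwanted column.
    trimmed = spark.table(backup_name).drop(column)

    # Write the result back under the original name, replacing the schema.
    (
        trimmed.write
        .format("delta")
        .mode("overwrite")
        .option("overwriteSchema", "true")
        .saveAsTable(full_name)
    )
    return backup_name
```

With the table from the question, the call would be `drop_column_by_rewrite(spark, "main", "metrics_table", "metric_1")`. The backup table is left in place on purpose, so it has to be dropped manually once the result has been checked.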
answered Nov 16 '22 02:11 by Ardalan Shahgholi