How do you rename a column in Databricks?
The following does not work:
ALTER TABLE mySchema.myTable change COLUMN old_name new_name int
It returns the error:
ALTER TABLE CHANGE COLUMN is not supported for changing column 'old_name' with type 'IntegerType (nullable = true)' to 'new_name' with type 'IntegerType (nullable = true)';
If it makes a difference, this table is using Delta Lake, and it is NOT partitioned or z-ordered by this "old_name" column.
Spark DataFrames have a withColumnRenamed() function for changing a column name. This is the most straightforward approach. The function takes two parameters: the first is the existing column name and the second is the new column name. It returns a new DataFrame (Dataset[Row]) with the column renamed.
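As a rough illustration of withColumnRenamed(), here is a minimal PySpark sketch. The helper name rename_column is hypothetical, and the commented usage assumes a Databricks notebook where `spark` is already defined. Note that this renames the column only in the returned DataFrame; the stored table stays as it was until the result is written back.

```python
def rename_column(df, old_name, new_name):
    """Return a new DataFrame with `old_name` renamed to `new_name`.

    `df` is expected to be a pyspark.sql.DataFrame. DataFrames are immutable,
    so the input is left untouched and a renamed copy is returned.
    """
    if old_name not in df.columns:
        # withColumnRenamed does nothing when the column is missing,
        # so fail loudly instead of returning an unchanged DataFrame.
        raise ValueError(f"column {old_name!r} not found in {df.columns}")
    return df.withColumnRenamed(old_name, new_name)


# Example usage (assumes an existing SparkSession named `spark`):
# df = spark.read.table("mySchema.myTable")
# renamed = rename_column(df, "old_name", "new_name")
# renamed.printSchema()
```

The explicit check is a design choice for this sketch, not something Spark requires; without it, a typo in the old column name would go unnoticed.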
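Renaming a column using the ALTER keyword. The statement starts with ALTER TABLE and the table name, and its second line is: RENAME COLUMN OldColumnName TO NewColumnName; For example, to rename the column "SID" to "StudentsID", the second line becomes RENAME COLUMN SID TO StudentsID;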
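For a Delta table, the RENAME COLUMN form is only accepted when column mapping is enabled on the table. The sketch below assumes a Databricks runtime recent enough to support column mapping, which may not match the environment in the question. The helper name rename_column_statements is hypothetical; it only builds the SQL strings, which can then be passed to spark.sql.

```python
def rename_column_statements(table_name, old_name, new_name):
    """Build the SQL statements to rename a Delta table column in place.

    Names are interpolated as-is, so pass only trusted identifiers.
    """
    # Assumption: the runtime supports Delta column mapping.
    # This upgrades the table protocol, which cannot be undone and may
    # stop older readers and writers from accessing the table.
    enable_mapping = (
        f"ALTER TABLE {table_name} SET TBLPROPERTIES ("
        "'delta.minReaderVersion' = '2', "
        "'delta.minWriterVersion' = '5', "
        "'delta.columnMapping.mode' = 'name')"
    )
    rename = f"ALTER TABLE {table_name} RENAME COLUMN {old_name} TO {new_name}"
    return [enable_mapping, rename]


# Example usage (assumes an existing SparkSession named `spark`):
# for stmt in rename_column_statements("mySchema.myTable", "old_name", "new_name"):
#     spark.sql(stmt)
```

If column mapping is not available on your runtime, the statement fails, and rewriting the table is the fallback.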
RENAME TO to_view_name: Renames the existing view within the schema. to_view_name specifies the new name of the view. If to_view_name already exists, a TableAlreadyExistsException is thrown. If to_view_name is qualified, it must match the schema name of view_name.
You can't rename a column or change its data type in Databricks; you can only add new columns, reorder them, or add column comments. To rename a column or change its type, you must rewrite the table using the overwriteSchema option.
Take this example from the documentation:
spark.read.table(...)
  .withColumnRenamed("date", "date_created")
  .write
  .format("delta")
  .mode("overwrite")
  .option("overwriteSchema", "true")
  .saveAsTable(...)