I have an input dataframe (ip_df) whose data looks like this:

id col_value
1 10
2 11
3 12

The data type of both id and col_value is string.

I need to produce another dataframe (output_df) where id stays a string and col_value becomes decimal(15,4). There is no data transformation, just a data type conversion. Can I do this with PySpark? Any help will be appreciated.
Method 1: Using DataFrame.withColumn(). We will make use of the Column.cast(dataType) method, which casts a column to a different data type. Here, dataType is the type you want to convert the column to; you pass the result back to withColumn() under the same column name to replace it.
Note that this is different from changing the schema of a stored table: in a Delta table on Databricks you can't rename a column or change its datatype in place, only add new columns, reorder them, or add column comments. To change a stored table's schema you must rewrite the table using the overwriteSchema option. For a DataFrame in memory, a plain cast is all you need.
Try using the cast method, specifying the precision and scale from the question (15, 4):

from pyspark.sql.types import DecimalType

output_df = ip_df.withColumn("col_value", ip_df["col_value"].cast(DecimalType(15, 4)))