I have a DataFrame with the following schema:
|-- data: struct (nullable = true)
|    |-- asin: string (nullable = true)
|    |-- customerId: long (nullable = true)
|    |-- eventTime: long (nullable = true)
|    |-- marketplaceId: long (nullable = true)
|    |-- rating: long (nullable = true)
|    |-- region: string (nullable = true)
|    |-- type: string (nullable = true)
|-- uploadedDate: long (nullable = true)
I want to flatten the struct so that all of its fields, such as asin, customerId, and eventTime, become top-level columns in the DataFrame. I tried the explode function, but it works on arrays, not on struct types. Is it possible to convert the above DataFrame to the one below?
|-- asin: string (nullable = true)
|-- customerId: long (nullable = true)
|-- eventTime: long (nullable = true)
|-- marketplaceId: long (nullable = true)
|-- rating: long (nullable = true)
|-- region: string (nullable = true)
|-- type: string (nullable = true)
|-- uploadedDate: long (nullable = true)
It's quite simple:
val newDF = df.select("uploadedDate", "data.*");
This selects uploadedDate and then all subfields of the data struct: "data.*" expands the struct into one top-level column per field.
Example:
scala> case class A(a: Int, b: Double)
scala> val df = Seq((A(1, 1.0), "1"), (A(2, 2.0), "2")).toDF("data", "uploadedDate")
scala> val newDF = df.select("uploadedDate", "data.*")
scala> newDF.show()
+------------+---+---+
|uploadedDate|  a|  b|
+------------+---+---+
|           1|  1|1.0|
|           2|  2|2.0|
+------------+---+---+
scala> newDF.printSchema()
root
|-- uploadedDate: string (nullable = true)
|-- a: integer (nullable = true)
|-- b: double (nullable = true)
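Applied to the schema from the question, the same idea can be sketched as below. This is only an illustration under assumptions: the object name FlattenStruct, the helpers sample and flatten, and the sample row are all made up here, and the nested data column is built with the struct function just to get a DataFrame of the right shape. Listing "data.*" before "uploadedDate" puts uploadedDate last, matching the column order of the target schema.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct}

object FlattenStruct {
  // Hypothetical sample data shaped like the schema in the question:
  // a struct column "data" plus a top-level "uploadedDate".
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("B000TEST", 1L, 100L, 1L, 5L, "NA", "review", 200L))
      .toDF("asin", "customerId", "eventTime", "marketplaceId",
            "rating", "region", "type", "uploadedDate")
      .select(
        struct("asin", "customerId", "eventTime", "marketplaceId",
               "rating", "region", "type").as("data"),
        col("uploadedDate"))
  }

  // "data.*" expands every field of the struct into its own column;
  // selecting it first keeps uploadedDate as the last column.
  def flatten(df: DataFrame): DataFrame =
    df.select("data.*", "uploadedDate")

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption for running the sketch on one machine.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-struct")
      .getOrCreate()

    val flat = flatten(sample(spark))
    flat.printSchema()
    flat.show()

    spark.stop()
  }
}
```

In spark-shell, where a session already exists as spark, the one line that matters is df.select("data.*", "uploadedDate").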