Imagine that you have the following case classes:
case class B(key: String, value: Int)
case class A(name: String, data: B)
Given an instance of A
, how do I create a Spark Row
? e.g.
val a = A("a", B("b", 0))
val row = ???
NOTE: Given row
I need to be able to get data with:
val name: String = row.getAs[String]("name")
val b: Row = row.getAs[Row]("data")
The following seems to match what you're looking for.
scala> spark.version
res0: String = 2.3.0
scala> val a = A("a", B("b", 0))
a: A = A(a,B(b,0))
import org.apache.spark.sql.Encoders
val schema = Encoders.product[A].schema
scala> schema.printTreeString
root
|-- name: string (nullable = true)
|-- data: struct (nullable = true)
| |-- key: string (nullable = true)
| |-- value: integer (nullable = false)
val values = a.productIterator.toSeq.toArray
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
val row: Row = new GenericRowWithSchema(values, schema)
scala> val name: String = row.getAs[String]("name")
name: String = a
// the following won't work since B =!= Row
scala> val b: Row = row.getAs[Row]("data")
java.lang.ClassCastException: B cannot be cast to org.apache.spark.sql.Row
... 55 elided
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With