
How to create a Row from a given case class?

Imagine that you have the following case classes:

case class B(key: String, value: Int)
case class A(name: String, data: B)

Given an instance of A, how do I create a Spark Row? For example:

val a = A("a", B("b", 0))
val row = ???

NOTE: Given row, I need to be able to read the data with:

val name: String = row.getAs[String]("name")
val b: Row = row.getAs[Row]("data")
Marsellus Wallace asked Feb 12 '18 20:02


1 Answer

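The following gets you most of the way there: top-level fields can be read by name, but the nested case class is stored as a B instance rather than a Row, so the last lookup fails.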

scala> spark.version
res0: String = 2.3.0

scala> val a = A("a", B("b", 0))
a: A = A(a,B(b,0))

import org.apache.spark.sql.Encoders
val schema = Encoders.product[A].schema
scala> schema.printTreeString
root
 |-- name: string (nullable = true)
 |-- data: struct (nullable = true)
 |    |-- key: string (nullable = true)
 |    |-- value: integer (nullable = false)

val values = a.productIterator.toSeq.toArray

import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
val row: Row = new GenericRowWithSchema(values, schema)

scala> val name: String = row.getAs[String]("name")
name: String = a

// the following fails: the "data" field still holds a B instance, not a Row
scala> val b: Row = row.getAs[Row]("data")
java.lang.ClassCastException: B cannot be cast to org.apache.spark.sql.Row
  ... 55 elided
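
The ClassCastException above arises because productIterator returns the nested B as-is. One possible way around it, offered as a minimal sketch rather than a definitive implementation, is to walk the value and the schema together and wrap every nested case class in its own Row. The helper name toRow is hypothetical (it is not part of Spark), and the sketch assumes a spark-shell or script context where the case classes are defined at the top level. It only handles case classes nested directly inside other case classes; Option fields, collections and maps of case classes are not converted.

```scala
import org.apache.spark.sql.{Encoders, Row}
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.types.{StructField, StructType}

case class B(key: String, value: Int)
case class A(name: String, data: B)

// Hypothetical helper (not a Spark API): recursively converts a case class
// instance into a Row that carries its schema.
// Assumption: struct-typed fields always hold a plain case class instance.
def toRow(p: Product, schema: StructType): Row = {
  val values: Array[Any] = p.productIterator.zip(schema.fields.iterator).map {
    // A struct-typed field holding a Product is converted with its own sub-schema
    case (nested: Product, StructField(_, nestedSchema: StructType, _, _)) =>
      toRow(nested, nestedSchema)
    // Every other value (String, Int, ...) is kept unchanged
    case (value, _) => value
  }.toArray
  new GenericRowWithSchema(values, schema)
}

val a = A("a", B("b", 0))
val row: Row = toRow(a, Encoders.product[A].schema)

val name: String = row.getAs[String]("name")
val b: Row = row.getAs[Row]("data")   // now a Row, so the cast succeeds
val key: String = b.getAs[String]("key")
```

Note that GenericRowWithSchema lives in the catalyst package, which is internal to Spark and may change between versions. If a SparkSession is available, going through a DataFrame (for example Seq(a).toDF().head() after import spark.implicits._) should also produce a Row whose nested structs are Rows, at the cost of running a Spark job.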
Jacek Laskowski answered Oct 03 '22 06:10