 

Generate a Spark StructType / Schema from a case class

If I wanted to create a StructType (i.e. a DataFrame.schema) out of a case class, is there a way to do it without creating a DataFrame? I can easily do:

case class TestCase(id: Long)

// requires the SQLContext implicits (e.g. import sqlContext.implicits._) for toDF
val schema = Seq[TestCase]().toDF.schema

But it seems overkill to actually create a DataFrame when all I want is the schema.

(If you are curious, the reason behind the question is that I am defining a UserDefinedAggregateFunction, and to do so you override a couple of methods that return StructTypes and I use case classes.)
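For context, a UserDefinedAggregateFunction overrides two StructType-returning methods, which is where a case-class-derived schema would be plugged in. A minimal sketch (the LongSum class and the case classes are hypothetical, not from the question):

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
    import org.apache.spark.sql.types.{DataType, LongType, StructField, StructType}

    // Hypothetical case classes describing the input and buffer rows
    case class InputRow(value: Long)
    case class BufferRow(sum: Long)

    class LongSum extends UserDefinedAggregateFunction {
      // These two methods are the ones the question wants to derive
      // from the case classes instead of writing StructTypes by hand
      def inputSchema: StructType = StructType(StructField("value", LongType) :: Nil)
      def bufferSchema: StructType = StructType(StructField("sum", LongType) :: Nil)

      def dataType: DataType = LongType
      def deterministic: Boolean = true
      def initialize(buffer: MutableAggregationBuffer): Unit = buffer(0) = 0L
      def update(buffer: MutableAggregationBuffer, input: Row): Unit =
        buffer(0) = buffer.getLong(0) + input.getLong(0)
      def merge(b1: MutableAggregationBuffer, b2: Row): Unit =
        b1(0) = b1.getLong(0) + b2.getLong(0)
      def evaluate(buffer: Row): Any = buffer.getLong(0)
    }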

David Griffin asked Apr 20 '16 13:04

2 Answers

You can do it the same way SQLContext.createDataFrame does it:

import org.apache.spark.sql.catalyst.ScalaReflection
import org.apache.spark.sql.types.StructType

val schema = ScalaReflection.schemaFor[TestCase].dataType.asInstanceOf[StructType]
Tzach Zohar answered Oct 14 '22 21:10
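A quick sketch of what the reflected schema contains, assuming the TestCase class from the question (the field names and types mirror the case class constructor):

    import org.apache.spark.sql.catalyst.ScalaReflection
    import org.apache.spark.sql.types.{LongType, StructType}

    case class TestCase(id: Long)

    val schema = ScalaReflection.schemaFor[TestCase].dataType.asInstanceOf[StructType]

    // One field per case class parameter, with the corresponding Catalyst type
    assert(schema.fieldNames.sameElements(Array("id")))
    assert(schema.fields(0).dataType == LongType)

Note that ScalaReflection lives in the internal org.apache.spark.sql.catalyst package, so it is not part of Spark's public API and could change between versions.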


I know this question is almost a year old, but I came across it and thought others who do too might want to know that I have just learned this approach:

import org.apache.spark.sql.Encoders

val mySchema = Encoders.product[MyCaseClass].schema
Kurt answered Oct 14 '22 22:10
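A sketch comparing the two answers, assuming a hypothetical MyCaseClass. Encoders.product is part of Spark's public API (ScalaReflection is internal), so it is generally the safer choice, and for a simple product type both produce a matching schema:

    import org.apache.spark.sql.Encoders
    import org.apache.spark.sql.catalyst.ScalaReflection
    import org.apache.spark.sql.types.StructType

    // Hypothetical case class for illustration
    case class MyCaseClass(id: Long, name: String)

    val fromEncoder = Encoders.product[MyCaseClass].schema
    val fromReflection = ScalaReflection.schemaFor[MyCaseClass].dataType.asInstanceOf[StructType]

    // Same field names and types from either route
    assert(fromEncoder.fieldNames.sameElements(fromReflection.fieldNames))
    assert(fromEncoder.fields.map(_.dataType).sameElements(fromReflection.fields.map(_.dataType)))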