 

How to create schema (StructType) with one or more StructTypes?

I am trying to create a StructType inside another StructType, but it only lets me add a StructField. I can't find any method that adds a StructType to it.

How do I create a StructType schema for the string representation below?

struct<abc:struct<name:string>,pqr:struct<address:string>>
Rahul Sharma asked Oct 04 '17 15:10



2 Answers

Spark SQL has a little-known feature for defining a schema with the so-called Schema DSL (i.e. without many round brackets and the like).

import org.apache.spark.sql.types._
// $"name".string relies on spark.implicits._, which spark-shell imports automatically
val name = new StructType().add($"name".string)
scala> println(name.simpleString)
struct<name:string>

val address = new StructType().add($"address".string)
scala> println(address.simpleString)
struct<address:string>

val schema = new StructType().add("abc", name).add("pqr", address)
scala> println(schema.simpleString)
struct<abc:struct<name:string>,pqr:struct<address:string>>

scala> schema.simpleString == "struct<abc:struct<name:string>,pqr:struct<address:string>>"
res4: Boolean = true

scala> schema.printTreeString
root
 |-- abc: struct (nullable = true)
 |    |-- name: string (nullable = true)
 |-- pqr: struct (nullable = true)
 |    |-- address: string (nullable = true)
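Outside spark-shell the `$"..."` syntax is only available after importing the session's implicits. As a minimal sketch that avoids the DSL, the same schema can be built with `add(name, dataType)`, relying on the fact that a StructType is itself a DataType and can therefore be the type of a field:

```scala
import org.apache.spark.sql.types._

// Inner structs; fields added this way default to nullable = true.
val name = new StructType().add("name", StringType)
val address = new StructType().add("address", StringType)

// Pass each inner StructType as the data type of an outer field.
val schema = new StructType().add("abc", name).add("pqr", address)

println(schema.simpleString)
```

No SparkSession is needed here, since the schema classes are plain Scala objects.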
Jacek Laskowski answered Sep 27 '22 21:09



A StructField is a combination of a name and a type, and a StructType is itself a valid type, so you would do:

import org.apache.spark.sql.types._
StructType(Seq(StructField("structName", StructType(Seq(StructField("name", StringType), StructField("address", StringType))))))
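Applied to the exact schema from the question, the same constructor style might look like the sketch below. The field names `abc`, `pqr`, `name` and `address` come from the question; nullability is left at the StructField default of `true`.

```scala
import org.apache.spark.sql.types._

// Each outer StructField takes a nested StructType as its data type.
val schema = StructType(Seq(
  StructField("abc", StructType(Seq(StructField("name", StringType)))),
  StructField("pqr", StructType(Seq(StructField("address", StringType))))
))

// Compare against the string representation given in the question.
println(schema.simpleString)
```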
Assaf Mendelson answered Sep 27 '22 22:09
