How do you create a DataFrame containing nulls from a sequence using .toDF?
This works:
val df = Seq((1,"a"),(2,"b")).toDF("number","letter")
but I'd like to do something along the lines of:
val df = Seq((1, NULL),(2,"b")).toDF("number","letter")
In addition to Ramesh's answer, it's worth noting that since toDF uses reflection to infer the schema, the provided sequence must have the correct element type. If Scala's type inference isn't enough, you need to specify the type explicitly.
For example, if you want the 2nd column to be a nullable integer, neither of the following works:
Seq((1, null))
has inferred type Seq[(Int, Null)]
Seq((1, null), (2, 2))
has inferred type Seq[(Int, Any)]
In this case you need to explicitly specify the type for the 2nd column. There are at least two ways to do it. You can explicitly specify the generic type for the sequence:
Seq[(Int, Integer)]((1, null)).toDF
or create a case class for the row:
case class MyRow(x: Int, y: Integer)
Seq(MyRow(1, null)).toDF
Note that I used Integer instead of Int, as the latter, being a primitive type, cannot accommodate nulls.
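To tie this together, here is a minimal sketch of both approaches as a standalone program. The object name NullableToDF, the local[*] session, and the sample values are assumptions for illustration; in spark-shell the session already exists as spark, so only the Seq lines are needed. It also includes the question's string case, which should work without any annotation because Null conforms to String, so the sequence is inferred as Seq[(Int, String)].

```scala
import org.apache.spark.sql.SparkSession

// Row type for the case-class approach. It is declared at the top level,
// outside the method that calls toDF, so Spark can derive an encoder for it.
case class MyRow(x: Int, y: Integer)

// Hypothetical driver object, named only for this example.
object NullableToDF {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session for experimenting.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("nullable-toDF")
      .getOrCreate()
    import spark.implicits._ // brings toDF into scope for Seq of tuples / case classes

    // The question's case: (Int, Null) and (Int, String) unify to (Int, String),
    // so no explicit type is needed for a nullable string column.
    val letters = Seq((1, null), (2, "b")).toDF("number", "letter")
    letters.show()

    // Way 1: give the sequence an explicit element type with a boxed Integer.
    val viaTypeParam =
      Seq[(Int, Integer)]((1, null), (2, Integer.valueOf(20))).toDF("x", "y")
    viaTypeParam.printSchema()

    // Way 2: let a case class carry the column names and types.
    val viaCaseClass = Seq(MyRow(1, null), MyRow(2, 20)).toDF()
    viaCaseClass.printSchema()

    spark.stop()
  }
}
```

In both printSchema calls the y column should be reported as a nullable integer, while x, being a primitive Int, should be reported as non-nullable.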