Convert String expression to actual working instance expression

I am trying to convert a Scala expression that is stored in a database as a String back into working code.

I have tried the Scala reflection ToolBox, Groovy, etc., but I can't get the result I need.

Here's what I tried:


import scala.reflect.runtime.universe._
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox

val toolbox = currentMirror.mkToolBox()
val code1 = q"""StructType(StructField(id,IntegerType,true), StructField(name,StringType,true), StructField(tstamp,TimestampType,true), StructField(date,DateType,true))"""
val sType = toolbox.compile(code1)().asInstanceOf[StructType]

I need the sType instance so I can pass it as a custom schema when creating a DataFrame from a CSV file, but the code above fails.

Is there any way to convert the string expression of the StructType into an actual StructType instance? Any help would be appreciated.

Highdef asked Jan 27 '23 04:01


2 Answers

If StructType comes from Spark and you only want to convert a String to a StructType, you don't need reflection. You can try this:

import org.apache.spark.sql.catalyst.parser.LegacyTypeStringParser
import org.apache.spark.sql.types.{DataType, StructType}

import scala.util.Try

def fromString(raw: String): StructType =
  Try(DataType.fromJson(raw)).getOrElse(LegacyTypeStringParser.parse(raw)) match {
    case t: StructType => t
    case _             => throw new RuntimeException(s"Failed parsing: $raw")
  }

val code1 =
  """StructType(Array(StructField(id,IntegerType,true), StructField(name,StringType,true), StructField(tstamp,TimestampType,true), StructField(date,DateType,true)))"""
fromString(code1) // res0: org.apache.spark.sql.types.StructType

This code is taken from the org.apache.spark.sql.types.StructType companion object in Spark. You cannot call the original method directly because it is package-private. It also relies on LegacyTypeStringParser, so I'm not sure it is good enough for production code.
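Once a StructType has been recovered, it can be handed to the CSV reader, which is the use case the question describes. The following is only a minimal sketch: the object name CsvWithSchema, the helper readCsv, the two-column schema, and the temporary sample file are all assumptions made for illustration, and a local SparkSession is assumed to be available.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types._

object CsvWithSchema {
  // Hypothetical helper: read a CSV file with an explicit schema instead of schema inference.
  def readCsv(spark: SparkSession, path: String, schema: StructType): DataFrame =
    spark.read
      .option("header", "true")
      .schema(schema)
      .csv(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("csv-with-schema")
      .getOrCreate()

    // Stand-in for the StructType recovered from the stored string (assumed columns).
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = true),
      StructField("name", StringType, nullable = true)
    ))

    // Write a small temporary sample file so the sketch is self-contained.
    val file = Files.createTempFile("sample", ".csv")
    Files.write(file, "id,name\n1,alice\n2,bob\n".getBytes("UTF-8"))

    val df = readCsv(spark, file.toString, schema)
    df.printSchema()

    spark.stop()
  }
}
```

Passing the schema explicitly skips inference, so the column types follow the stored definition rather than whatever Spark would guess from the data.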

lukastymo answered Jan 31 '23 01:01


The code inside the quasiquotes needs to be valid Scala syntax, so the field names must be quoted as string literals. You also need to provide all the necessary imports. This works:

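import scala.reflect.runtime.universe._
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox
import org.apache.spark.sql.types.StructType

val toolbox = currentMirror.mkToolBox()
val code1 =
  q"""
    // the compiled snippet needs its own import of the Spark SQL types
    import org.apache.spark.sql.types._
    StructType(
      // StructType needs a list of fields
      List(
        // field names must be proper quoted strings
        StructField("id", IntegerType, true),
        StructField("name", StringType, true),
        StructField("tstamp", TimestampType, true),
        StructField("date", DateType, true)
      )
    )
  """
val sType = toolbox.compile(code1)().asInstanceOf[StructType]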

println(sType)

But instead of recompiling code at runtime, you might consider an alternative such as serializing the StructType in some other form (perhaps as JSON?).
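As a rough illustration of the JSON route, a schema can be written out with the json method on StructType and read back with DataType.fromJson. This is a sketch under assumptions: the object name SchemaJsonRoundTrip and the helpers toJson and fromJson are made up for the example, and it presumes you control how the schema is stored, so that JSON is saved instead of the printed StructType(...) form.

```scala
import org.apache.spark.sql.types._

object SchemaJsonRoundTrip {
  // Serialize a schema to the JSON representation that Spark itself produces.
  def toJson(schema: StructType): String = schema.json

  // Parse the JSON back into a schema and fail loudly if it is not a struct.
  def fromJson(json: String): StructType =
    DataType.fromJson(json) match {
      case s: StructType => s
      case other =>
        throw new IllegalArgumentException(s"Expected a struct schema, got: ${other.typeName}")
    }

  def main(args: Array[String]): Unit = {
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = true),
      StructField("name", StringType, nullable = true),
      StructField("tstamp", TimestampType, nullable = true),
      StructField("date", DateType, nullable = true)
    ))

    val stored = toJson(schema)     // the string you would save in the database
    val restored = fromJson(stored) // the schema rebuilt from that string

    assert(restored == schema)
    println(stored)
  }
}
```

No compiler or ToolBox is involved here, so the stored string is treated as data and not as executable code.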

Krzysztof Atłasik answered Jan 31 '23 01:01