I'm new to the Scala/Spark stack and I'm trying to test my basic skills by using Spark SQL to "map" RDDs to temp tables and vice versa.
I have two distinct .scala files with the same code: a plain object (with def main...) and an object extending App.
In the plain-object version I get a "No TypeTag available" error connected to my case class Log:
object counter {
  def main(args: Array[String]) {
    // ...
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD

    case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)

    val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
    log.registerTempTable("logs")

    val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
    logSessioni.foreach(println)
  }
}
The error at the line log.registerTempTable("logs") says "No TypeTag available for Log".
In the other file (the object extending App) everything works fine:
object counterApp extends App {
  // ...
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  import sqlContext.createSchemaRDD

  case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)

  val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
  log.registerTempTable("logs")

  val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
  logSessioni.foreach(println)
}
Since I've just started, there are two main points I don't get:
1) Why does the same code work fine in the second file (the object extending App), while in the first one (the plain object) I get the error?
2) (and most important) What should I do in my code (the plain-object file) to fix this error, so that I can work with the case class and TypeTag (which I barely know)?
Any answers or code examples will be much appreciated!
Thanks in advance
FF
TL;DR:
Just move your case class out of the method definition.
The problem is that your case class Log is defined inside the method in which it is used. So, simply move the case class definition outside of the method and it will work.
I would have to look at exactly how this compiles down, but my guess is that it is a chicken-and-egg problem: the TypeTag (used for reflection) cannot be implicitly materialized because the class is local to the method and has not been fully defined at that point. This also appears to be why the App version compiles: the body of an object extending App is a template body, so there Log becomes a member class of the object rather than a class local to a method.
Here are two SO questions with the same problem, which show that Spark would need to use a WeakTypeTag for this to work, and here is the JIRA ticket explaining it more officially.
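For concreteness, here is a minimal sketch of the restructured file. It keeps the Spark 1.x API and the case class exactly as in the question; the SparkContext setup and the triple RDD are hypothetical placeholders for the parts the question elides, and the only change that matters is where Log is defined.

// Log now lives at the top level, so the compiler can materialize a TypeTag for it.
case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)

object counter {
  def main(args: Array[String]) {
    // Placeholder setup: the question does not show how sc and triple are built.
    val conf = new org.apache.spark.SparkConf().setAppName("counter").setMaster("local[*]")
    val sc = new org.apache.spark.SparkContext(conf)
    val triple = sc.parallelize(Seq(("alice", new java.util.Date(), "/home", 3)))

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD // implicit RDD[Product] -> SchemaRDD conversion

    val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
    log.registerTempTable("logs") // compiles now: the TypeTag for Log is available

    val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
    logSessioni.foreach(println)

    sc.stop()
  }
}

Any location outside the method body works (the top level of the file, or as a member of another object); what breaks the implicit TypeTag lookup is defining the case class locally inside main.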