Scala/Spark App with "No TypeTag available" Error in "def main" style App

I'm new to the Scala/Spark stack and I'm trying to test my basic skills by using Spark SQL to "map" RDDs to temp tables and vice versa.

I have 2 distinct .scala files with the same code: a simple object (with def main...) and an object extending App.

In the plain object version I get a "No TypeTag available" error related to my case class Log:

object counter {
  def main(args: Array[String]) {
.
.
.
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD
    case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)
    val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
    log.registerTempTable("logs") // error: No TypeTag available for Log
    val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
    logSessioni.foreach(println)
  }
}

The error, reported at the line log.registerTempTable("logs"), says "No TypeTag available for Log".

In the other file (object extends App) all works fine:

object counterApp extends App {
.
.
.
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  import sqlContext.createSchemaRDD
  case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)
  val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
  log.registerTempTable("logs")
  val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
  logSessioni.foreach(println)
}

Since I've just started, there are two main points I don't understand:

1) Why does the same code work fine in the second file (object extends App), while in the first one (plain object) I get the error?

2) (and most important) What should I change in the plain object file to fix this error, so that the case class and the TypeTag (which I barely know) work together?

Any answers or code examples would be much appreciated!

Thanks in advance

FF

Fabio Fantoni asked Mar 19 '15 11:03


1 Answer

TL;DR

Just move your case class out of the method definition

The problem is that your case class Log is defined inside the method in which it is used. Simply move the case class definition outside of the method and it will work. I would have to look at how this compiles down, but my guess is that this is more of a chicken-and-egg problem: the TypeTag (used for reflection) cannot be implicitly materialized because the class has not been fully defined at that point. Two other SO questions show the same problem and suggest that Spark would need to use a WeakTypeTag, and there is a JIRA issue that explains this more officially.
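To make the fix concrete, here is a minimal sketch that reproduces the situation without Spark. It assumes only the Scala 2 standard reflection library; `TypeTagDemo` and `describe` are hypothetical names invented for this illustration, with `describe` standing in for any API (such as Spark's implicit conversion to a SchemaRDD) that asks the compiler for an implicit TypeTag.

```scala
import scala.reflect.runtime.universe._

// Defined at the top level, outside any method, so the compiler
// can materialize a TypeTag for it.
case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)

object TypeTagDemo {
  // Hypothetical helper (not part of Spark): like Spark's implicit
  // conversion, it requires an implicit TypeTag for a Product type.
  def describe[A <: Product : TypeTag]: String =
    typeOf[A].typeSymbol.name.toString

  def main(args: Array[String]): Unit = {
    // A case class declared here, inside the method, is a local class:
    //
    //   case class Local(x: Int)
    //   describe[Local]   // does not compile: No TypeTag available for Local
    //
    // The top-level Log has a TypeTag, so this call compiles.
    println(describe[Log])
  }
}
```

The same reasoning would explain why the `extends App` version compiles: statements in the body of an object that extends App belong to the object itself, so a case class declared there is a member of the object and not local to a method.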

Justin Pihony answered Sep 20 '22 08:09