spark-shell: strange behavior with import

I am working in spark-shell (Spark 2.1.0, Scala 2.11.8, OpenJDK 64-Bit Server VM 1.7.0_151).

I import the Column class:

scala> import org.apache.spark.sql.Column
import org.apache.spark.sql.Column

I can define a Column object:

scala> val myCol: Column = col("blah")
myCol: org.apache.spark.sql.Column = blah

and use Column in a function definition:

scala> def myFunc(c: Column) = ()
myFunc: (c: org.apache.spark.sql.Column)Unit

So far so good. But when defining a case class, Column is not found:

scala> case class myClass(c: Column)
<console>:11: error: not found: type Column
       case class myClass(c: Column)

However, using the fully qualified name works:

scala> case class myClass(c: org.apache.spark.sql.Column)
defined class myClass

as does combining the import and the definition on a single line:

scala> import org.apache.spark.sql.Column; case class myClass(c: Column)
import org.apache.spark.sql.Column
defined class myClass
asked Jan 30 '18 by ivankeller
1 Answer

This is a known Spark issue. The code works in Spark 1.6, but the problem is still present in Spark 2.1.0 and later.

Root cause:

Classes defined in the shell are compiled as inner classes, and therefore cannot easily be instantiated by reflection: they need an additional reference to the outer object, which is non-trivial to obtain.
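To see why the outer reference matters, here is a minimal plain-Scala analogy (the names Outer/Inner/Demo are illustrative, not the REPL's actual generated wrappers): an inner case class carries a hidden constructor parameter pointing at its enclosing instance.

// Plain-Scala analogy for REPL-defined classes (hypothetical names).
class Outer {
  case class Inner(x: Int) // inner class: tied to a specific Outer instance
}

object Demo extends App {
  val outer = new Outer
  val inner = outer.Inner(1) // fine: an outer instance is in hand

  // Reflection exposes the hidden outer pointer: the constructor signature is
  // Outer$Inner(Outer, int); the extra Outer argument is the "additional
  // reference to the outer object" mentioned above.
  println(classOf[Outer#Inner].getConstructors.head)
}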

As a workaround, use :paste in spark-shell, so that the import and the class definition are compiled as a single unit.
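A sketch of such a session (prompts and messages as in the Scala 2.11 REPL; exact wording may vary):

scala> :paste
// Entering paste mode (ctrl-D to finish)

import org.apache.spark.sql.Column
case class myClass(c: Column)

// Exiting paste mode, now interpreting.

import org.apache.spark.sql.Column
defined class myClass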

answered Sep 28 '22 by Yehor Krivokon