Implementing a Cake Pattern with implicit functionality

I want to implement a variant of the Cake Pattern in which the injected module adds implicit (extension-method) functionality to an existing class, in this case a Spark DataFrame.

Basically, I want to be able to run code like the following:

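// Imports assumed by the snippets below (Spark SQL).
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

trait Transformer {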
  this: ColumnAdder =>

  def transform(input: DataFrame): DataFrame = {
    input.addColumn("newCol")
  }
}

val input = sqlContext.range(0, 5)
val transformer = new Transformer with StringColumnAdder
val output = transformer.transform(input)
output.show

And get a result like the following:

+---+------+
| id|newCol|
+---+------+
|  0|newCol|
|  1|newCol|
|  2|newCol|
|  3|newCol|
|  4|newCol|
+---+------+

My first idea was to define the implicit classes only in the base traits:

trait ColumnAdder {
  protected def _addColumn(df: DataFrame, colName: String): DataFrame

  implicit class ColumnAdderRichDataFrame(df: DataFrame) {
    def addColumn(colName: String): DataFrame = _addColumn(df, colName)
  }
}

trait StringColumnAdder extends ColumnAdder {
  protected def _addColumn(df: DataFrame, colName: String): DataFrame = {
    df.withColumn(colName, lit(colName))
  }
}

This works, but I am not entirely happy with it because the method signature is duplicated: once in _addColumn and again in addColumn on the implicit class. So I thought of another approach, using the (deprecated?) implicit def strategy:

trait ColumnAdder {
  protected implicit def columnAdderImplicits(df: DataFrame): ColumnAdderDataFrame

  abstract class ColumnAdderDataFrame(df: DataFrame) {
    def addColumn(colName: String): DataFrame
  }
}

trait StringColumnAdder extends ColumnAdder {
  protected implicit def columnAdderImplicits(df: DataFrame): ColumnAdderDataFrame = new StringColumnAdderDataFrame(df)

  class StringColumnAdderDataFrame(df: DataFrame) extends ColumnAdderDataFrame(df) {
    def addColumn(colName: String): DataFrame = {
      df.withColumn(colName, lit(colName))
    }
  }
}

(The full reproducible code, including an extra trait module, can be found here.)

Which of these approaches is better, and is there another, cleaner way to achieve what I want?

Daniel de Paula asked Jan 11 '17 13:01

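1 Answer

Just two shortcuts, nothing really astonishing. First, the abstract class does not need a constructor parameter, and the concrete module can return an anonymous subclass instead of declaring a named one: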

trait ColumnAdder {
  protected implicit def columnAdderImplicits(df: DataFrame): ColumnAdderDataFrame
  abstract class ColumnAdderDataFrame {
    def addColumn(colName: String): DataFrame
  }
}

trait StringColumnAdder extends ColumnAdder {
  override def columnAdderImplicits(df: DataFrame) =
    new ColumnAdderDataFrame {
      def addColumn(colName: String): DataFrame =
        df.withColumn(colName, lit(colName))
    }
}
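
To try the shortcut above without a Spark installation, here is a minimal sketch for Scala 2. It rests on assumptions that are not in the original: Table is a hypothetical stand-in for DataFrame that only tracks column names, the CakeSketch wrapper object and its run() helper exist only to make the example self-contained, and the protected modifier is dropped for brevity.

```scala
import scala.language.implicitConversions

object CakeSketch {
  // Hypothetical stand-in for Spark's DataFrame: it only tracks column names.
  final case class Table(columns: Vector[String]) {
    def withColumn(name: String): Table = copy(columns = columns :+ name)
  }

  trait ColumnAdder {
    // Abstract implicit conversion; each module supplies the behaviour.
    implicit def columnAdderImplicits(t: Table): ColumnAdderTable

    abstract class ColumnAdderTable {
      def addColumn(colName: String): Table
    }
  }

  trait StringColumnAdder extends ColumnAdder {
    // Anonymous subclass: no named wrapper class, no constructor parameter.
    override implicit def columnAdderImplicits(t: Table): ColumnAdderTable =
      new ColumnAdderTable {
        def addColumn(colName: String): Table = t.withColumn(colName)
      }
  }

  trait Transformer { this: ColumnAdder =>
    // addColumn is resolved through the conversion of the mixed-in module.
    def transform(input: Table): Table = input.addColumn("newCol")
  }

  def run(): Table = {
    val transformer = new Transformer with StringColumnAdder
    transformer.transform(Table(Vector("id")))
  }
}
```

Calling CakeSketch.run() exercises the same wiring as the Transformer in the question: the self-type brings the implicit conversion into scope, and the mixed-in module decides what addColumn does.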

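If you are willing to enable -language:reflectiveCalls (be aware of the implications: calls on a structural type are dispatched through runtime reflection, which is slower than an ordinary method call), you can also write: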

trait ColumnAdder {
  protected implicit def columnAdderImplicits(df: DataFrame): {
    def addColumn(colName: String): DataFrame
  }
}

trait StringColumnAdder extends ColumnAdder {
  override def columnAdderImplicits(df: DataFrame) = new {
    def addColumn(colName: String): DataFrame =
      df.withColumn(colName, lit(colName))
  }
}
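
A self-contained sketch of the structural-type variant, assuming Scala 2 (structural calls work differently in Scala 3). As an illustration only, Table is a hypothetical stand-in for DataFrame, and the StructuralSketch object and run() helper are not part of the original. The feature is enabled here with an import rather than the compiler flag.

```scala
import scala.language.implicitConversions
import scala.language.reflectiveCalls

object StructuralSketch {
  // Hypothetical stand-in for Spark's DataFrame: it only tracks column names.
  final case class Table(columns: Vector[String]) {
    def withColumn(name: String): Table = copy(columns = columns :+ name)
  }

  trait ColumnAdder {
    // Structural result type: anything exposing addColumn(String): Table.
    implicit def columnAdderImplicits(t: Table): {
      def addColumn(colName: String): Table
    }
  }

  trait StringColumnAdder extends ColumnAdder {
    override implicit def columnAdderImplicits(t: Table): {
      def addColumn(colName: String): Table
    } = new {
      def addColumn(colName: String): Table = t.withColumn(colName)
    }
  }

  trait Transformer { this: ColumnAdder =>
    // This call goes through runtime reflection, not a direct method call.
    def transform(input: Table): Table = input.addColumn("newCol")
  }

  def run(): Table = {
    val transformer = new Transformer with StringColumnAdder
    transformer.transform(Table(Vector("id")))
  }
}
```

The trade-off is that no wrapper class has to be declared at all, at the price of a reflective lookup on every addColumn call.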

Federico Pellegatta answered Nov 10 '22 01:11