I have a scenario where I want to implement a variant of the Cake Pattern that adds implicit functionality to a class (a Spark DataFrame).
Basically, I want to be able to run code like the following:
trait Transformer {
  this: ColumnAdder =>
  def transform(input: DataFrame): DataFrame = {
    input.addColumn("newCol")
  }
}
val input = sqlContext.range(0, 5)
val transformer = new Transformer with StringColumnAdder
val output = transformer.transform(input)
output.show
And find a result like the following:
+---+------+
| id|newCol|
+---+------+
|  0|newCol|
|  1|newCol|
|  2|newCol|
|  3|newCol|
|  4|newCol|
+---+------+
My first idea was to define the implicit classes only in the base traits:
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit
trait ColumnAdder {
  // Implemented by concrete modules; the implicit class only delegates to it.
  protected def _addColumn(df: DataFrame, colName: String): DataFrame
  implicit class ColumnAdderRichDataFrame(df: DataFrame) {
    def addColumn(colName: String): DataFrame = _addColumn(df, colName)
  }
}
trait StringColumnAdder extends ColumnAdder {
  protected def _addColumn(df: DataFrame, colName: String): DataFrame = {
    df.withColumn(colName, lit(colName))
  }
}
It works, but I was not entirely happy with this approach because the function signatures are duplicated. So I thought of another approach, using the (deprecated?) implicit def strategy:
trait ColumnAdder {
  protected implicit def columnAdderImplicits(df: DataFrame): ColumnAdderDataFrame
  abstract class ColumnAdderDataFrame(df: DataFrame) {
    def addColumn(colName: String): DataFrame
  }
}
trait StringColumnAdder extends ColumnAdder {
  protected implicit def columnAdderImplicits(df: DataFrame): ColumnAdderDataFrame = new StringColumnAdderDataFrame(df)
  class StringColumnAdderDataFrame(df: DataFrame) extends ColumnAdderDataFrame(df) {
    def addColumn(colName: String): DataFrame = {
      df.withColumn(colName, lit(colName))
    }
  }
}
(The full reproducible code, including an extra trait-module, can be found here)
So, I wanted to ask which approach is best, and whether there is a better way to achieve what I want.
Just two shortcuts, but nothing really astonishing:
trait ColumnAdder {
  protected implicit def columnAdderImplicits(df: DataFrame): ColumnAdderDataFrame
  abstract class ColumnAdderDataFrame {
    def addColumn(colName: String): DataFrame
  }
}
trait StringColumnAdder extends ColumnAdder {
  override def columnAdderImplicits(df: DataFrame) =
    new ColumnAdderDataFrame {
      def addColumn(colName: String): DataFrame =
        df.withColumn(colName, lit(colName))
    }
}
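If you want to try this first shortcut without a Spark session, here is a minimal sketch. It assumes a hypothetical `Frame` case class standing in for `DataFrame` (it is not part of Spark), and the names `ColumnAdderFrame` and `CakeDemo` are made up for the illustration. It also imports `scala.language.implicitConversions`, which the Scala 2 compiler asks for whenever an `implicit def` conversion is defined; without it you get a feature warning, not a deprecation.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for Spark's DataFrame so the sketch runs without Spark:
// each row is a map from column name to value.
final case class Frame(rows: Vector[Map[String, Any]]) {
  def withColumn(name: String, value: Any): Frame =
    Frame(rows.map(_ + (name -> value)))
}

trait ColumnAdder {
  // Abstract implicit conversion; each concrete module decides how a column is added.
  protected implicit def columnAdderImplicits(df: Frame): ColumnAdderFrame

  abstract class ColumnAdderFrame {
    def addColumn(colName: String): Frame
  }
}

trait StringColumnAdder extends ColumnAdder {
  // Anonymous subclass instead of a named one; `df` is captured by the closure.
  override protected implicit def columnAdderImplicits(df: Frame): ColumnAdderFrame =
    new ColumnAdderFrame {
      def addColumn(colName: String): Frame = df.withColumn(colName, colName)
    }
}

trait Transformer { this: ColumnAdder =>
  // `addColumn` is resolved through the implicit conversion supplied by the self-type.
  def transform(input: Frame): Frame = input.addColumn("newCol")
}

object CakeDemo {
  val input: Frame = Frame(Vector.tabulate(5)(i => Map[String, Any]("id" -> i)))
  val transformer = new Transformer with StringColumnAdder
  val output: Frame = transformer.transform(input)

  def main(args: Array[String]): Unit = output.rows.foreach(println)
}
```

Each of the five resulting rows carries its `id` plus a `newCol` entry holding the string `"newCol"`, which mirrors the table in the question.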
If you are willing to enable -language:reflectiveCalls (please be aware of the implications), then you can also write:
trait ColumnAdder {
  protected implicit def columnAdderImplicits(df: DataFrame): {
    def addColumn(colName: String): DataFrame
  }
}
trait StringColumnAdder extends ColumnAdder {
  override def columnAdderImplicits(df: DataFrame) = new {
    def addColumn(colName: String): DataFrame =
      df.withColumn(colName, lit(colName))
  }
}
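As for the implications: in Scala 2, a call to a member of a structural type is dispatched through Java reflection at runtime, so it is slower than an ordinary method call, and that is why the compiler wants the feature enabled explicitly. Below is a small sketch, assuming Scala 2, that isolates this. The names `StructuralDemo` and `HasAddColumn` are hypothetical, and the method returns a `String` rather than a `DataFrame` so that it runs without Spark.

```scala
import scala.language.reflectiveCalls

object StructuralDemo {
  // A structural type: any object with a matching `addColumn` method conforms.
  type HasAddColumn = { def addColumn(colName: String): String }

  // Hypothetical stand-in implementation, created from an anonymous class.
  val adder: HasAddColumn = new {
    def addColumn(colName: String): String = s"added $colName"
  }

  def main(args: Array[String]): Unit = {
    // In Scala 2 this call goes through java.lang.reflect at runtime,
    // because `HasAddColumn` has no nominal class to dispatch on.
    println(adder.addColumn("newCol")) // prints "added newCol"
  }
}
```

If the reflective overhead matters (for example, when `addColumn` is called in a tight loop), the abstract-class version is probably the safer choice.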