I am working with Spark and Scala and saw the following in the online docs:
df.select($"name", $"age" + 1).show()
What does $"name" mean here?
This is not a Scala thing; plain Scala has no $ string prefix:
scala> val name = "something"
name: String = something
scala> println($"name")
<console>:12: error: value $ is not a member of StringContext
println($"name")
^
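Rather, it is a Spark thing: $"name" creates a Column (more precisely, a ColumnName) that refers to the column called name. The syntax becomes available once you import spark.implicits._ (spark-shell does this import for you).
See the org.apache.spark.sql.SQLImplicits code,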
implicit class StringToColumn(val sc: StringContext) {
def $(args: Any*): ColumnName = {
new ColumnName(sc.s(args: _*))
}
}
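To see why that implicit class makes $"..." compile, here is a minimal sketch that needs no Spark at all. Col and DollarDemo are hypothetical names made up for this illustration; only the StringContext mechanism is the same as in the Spark snippet above.

```scala
// Hypothetical stand-in for Spark's ColumnName, for illustration only.
final case class Col(name: String)

object DollarDemo {
  // Adds a `$` method to StringContext, which makes $"..." valid syntax
  // wherever this implicit class is in scope.
  implicit class StringToCol(val sc: StringContext) {
    def $(args: Any*): Col = Col(sc.s(args: _*))
  }

  def main(args: Array[String]): Unit = {
    // The compiler rewrites $"name" to StringContext("name").$()
    val c = $"name"
    println(c) // prints Col(name)
  }
}
```

Without the implicit class in scope, the same expression fails with the "value $ is not a member of StringContext" error shown earlier.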
You could simply do dataframe.select("columnname").show
or dataframe.select(col("columnname")).show
too, but a plain string can only name a column. You need a Column, such as $"columnname" (or col("columnname")),
to build an expression on the column value, like you are doing when incrementing age in your example.
For example, given a dataframe,
+----+-------+
| age|   name|
+----+-------+
|null|Michael|
|  30|   Andy|
|  19| Justin|
+----+-------+
scala> dataframe.select($"name".as('myname)).show()
+-------+
| myname|
+-------+
|Michael|
|   Andy|
| Justin|
+-------+
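With a plain string, "age"+1 is ordinary string concatenation, so Spark looks for a column named age1:
scala> dataframe.select("age"+1).show()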
org.apache.spark.sql.AnalysisException: cannot resolve '`age1`' given input columns: [age, name];;
'Project ['age1]
Another use of $
is filtering on column values,
dataframe.filter($"age" > 28).show()
So, basically, $"" turns a column name into a value of type Column in Spark.
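Putting the pieces together, a self-contained sketch might look like the following. It assumes Spark SQL is on the classpath and runs in local mode; the object name, the app name, and the way the sample dataframe is built are my own choices for illustration, not taken from the docs.

```scala
import org.apache.spark.sql.SparkSession

object DollarColumnExample {
  def main(args: Array[String]): Unit = {
    // Local session for the demo; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dollar-demo")
      .getOrCreate()

    // Brings $"..." (and toDF) into scope; spark-shell does this for you.
    import spark.implicits._

    // Same data as the sample dataframe above; None becomes null.
    val dataframe = Seq[(Option[Int], String)](
      (None, "Michael"),
      (Some(30), "Andy"),
      (Some(19), "Justin")
    ).toDF("age", "name")

    // $"age" is a Column, so + and > build column expressions.
    dataframe.select($"name", $"age" + 1).show()
    dataframe.filter($"age" > 28).show()

    spark.stop()
  }
}
```

In an application (as opposed to spark-shell) the import spark.implicits._ line is what is most often forgotten, and leaving it out produces the same "value $ is not a member of StringContext" error.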
Scala itself uses $ and ${} inside an interpolated string to splice in variables and expressions (aka string interpolation),
scala> val printMe = "prayagupd"
printMe: String = prayagupd
scala> println(s"value = $printMe")
value = prayagupd
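The ${} form is for arbitrary expressions rather than bare variable names. A small sketch, reusing the printMe value from the session above (the object name is just for illustration):

```scala
object InterpolationDemo {
  def main(args: Array[String]): Unit = {
    val printMe = "prayagupd"
    // $name splices a variable, ${...} splices any expression.
    println(s"value = $printMe, length = ${printMe.length}")
    // prints: value = prayagupd, length = 9
  }
}
```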