I'm new to both Scala and Spark. Could anyone explain what rdd.map(_.swap) means? Looking through the Scala and Spark API docs, I cannot find swap as a method on the RDD class.
swap is a method on Scala's Tuple2 (a pair), not on RDD, which is why you won't find it in the RDD API. It returns a new tuple with the first and second elements exchanged. For example:
scala> val pair = ("a","b")
pair: (String, String) = (a,b)
scala> val swapped = pair.swap
swapped: (String, String) = (b,a)
RDD's map function applies a given function to each element of the RDD. In this case, the function applied to each element is simply _.swap.
The underscore is Scala's placeholder syntax for anonymous functions: it stands for the function's parameter without giving it a name. So the snippet above can be rewritten as:
rdd.map{ pair => pair.swap }
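To make the equivalence concrete, here is a small sketch in plain Scala that writes the same function three ways. The object name SwapForms and the value names are made up for illustration. The explicit type annotations are needed only because, outside a map call, the compiler has nothing to infer the parameter type from.

```scala
object SwapForms {
  // Three equivalent ways to write "swap the elements of a pair".
  // All names here are illustrative, not part of any library.

  // Placeholder syntax: the underscore stands for the single parameter.
  val swapShort: ((String, Int)) => (Int, String) = _.swap

  // The same function with the parameter named explicitly.
  val swapNamed: ((String, Int)) => (Int, String) = pair => pair.swap

  // The same result written with a pattern match instead of Tuple2.swap.
  val swapPattern: ((String, Int)) => (Int, String) = { case (k, v) => (v, k) }

  def main(args: Array[String]): Unit = {
    val example = ("a", 1)
    println(swapShort(example))
    println(swapNamed(example))
    println(swapPattern(example))
  }
}
```

All three values behave identically, so any of them could be passed to map.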
So the snippet you posted swaps the first and second elements of the pair held in each element of the RDD, turning an RDD of (A, B) pairs into an RDD of (B, A) pairs.
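As a rough illustration that runs without a Spark installation, the sketch below uses an ordinary Seq as a stand-in for the RDD. That substitution is an assumption made for convenience: Seq.map and RDD.map both apply the given function to every element, but the Seq version runs locally. The object name SwapDemo and the sample data are invented for the example.

```scala
object SwapDemo {
  // A local Seq stands in for the RDD (an assumption for this sketch).
  // The sample data is made up.
  val wordCounts: Seq[(String, Int)] =
    Seq(("spark", 3), ("scala", 5), ("rdd", 1))

  // Each (word, count) pair becomes a (count, word) pair.
  val countWords: Seq[(Int, String)] = wordCounts.map(_.swap)

  def main(args: Array[String]): Unit =
    countWords.foreach(println)
}
```

On a real RDD the same call would give an RDD[(Int, String)]. One common reason to do this is that key-based operations such as sortByKey act on the first element of each pair, so swapping first lets you treat the former value as the key.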