The reduceLeft function is applicable to both Scala's Mutable and Immutable collection data structures. The reduceLeft method takes an associative binary operator function as parameter and will use it to collapse elements from the collection.
foldLeft() method is a member of TraversableOnce trait, it is used to collapse elements of collections. It navigates elements from Left to Right order. It is primarily used in recursive functions and prevents stack overflow exceptions.
Fold and reduce The difference between the two functions is that fold() takes an initial value and uses it as the accumulated value on the first step, whereas the first step of reduce() uses the first and the second elements as operation arguments on the first step.
The Scala foldLeft method can be used to iterate over a data structure and perform multiple operations on a Spark DataFrame. foldLeft can be used to eliminate all whitespace in multiple columns or convert all the column names in a DataFrame to snake_case.
Few things to mention here, before giving the actual answer:
left
, it's rather about the difference between reducing and foldingBack to your question:
Here is the signature of foldLeft
(could also have been foldRight
for the point I'm going to make):
def foldLeft [B] (z: B)(f: (B, A) => B): B
And here is the signature of reduceLeft
(again the direction doesn't matter here)
def reduceLeft [B >: A] (f: (B, A) => B): B
These two look very similar and thus caused the confusion. reduceLeft
is a special case of foldLeft
(which by the way means that you sometimes can express the same thing by using either of them).
When you call reduceLeft
say on a List[Int]
it will literally reduce the whole list of integers into a single value, which is going to be of type Int
(or a supertype of Int
, hence [B >: A]
).
When you call foldLeft
say on a List[Int]
it will fold the whole list (imagine rolling a piece of paper) into a single value, but this value doesn't have to be even related to Int
(hence [B]
).
Here is an example:
def listWithSum(numbers: List[Int]) = numbers.foldLeft((List.empty[Int], 0)) {
(resultingTuple, currentInteger) =>
(currentInteger :: resultingTuple._1, currentInteger + resultingTuple._2)
}
This method takes a List[Int]
and returns a Tuple2[List[Int], Int]
or (List[Int], Int)
. It calculates the sum and returns a tuple with a list of integers and it's sum. By the way the list is returned backwards, because we used foldLeft
instead of foldRight
.
Watch One Fold to rule them all for a more in depth explanation.
reduceLeft
is just a convenience method. It is equivalent to
list.tail.foldLeft(list.head)(_)
foldLeft
is more generic, you can use it to produce something completely different than what you originally put in. Whereas reduceLeft
can only produce an end result of the same type or super type of the collection type. For example:
List(1,3,5).foldLeft(0) { _ + _ }
List(1,3,5).foldLeft(List[String]()) { (a, b) => b.toString :: a }
The foldLeft
will apply the closure with the last folded result (first time using initial value) and the next value.
reduceLeft
on the other hand will first combine two values from the list and apply those to the closure. Next it will combine the rest of the values with the cumulative result. See:
List(1,3,5).reduceLeft { (a, b) => println("a " + a + ", b " + b); a + b }
If the list is empty foldLeft
can present the initial value as a legal result. reduceLeft
on the other hand does not have a legal value if it can't find at least one value in the list.
For reference, reduceLeft
will error if applied to an empty container with the following error.
java.lang.UnsupportedOperationException: empty.reduceLeft
Reworking the code to use
myList foldLeft(List[String]()) {(a,b) => a+b}
is one potential option. Another is to use the reduceLeftOption
variant which returns an Option wrapped result.
myList reduceLeftOption {(a,b) => a+b} match {
case None => // handle no result as necessary
case Some(v) => println(v)
}
The basic reason they are both in Scala standard library is probably because they are both in Haskell standard library (called foldl
and foldl1
). If reduceLeft
wasn't, it would quite often be defined as a convenience method in different projects.
From Functional Programming Principles in Scala (Martin Odersky):
The function
reduceLeft
is defined in terms of a more general function,foldLeft
.
foldLeft
is likereduceLeft
but takes an accumulatorz
, as an additional parameter, which is returned whenfoldLeft
is called on an empty list:
(List (x1, ..., xn) foldLeft z)(op) = (...(z op x1) op ...) op x
[as opposed to reduceLeft
, which throws an exception when called on an empty list.]
The course (see lecture 5.5) provides abstract definitions of these functions, which illustrates their differences, although they are very similar in their use of pattern matching and recursion.
abstract class List[T] { ...
def reduceLeft(op: (T,T)=>T) : T = this match{
case Nil => throw new Error("Nil.reduceLeft")
case x :: xs => (xs foldLeft x)(op)
}
def foldLeft[U](z: U)(op: (U,T)=>U): U = this match{
case Nil => z
case x :: xs => (xs foldLeft op(z, x))(op)
}
}
Note that foldLeft
returns a value of type U
, which is not necessarily the same type as List[T]
, but reduceLeft returns a value of the same type as the list).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With