Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

difference between foldLeft and reduceLeft in Scala

People also ask

What is reduceLeft in Scala?

The reduceLeft function is applicable to both Scala's Mutable and Immutable collection data structures. The reduceLeft method takes an associative binary operator function as parameter and will use it to collapse elements from the collection.

What is foldLeft in Scala?

foldLeft() method is a member of TraversableOnce trait, it is used to collapse elements of collections. It navigates elements from Left to Right order. It is primarily used in recursive functions and prevents stack overflow exceptions.

What is the difference between fold and reduce?

Fold and reduce The difference between the two functions is that fold() takes an initial value and uses it as the accumulated value on the first step, whereas the first step of reduce() uses the first and the second elements as operation arguments on the first step.

What is foldLeft in spark?

The Scala foldLeft method can be used to iterate over a data structure and perform multiple operations on a Spark DataFrame. foldLeft can be used to eliminate all whitespace in multiple columns or convert all the column names in a DataFrame to snake_case.


Few things to mention here, before giving the actual answer:

  • Your question doesn't have anything to do with left, it's rather about the difference between reducing and folding
  • The difference is not the implementation at all, just look at the signatures.
  • The question doesn't have anything to do with Scala in particular, it's rather about the two concepts of functional programming.

Back to your question:

Here is the signature of foldLeft (could also have been foldRight for the point I'm going to make):

def foldLeft [B] (z: B)(f: (B, A) => B): B

And here is the signature of reduceLeft (again the direction doesn't matter here)

def reduceLeft [B >: A] (f: (B, A) => B): B

These two look very similar and thus caused the confusion. reduceLeft is a special case of foldLeft (which by the way means that you sometimes can express the same thing by using either of them).

When you call reduceLeft say on a List[Int] it will literally reduce the whole list of integers into a single value, which is going to be of type Int (or a supertype of Int, hence [B >: A]).

When you call foldLeft say on a List[Int] it will fold the whole list (imagine rolling a piece of paper) into a single value, but this value doesn't have to be even related to Int (hence [B]).

Here is an example:

def listWithSum(numbers: List[Int]) = numbers.foldLeft((List.empty[Int], 0)) {
   (resultingTuple, currentInteger) =>
      (currentInteger :: resultingTuple._1, currentInteger + resultingTuple._2)
}

This method takes a List[Int] and returns a Tuple2[List[Int], Int] or (List[Int], Int). It calculates the sum and returns a tuple with a list of integers and it's sum. By the way the list is returned backwards, because we used foldLeft instead of foldRight.

Watch One Fold to rule them all for a more in depth explanation.


reduceLeft is just a convenience method. It is equivalent to

list.tail.foldLeft(list.head)(_)

foldLeft is more generic, you can use it to produce something completely different than what you originally put in. Whereas reduceLeft can only produce an end result of the same type or super type of the collection type. For example:

List(1,3,5).foldLeft(0) { _ + _ }
List(1,3,5).foldLeft(List[String]()) { (a, b) => b.toString :: a }

The foldLeft will apply the closure with the last folded result (first time using initial value) and the next value.

reduceLeft on the other hand will first combine two values from the list and apply those to the closure. Next it will combine the rest of the values with the cumulative result. See:

List(1,3,5).reduceLeft { (a, b) => println("a " + a + ", b " + b); a + b }

If the list is empty foldLeft can present the initial value as a legal result. reduceLeft on the other hand does not have a legal value if it can't find at least one value in the list.


For reference, reduceLeft will error if applied to an empty container with the following error.

java.lang.UnsupportedOperationException: empty.reduceLeft

Reworking the code to use

myList foldLeft(List[String]()) {(a,b) => a+b}

is one potential option. Another is to use the reduceLeftOption variant which returns an Option wrapped result.

myList reduceLeftOption {(a,b) => a+b} match {
  case None    => // handle no result as necessary
  case Some(v) => println(v)
}

The basic reason they are both in Scala standard library is probably because they are both in Haskell standard library (called foldl and foldl1). If reduceLeft wasn't, it would quite often be defined as a convenience method in different projects.


From Functional Programming Principles in Scala (Martin Odersky):

The function reduceLeft is defined in terms of a more general function, foldLeft.

foldLeft is like reduceLeft but takes an accumulator z, as an additional parameter, which is returned when foldLeft is called on an empty list:

(List (x1, ..., xn) foldLeft z)(op) = (...(z op x1) op ...) op x

[as opposed to reduceLeft, which throws an exception when called on an empty list.]

The course (see lecture 5.5) provides abstract definitions of these functions, which illustrates their differences, although they are very similar in their use of pattern matching and recursion.

abstract class List[T] { ...
  def reduceLeft(op: (T,T)=>T) : T = this match{
    case Nil     => throw new Error("Nil.reduceLeft")
    case x :: xs => (xs foldLeft x)(op)
  }
  def foldLeft[U](z: U)(op: (U,T)=>U): U = this match{
    case Nil     => z
    case x :: xs => (xs foldLeft op(z, x))(op)
  }
}

Note that foldLeft returns a value of type U, which is not necessarily the same type as List[T], but reduceLeft returns a value of the same type as the list).