Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Confused by "Init/Base" in foldr/foldl (Racket)

I am close to understanding foldr and foldl but not quite there yet.

I understand that foldr is basically the stack implementation of performing some function on a list from "right to left".

So for foldr:

(define (fn-for-lon lon)
    (cond [(empty? lon) 0]
          [else
             (+ (first lon)
                (fn-for-lon (rest lon))]))

Is basically the equivalent of:

(foldr + 0 lon)

And I understand that foldl is basically the tail recursive, accumulator version that goes from "left to right".

So for foldl:

(define (fn-for-lon lon0)
    (local [(define (fn-for-lon lon acc)
                (cond [(empty? lon) acc]
                      [else
                          (fn-for-lon (rest lon) (+ acc (first lon)))]
   (fn-for-lon lon0)))

Is basically the equivalent of:

(foldl + 0 lon)

But what happens once you introduce two variables? I've tried to read on the topic but I just can't grasp it because no one talks about what is happening behind the scenes.

I'm really confused to whether or not the second argument is "base" or "init" or if it just depends on whether or not one or two variables are taken in the function. In the foldl example I gave it would seem to be the init acc value (which I suppose ends up being the base case), but in the foldr it would be the base case. Is it just because I only use an operator for the proc?

(foldr (lambda (a b) (cons (add1 a) b)) empty (list 1 2 3 4))

Something like the above I really just lose all understanding. I understand what to do, but not really what is happening in the background. This causes me to lose my track in more complex problems.

Is this the exact same thing and b is just the empty? Does b temporarily take on the (rest lon)?

(define (fn-for-lon lon)
  (cond [(empty? lon) empty]
        [else
         (cons (add1 (first lon))
               (fn-for-lon (rest lon)))]))
like image 240
Adam Thompson Avatar asked Mar 16 '23 00:03

Adam Thompson


2 Answers

Let's take a look at the actual foldl from Racket:

 (define foldl
    (case-lambda
      [(f init l)
       (check-fold 'foldl f init l null)
       (let loop ([init init] [l l])
         (if (null? l) init (loop (f (car l) init) (cdr l))))]
      [(f init l . ls)
       (check-fold 'foldl f init l ls)
       (let loop ([init init] [ls (cons l ls)])
         (if (pair? (car ls)) ; `check-fold' ensures all lists have equal length
             (loop (apply f (mapadd car ls init)) (map cdr ls))
             init))]))

We see two cases: the first case (f init l) is there for efficiency. If only one list l is used, then we get a fast version of foldl.

The second case (f init l . ls) is the one you are after. Before we examine it, we need to look at the helper function mapadd first.

A call (mapadd f l last) will apply f to all elements of a list l and cons the result onto `last.

Example:

> (mapadd sqr '(1 2 3 4) 42)
'(1 4 9 16 42)

The definition of mapadd:

  (define (mapadd f l last)
    (let loop ([l l])
      (if (null? l)
        (list last)
        (cons (f (car l)) (loop (cdr l))))))

Back to the (f init l . ls) case of foldl.

Removing the error check it boils down to

       (let loop ([init init] 
                  [ls (cons l ls)])
         (if (pair? (car ls))
             (loop (apply f (mapadd car ls init)) 
                   (map cdr ls))
             init))]))

The initial value init is bound to a variable (also called) init which is used to accumulate the result. The variable ls is when the loop starts bound to a list of all lists foldl is called with. Note that these lists all have the same length. The loop continues until all the lists in ls are empty. The test (pair? (car ls)) checks whether the first list is empty, but remember that the lists have the same length.

Now init is replaced with (apply f (mapadd car ls init)). This call first takes the first element of each list and conses on onto the current value of init. Then f is applied.

Consider this example: (foldl + 0 '(1 2) '(10 11)) which evaluates to 24. Here

 f    = +
 init = 0
 ls   = ((1 2) (10 11))

And

 > (mapadd car '((1 2) (10 11)) 0)
 '(1 10 0)

so

 > (apply + (mapadd car '((1 2) (10 11)) 0))
 11

In the next round we will see

 f    = +
 init = 11
 ls   = ((2) (11))

And (apply + (mapadd car ls init) will evaluate to 24.

An alternative way to explain the example (foldl + 0 '(1 2) '(10 11)).

(define init 0)
(for ([x (in-list '( 1  2))]              ; x and y loop in parallel
      [y (in-list '(10 11))])
  (set! init (apply + (list x y init))))  ; accumulate result in init
init

The complication in the implementation of foldl is that one doesn't know how many lists are used.

UPDATE

When foldl is used in practise it is best to think of foldl in terms of its specification and not of its implementation.

Here is how foldl is specified when called with two lists.

The call:

  (foldl f x0
     (cons a0 (cons a1 (cons a2 '())))
     (cons b0 (cons b1 (cons b2 '()))))

evaluates to the same as

(f a2 b2
   (f a1 b1
     (f a0 b0 
        x0)))

does.

As a check we can try it out:

>   (foldl (λ args (cons 'f args)) 'x0
       (cons 'a0 (cons 'a1 (cons 'a2 '())))
       (cons 'b0 (cons 'b1 (cons 'b2 '()))))
'(f a2 b2 (f a1 b1 (f a0 b0 x0)))

Note that (λ args (cons 'f args)) is the function that just prepends a symbol f to its list of arguments.

like image 143
soegaard Avatar answered Mar 20 '23 01:03

soegaard


The second argument to foldl and foldr is always init. Your provided function must always take 2 arguments if you're passing in a single list, and the second argument to that function is the accumulated value (init initially, then the return value of the previous call to your function).

(In your earlier examples of using +, you can think of it as being the same as (lambda (a b) (+ a b)). In this case, a is the list element, and b is the accumulated value.)

When you're calling foldl or foldr with N lists, then your provided function must take N+1 arguments; the first N arguments correspond to the next element from each list, and the last argument is the accumulated value (init initially, then the return value of the previous call to your function).


Would it help your understanding if I provided (my own) implementations of foldl and foldr? Here they are:

(define foldl
  (case-lambda
    ;; one list
    [(func init lst)
     (let loop ([result init]
                [rest lst])
       (if (null? rest)
           result
           (loop (func (car rest) result) (cdr rest))))]

    ;; multiple lists
    [(func init list1 . list2+)
     (let loop ([result init]
                [rests (cons list1 list2+)])
       (if (ormap null? rests)
           result
           (loop (apply func (append (map car rests) (list result)))
                 (map cdr rests))))]))

(define foldr
  (case-lambda
    ;; one list
    [(func init lst)
     (let recur ([rest lst])
       (if (null? rest)
           init
           (func (car rest) (recur (cdr rest)))))]

    ;; multiple lists
    [(func init list1 . list2+)
     (let recur ([rests (cons list1 list2+)])
       (if (ormap null? rests)
           init
           (apply func (append (map car rests)
                               (list (recur (map cdr rests)))))))]))
like image 27
Chris Jester-Young Avatar answered Mar 20 '23 03:03

Chris Jester-Young