data Tree a = Tree a [Tree a]
Note that we do not allow empty trees, and that a leaf is a tree with an empty list of subtrees.
treeFold :: (a -> [b] -> b) -> Tree a -> b
treeFold f (Tree x s) = f x (map (treeFold f) s)
Given the above information, I don't understand how the tree fold function returns a result by recursively applying the fold operation to the subtrees, then applying the function to the label at the root and the results returned from the subtrees.
I also do not get how the tree Fold function only takes one argument instead of 2, when it is passed as argument to the map function and it still compiles and runs properly.
For example, the tree size function below, counts the nodes of the tree.
treeSize :: Tree a -> Int
treeSize = treeFold (\x ys -> 1 + sum ys)
So running treeSize tree where tree = Tree 4 [Tree 1 [Tree 2 [], Tree 3 []]]
gives the size of the tree as 4.
In the tree size function above, the tree fold function is also passed one argument instead of two. Also, the x that is passed to the tree fold function is not getting used anywhere, so why do you need it there. Removing it causes the program to not compile and it gives the following error message.
Couldn't match type `a' with `[[Int] -> Int]'
`a' is a rigid type variable bound by
the type signature for treeSize :: Tree a -> Int
at treeFold.hs:15:1
In the first argument of `sum', namely `ys'
In the second argument of `(+)', namely `sum ys'
In the expression: 1 + sum ys
Any help would be greatly appreciated.
A fold takes a binary function, a starting value (I like to call it the accumulator) and a list to fold up. The binary function itself takes two parameters. The binary function is called with the accumulator and the first (or last) element and produces a new accumulator.
Haskell, in turn, has two fundamental functions representing reducing, or, as we call it, folding – foldl and foldr – that differ in the order of the folding. foldl reduces elements of a container from left to right (as reduce in other languages usually does), while foldr reduces from right to left.
In functional programming, fold (also termed reduce, accumulate, aggregate, compress, or inject) refers to a family of higher-order functions that analyze a recursive data structure and through use of a given combining operation, recombine the results of recursively processing its constituent parts, building up a ...
First, the way you explicitly mark a variable as being unused is to replace the variable with _
. So you really wanted:
treeFold (\_ ys -> 1 + sum ys)
You got a compiler error because you wrote:
treeFold (\ys -> 1 + sum ys)
... which is not the same thing.
Second, I'll hand evaluate treeSize
on an example tree so you can see that there is no magic going on:
treeSize (Tree 1 [Tree 2 [], Tree 3 []])
-- Inline definition of 'treeSize'
= treeFold (\_ ys -> 1 + sum ys) (Tree 1 [Tree 2 [], Tree 3 []])
-- Evaluate treeFold
= (\_ ys -> 1 + sum ys) 1 (map (treeFold (\_ ys -> 1 + sum ys)) [Tree 2 [], Tree 3 []])
-- Apply the anonymous function
= 1 + sum (map (treeFold (\_ ys -> 1 + sum ys)) [Tree 2 [], Tree 3 []])
-- Apply the 'map' function
= 1 + sum [ treeFold (\_ ys -> 1 + sum ys) (Tree 2 [])
, treeFold (\_ ys -> 1 + sum ys) (Tree 3 [])
]
-- Apply both 'treeFold' functions
= 1 + sum [ (\_ ys -> 1 + sum ys) 2 (map (treeFold (\_ ys -> 1 + sum ys)) [])
, (\_ ys -> 1 + sum ys) 3 (map (treeFold (\_ ys -> 1 + sum ys)) [])
]
-- Apply the anonymous functions
= 1 + sum [ 1 + sum (map (treeFold (\_ ys -> 1 + sum ys)) [])
, 1 + sum (map (treeFold (\_ ys -> 1 + sum ys)) [])
]
-- map f [] = []
= 1 + sum [ 1 + sum []
, 1 + sum []
]
-- sum [] = 0
= 1 + sum [1 + 0, 1 + 0]
= 1 + sum [1, 1]
-- Apply 'sum'
= 1 + 2
= 3
However, there's an easy way to remember how treeFold
works. All it does it replace each Tree
constructor with the function that you supply it with.
So if you have:
Tree 1 [Tree 2 [Tree 3 [], Tree 4[]], Tree 5 [Tree 6 [], Tree 7 []]]
... and you apply treeFold f
to that, it will evaluate to:
f 1 [f 2 [f 3 [], f 4 []], f 5 [f 6 [], f 7 []]]
treeSum
is just the special case where f = (\_ ys -> 1 + sum ys)
:
1 + sum [1 + sum [1 + sum [], 1 + sum []], 1 + sum [1 + sum [], 1 + sum []]]
= 7
The final point is how currying works in Haskell. When you define a function like:
foo x y = x + y
... the compiler actually desugars that to two nested functions:
foo = \x -> (\y -> x + y)
This is why you can partially apply functions to just one argument in Haskell. When you write foo 1
, it translates to:
foo 1
= (\x -> (\y -> x + y)) 1
= \y -> 1 + y
In other words, it returns a new function of one argument.
In Haskell, all functions take exactly one argument, and we simulate functions of multiple arguments by returning new functions. This technique is known as "currying".
If you prefer the multi-argument approach of more traditional mainstream languages, you can simulate it by having the function accept a tuple argument:
f (x, y) = x + y
However, that's not really idiomatic Haskell, and it won't give you any sort of performance improvement.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With