This submission to Programming Praxis gives an O(n) function that "undoes" a preorder traversal of a binary search tree, converting a list back into a tree. Supplying the missing data declaration:
data Tree a = Leaf | Branch {value::a, left::Tree a, right:: Tree a} deriving (Eq, Show) fromPreOrder :: Ord a => [a] -> Tree a fromPreOrder [] = Leaf fromPreOrder (a:as) = Branch a l (fromPreOrder bs) where (l,bs) = lessThan a as lessThan n [] = (Leaf,[]) lessThan n all@(a:as) | a >= n = (Leaf,all) | otherwise = (Branch a l r,cs) where (l,bs) = lessThan a as (r,cs) = lessThan n bs
It's obvious that one constructor is added to the tree in each recursive step, which is key to its efficiency.
The only "problem" is that the list is threaded through the computation manually, which is not a terribly Haskellian way to do it and makes it a little harder to see that it is actually consumed element by element in a single-threaded manner.
I attempted to correct this using a state monad (prettified on Codepad):
import Control.Monad.State data Tree a = Leaf | Branch {root::a, left::Tree a, right::Tree a} deriving (Eq,Show) peek = State peek' where peek' [] = (Nothing,[]) peek' a@(x:_) = (Just x,a) pop = State pop' where pop' [] = error "Tried to read past the end of the list" pop' (_:xs) = ((),xs) prebuild'::Ord a => State [a] (Tree a) prebuild' = do next <- peek case next of Nothing -> return Leaf Just x -> do pop leftpart <- lessThan x rightpart <- prebuild' return (Branch x leftpart rightpart) lessThan n = do next <- peek case next of Nothing -> return Leaf Just x -> do if x < n then do pop leftpart <- lessThan x rightpart <- lessThan n return (Branch x leftpart rightpart) else return Leaf prebuild::Ord a => [a] -> Tree a prebuild = evalState prebuild'
Unfortunately, this just looks obscenely messy, and doesn't seem any easier to reason about.
One thought I haven't been able to get anywhere with yet (in part because I don't have a deep enough understanding of the underlying concepts, quite likely): could I use a left fold over the list that builds a continuation that ultimately produces the tree? Would that be possible? Also, would it be anything short of insane?
Another thought was to write this as a tree unfold, but I don't think it's possible to do that efficiently; the list will end up being traversed too many times and the program will be O(n^2).
Taking things from another direction, I have the nagging suspicion that it might be possible to come up with an algorithm that starts by splitting up the list into increasing segments and decreasing segments, but I haven't yet found something concrete to do with that idea.
I think the problem you're having with State
is that your primitives (push
, pop
, peek
) are not the right ones. I think a better one would be something like available_
, which checks if the front of the stack matches a particular condition, and executes something different in each case:
available_ p f m = do
s <- get
case s of
x:xs | p x -> put xs >> f x
_ -> m
Actually, in our use case, we can specialize a bit: we will always want to return a Leaf
when the head of our stack doesn't satisfy the condition, and we'll always want to recurse when it does.
available p m = available_ p
(\x -> liftM2 (Branch x) (lessThan' x) m)
(return Leaf)
(You could also just write available
to begin with and skip available_
entirely. In my first iteration, that is what I did.) Now writing fromPreOrder
and lessThan
are a snap, and also I think give some insight into their behavior. I'll name them with primes so we can double-check they do the right thing with QuickCheck.
fromPreOrder' = available (const True) fromPreOrder'
lessThan' n = available (<n) (lessThan' n)
And in ghci:
> quickCheck (\xs -> fromPreOrder (xs :: [Int]) == evalState fromPreOrder' xs)
+++ OK, passed 100 tests.
While I can't answer the question about continuation passing, I believe that the State
monad based implementation can be written much more clearly. First, we can use notational convenience such as those from Control.Applicative
to make it easier to read. Second, we can upgrade the effect stack to include Maybe
in order to capture the notion of failure (a) from taking the head of an empty list and (b) from the (a >= n)
comparison as an effect.
import Control.Monad.State
import Control.Applicative
The final code uses the backtracking-state monad transformer stack. This means that we wrap State
around Maybe
instead of Maybe
around State
. In some sense we can think of this as meaning that failure is the "primary" effect. In practice it means that if the algorithm fails there's no way to continue using potentially bad state and so it must backtrack to the last known good state.
type Preord a b = StateT [a] Maybe b
Since we keep taking the head of a list and want to capture that failure correctly, we'll use a "safe head" function (which is the natural destructor of a list anyway, despite not being in the base Haskell libraries)
-- Safe list destructor
uncons :: [a] -> Maybe (a, [a])
uncons [] = Nothing
uncons (a:as) = Just (a, as)
If we look at it cleverly we'll notice that this is already exactly the form of our monadic computation (StateT [a] Maybe b
is isomorphic to [a] -> Maybe (b, [a])
). We'll give it a more evocative name when lifted into the Monad
.
-- Try to get the head or fail
getHead :: Preord a a
getHead = StateT uncons
A common feature of this algorithm is stopping local failures by providing a default value. I'll capture this in the certain
combinator
-- Provides a default value for a failing computation
certain :: b -> Preord a b -> Preord a b
certain def p = p <|> return def
And now we can write the final algorithm very cleanly in our Preord
monad.
fromPreOrder :: Ord a => Preord a (Tree a)
fromPreOrder = certain Leaf $ do
a <- getHead
Branch a <$> lessThan a <*> fromPreOrder
lessThan :: Ord a => a -> Preord a (Tree a)
lessThan n = certain Leaf $ do
a <- getHead
guard (a < n)
Branch a <$> lessThan a <*> lessThan n
Note that Applicative
style helps to indicate that we're building the components of the Branch
constructor using further effectful (state consuming) computations. The guard
short-circuits lessThan
when the pivot is already the least element in the pre-order traversal. We also explicitly see how both fromPreOrder
and lessThan
default out to Leaf
when they cannot compute a better result.
(Also note that fromPreOrder
and lessThan
are nearly identical now, a commonality Daniel Wagner exploited in his own answer when writing available
.)
We finally would want to hide all the monadic noise since, to an outside user, this is just a pure algorithm.
rebuildTree :: [a] -> Tree a
rebuildTree = fromMaybe Leaf . runStateT fromPreOrder
For a complete picture, here's the implementation of the algorithm using only the State
monad. Note all the extra noise for handling failure! We've absorbed the entire popElse
function into the effects of the backtracking state monad. We also lift the if
up into the failure effect. Without that effect stack, our combinators are terrifically specific to the application instead of decomplected and useful elsewhere.
-- Try to take the head of the state list and return the default
-- if that's not possible.
popElse :: b -> (a -> State [a] b) -> State [a] b
popElse def go = do
x <- get
case x of
[] -> return def
(a:as) -> put as >> go a
push :: a -> State [a] ()
push a = modify (a:)
fromPreOrder :: Ord a => State [a] (Tree a)
fromPreOrder = popElse Leaf $ \a -> Branch a <$> lessThan a <*> fromPreOrder
lessThan :: Ord a => a -> State [a] (Tree a)
lessThan n =
popElse Leaf $ \a ->
if a >= n
then push a >> return Leaf
else Branch a <$> lessThan a <*> lessThan n
As you've said, the state monad doesn't really improve the situation, and I don't think it can be expected to, as it's both much too general in that it allows arbitrary access to the state, and annoying in that it enforces unnecessary sequencing.
At first glance, this looks quite like a foldr
: we do one thing for the empty case, and in the (:)
case we take the head off and make a recursive call based on the tail. However, as the recursive call isn't just using the tail directly, it isn't quite a foldr
.
We could express it as a paramorphism but I don't think that really adds anything to the readability.
What I did notice is that the complicated recursion on the tail is all based on lessThan
, which led me to the following idea for breaking down the algorithm:
lessThans [] = []
lessThans (a:as) = (a, l) : lessThans bs
where (l, bs) = lessThan a as
fromPreOrder2 :: Ord a => [a] -> Tree a
fromPreOrder2 = foldr (\(a, l) r -> Branch a l r) Leaf . lessThans
I'm sure lessThans
could have a better name but I'm not quite sure what!
The foldr
can also be expressed as foldr (uncurry Branch) Leaf
but I'm not sure if that's an improvement.
EDIT: also, lessThans
is an unfoldr
, leading to this version:
fromPreOrder3 :: Ord a => [a] -> Tree a
fromPreOrder3 = foldr (uncurry Branch) Leaf . unfoldr lessThanList
lessThanList [] = Nothing
lessThanList (a:as) = Just ((a, l), bs)
where (l, bs) = lessThan a as
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With