Is there a way to avoid copying the whole search path of a binary tree on insert?

I've just started working my way through Okasaki's Purely Functional Data Structures, but have been doing things in Haskell rather than Standard ML. However, I've come across an early exercise (2.5) that's left me a bit stumped on how to do things in Haskell:

Inserting an existing element into a binary search tree copies the entire search path even though the copied nodes are indistinguishable from the originals. Rewrite insert using exceptions to avoid this copying. Establish only one handler per insertion rather than one handler per iteration.
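For reference, here is a rough Haskell sketch of the straightforward insert the exercise is talking about. The name insertCopying is made up for illustration, and the Tree type is the same one used further down. Every guard that recurses rebuilds a Fork on the way back up, so the whole search path is copied even when x turns out to be in the tree already.

```haskell
data Tree a = Empty | Fork (Tree a) a (Tree a)
  deriving (Show, Eq)

-- Hypothetical name, not from the book: the plain insert that
-- copies the search path.
insertCopying :: Ord a => a -> Tree a -> Tree a
insertCopying x Empty = Fork Empty x Empty
insertCopying x t@(Fork l y r)
  | x < y     = Fork (insertCopying x l) y r  -- new Fork allocated here
  | x > y     = Fork l y (insertCopying x r)  -- and here
  | otherwise = t  -- only the node holding x itself is reused
```

The result is structurally equal to the input when x is present, but every ancestor of the node holding x is a fresh copy.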

Now, my understanding is that ML, being an impure language, gets by with a conventional approach to exception handling not so different to, say, Java's, so you can accomplish it something like this:

datatype tree = E | T of tree * int * tree

exception ElementPresent

fun insert (x, t) =
  let fun go E = T (E, x, E)
        | go (T (l, y, r)) =
             if      x < y then T (go l, y, r)
             else if y < x then T (l, y, go r)
             else    raise ElementPresent
  in go t
  end
  handle ElementPresent => t

I don't have an ML implementation, so this may not be quite right in terms of the syntax.

My issue is that I have no idea how to do this in Haskell without doing everything in the IO monad. That seems like cheating, and even if it isn't, it would seriously limit the usefulness of a function that doesn't actually do any mutation. I could use the Maybe monad:

data Tree a = Empty | Fork (Tree a) a (Tree a)
        deriving (Show)

insert     :: (Ord a) => a -> Tree a -> Tree a
insert x t = maybe t id (go t)
  where
    go Empty = return (Fork Empty x Empty)
    go (Fork l y r)
      | x < y     = do l' <- go l; return (Fork l' y r)
      | x > y     = do r' <- go r; return (Fork l y r')
      | otherwise = Nothing

This means everything winds up wrapped in Just on the way back up when the element isn't found, which requires more heap allocation, and sort of defeats the purpose. Is this allocation just the price of purity?

EDIT to add: A lot of why I'm wondering about the suitability of the Maybe solution is that the optimization described only seems to save you all the constructor calls you would need in the case where the element already exists, which means heap allocations proportional to the length of the search path. The Maybe also avoids those constructor calls when the element already exists, but then you get a number of Just constructor calls equal to the length of the search path. I understand that a sufficiently smart compiler could elide all the Just allocations, but I don't know if, say, the current version of GHC is really that smart.

328

asked May 22 '14 13:05

Pillsy

1 Answer

In terms of cost, the ML version is actually very similar to your Haskell version.

Every recursive call in the ML version results in a stack frame. The same is true in the Haskell version. This is going to be proportional in size to the path that you traverse in the tree. Also, both versions will of course allocate new nodes for the entire path if an insertion is actually performed.

In your Haskell version, every recursive call might also eventually result in the allocation of a Just node. This will go on the minor heap, which is just a block of memory with a bump pointer. For all practical purposes, GHC's minor heap is roughly equivalent in cost to the stack. Since these are short-lived allocations, they won't normally end up being moved to the major heap at all.
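If the Just wrappers still bother you, one alternative (not something the question or the book prescribes, just a sketch) is to pass a continuation down instead of returning a Maybe up. The name insertCPS and the helper go are hypothetical. The continuation k knows how to rebuild the path above the current node, and it is simply dropped when the element is found, which plays the role of the single exception handler. Note that the closures for k are heap-allocated too, so this swaps one kind of short-lived allocation for another rather than eliminating it.

```haskell
data Tree a = Empty | Fork (Tree a) a (Tree a)
  deriving (Show, Eq)

-- Hypothetical name: a continuation-passing insert.
insertCPS :: Ord a => a -> Tree a -> Tree a
insertCPS x t = go t id
  where
    -- k rebuilds everything above the current node; it is only
    -- applied when x really has to be inserted.
    go Empty k = k (Fork Empty x Empty)
    go (Fork l y r) k
      | x < y     = go l (\l' -> k (Fork l' y r))
      | x > y     = go r (\r' -> k (Fork l y r'))
      | otherwise = t  -- element present: discard k, return the original tree
```

When x is already present, no Fork is constructed at all and the original t is returned as is. Whether this beats the Maybe version in practice is something you would have to measure.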

102

answered Oct 12 '22 22:10

Jake McArthur

Tags: functional-programming, haskell, binary-tree
