How do inits and tails work in Data.Sequence?

Question

Louis Wasserman wrote the current implementations of inits and tails in Data.Sequence. He indicates that they are very efficient, and indeed just looking at the code I can see that whatever they are doing, they are doing it in the clean, top-down sort of fashion that tends to lead to good performance for lazy trees. Unfortunately, I can't actually make heads or tails of just what they are doing. Can anyone give me a hand? The code is a bit long, but can be found on Hackage.

Louis Wasserman · Accepted Answer

It's me!

I think the best approach is probably to work through an example. Let's have a go...

Deep (Two 1 2)                                                    (Two 7 8))
                (Deep (One (Node2 3 4))        (One (Node2 5 6))
                                         Empty

Here's a Sequence, somewhat simplified (e.g. omitting the Elem wrappers).

Let's do inits on this; tails is essentially symmetric. Our recursive algorithm is going to leave out the empty init and only include nonempty things.

Prefix

So the prefix digit of inits is going to be generated with, essentially, fmap digitToTree (initsDigit (Two 1 2)).

initsDigit (Two 1 2) = Two (One 1) (Two 1 2)
fmap digitToTree (Two (One 1) (Two 1 2)) = 
    Two (Single 1) (Deep (One 1) Empty (One 2))

So those are the first two inits of the whole thing, and this digit is going to be the prefix digit of the result of inits. (Except we're going to prepend the empty sequence after all we're done, but let's ignore that for now.)

Inner tree

Now let's take the inits of the inner tree, treated as a FingerTree (Node a) -- we're not going to pull apart the nodes yet, it's just a two-element FingerTree containing two nodes. I'm not going to do the details of this, it's just recursing through the same algorithm, I'm just going to magically reach the result

Deep 
    (One (Single (Node2 3 4))) 
    Empty 
    (One (Deep (One (Node2 3 4)) Empty (One (Node2 5 6))))
  :: FingerTree (FingerTree (Node a))

So these are the inits of the inner tree. How do these correspond to inits of the outer tree? Each init of the inner tree corresponds to a tree containing

the prefix digit of the original tree, Two 1 2
all but the last Node of the init of the inner tree
some prefix of the last Node of the init of the inner tree

So each FingerTree (Node a) acquired by taking an init of the inner tree will map to a Node (FingerTree a), containing a FingerTree for each init of the last node in the FingerTree (Node a).

So for example, Single (Node2 3 4), with its last node extracted, will be decomposed to Empty and Node2 3 4, and the resulting Node (FingerTree a) is

Node2 
   (Deep (Two 1 2 {- prefix of original tree -}) 
         Empty 
         (One 3 {- first prefix of Node2 3 4 -}))
   (Deep (Two 1 2) 
         Empty 
         (Two 3 4 {- second prefix of Node2 3 4 -}))

And for the other prefix of the inner tree, Deep (One (Node2 3 4)) Empty (One (Node2 5 6)), extracting the last Node gives us the remainder Single (Node2 3 4) and the extracted node Node2 5 6, so the resulting Node (FingerTree a):

Node2
   (Deep (Two 1 2 {- prefix of original tree -})
         (Single (Node2 3 4) {- init of the inner tree minus the last Node -})
         (One 5 {- first prefix of Node2 5 6 -})
   (Deep (Two 1 2 {- prefix of original tree -})
         (Single (Node2 3 4) {- init of the inner tree minus the last Node -})
         (Two 5 6 {- second prefix of Node2 5 6 -}))

So this is an operation that takes a FingerTree (Node a), a single init of the inner tree, to a Node (FingerTree a). So, having recursively acquired the inits of the inner tree as a FingerTree (FingerTree (Node a)), we map this function over them to get a FingerTree (Node (FingerTree a)), which is exactly what we wanted; it's the inner tree of the inits of the entire thing.

Suffix

Finally, there are inits of the original tree composed of

the original prefix
the original inner tree
each init of the suffix of the original tree

and these become the suffix digit of the tree of inits. initsDigit (Two 7 8) returns Two (One 7) (Two 7 8), and we essentially just map \sf -> Deep pr m sf over this, to get

Two 
   (Deep (Two 1 2 {- original -})
         (Deep (One (Node2 3 4)) Empty (One (Node2 5 6)) {- original -})
         (One 7 {- first init of original suffix digit -}))
   (Deep (Two 1 2 {- original -})
         (Deep (One (Node2 3 4)) Empty (One (Node2 5 6)) {- original -})
         (Two 7 8 {- second init of original suffix digit -}))

So, this isn't quite how the code is organized. We've described a function from FingerTree a to FingerTree (FingerTree a), but the actual implementation is essentially that plus an fmap, because we end up always needing to map the elements somehow -- even if it's just wrapping newtypes. But this is basically what we're doing.

How do inits and tails work in Data.Sequence?

Tags:

data-structures

haskell

sequence

dfeuer

Video Answer

1 Answers

Louis Wasserman

Recent Activity

Donate For Us

How do inits and tails work in Data.Sequence?

Tags:

data-structures

haskell

sequence

dfeuer

Video Answer

1 Answers

Louis Wasserman

Related questions

Recent Activity

Donate For Us