Learn You a Haskell mentions difference lists (search for this term on that page), where a list l
is represented not directly but as a function (l++)
. This allows more efficient concatenation on both left and right. Concatenation becomes function composition, and one can finally convert to a real list by ($[])
. I was wondering what operations one can support efficiently on difference lists. For example, the equivalent of (:)
for difference lists is
\x l -> (x:) . l
Can one efficiently implement head
and tail
for difference lists? Here is the obvious implementation:
headTailDifList :: ([a] -> [a]) -> (a, [a] -> [a])
headTailDifList dl = (head l, ((tail l)++))
where
l = dl []
For real lists, \l -> (head l, tail l)
runs in constant time. What about this headTailDifList
? Perhaps due to lazy evaluation only the first element of l
will be evaluated?
headTailDifList
run in constant time?Is there some other constant-time implementation? Here's a candidate:
headTailDifList dl = (head (dl []), tail.dl )
However, the tail part does not throw an exception if dl
is id
(the empty difference list).
Edit: added more about the thunk.
Begin by looking just at the conversion from a difference list to a regular list. This operation alone takes only constant time, because no evaluation is required. The resulting list will exist as a thunk which will be structured something like this:
The base definition of (++)
is right-associative and needs to step through the entire first argument before it can continue with the second argument. This matches perfectly with the nested structure produced by the difference list, as each (++)
gets a single list chunk as its first argument. Furthermore, because (++)
is a good list producer, the entire structure exists lazily. Although fully evaluating it takes O(n), evaluating just the head is O(k) where k=number of chunks
.
Consider if a,b == []; c = [1..]
. Then head
would need to first check that a
and b
are empty before moving on to c
and finding the first element. In the worst case head
would traverse the entire structure before finding the empty list constructor. Practically this is a very rare case however, and even then it's not particularly harmful because encountering an empty chunk and moving on is a constant-time operation.
In general use, evaluating a difference list should differ from a regular list in time complexity by only a constant factor for equivalent operations.
Producing just the first element of a difference list doesn't require O(n) time, as can easily be demonstrated by using infinite lists:
*Dl Control.Monad Control.Applicative> head $ ($ []) $ toDl [1..] . toDl [4..]
1
(0.01 secs, 2110104 bytes)
*Dl Control.Monad Control.Applicative> fmap (head . ($ [])) $ headTailDifList (toDl [1..])
(1,2)
(0.00 secs, 2111064 bytes)
*Dl Control.Monad Control.Applicative> fmap (head . ($ [])) $ headTailDifList (toDl [1..] . toDl [3..] . toDl [] . toDl [5..])
(1,2)
(0.01 secs, 3105792 bytes)
-- create a difference list
toDl :: [a] -> ([a] -> [a])
toDl l = (l ++)
Have a look at the dlist package, which provides a typical implementation of difference lists. It defines a type DList
:
newtype DList a = DL { unDL :: [a] -> [a] }
and all common list functions, including head
and tail
:
head :: DList a -> a
head = list (error "Data.DList.head: empty list") const
tail :: DList a -> DList a
tail = list (error "Data.DList.tail: empty list") (flip const)
Here, list
is defined as:
list :: b -> (a -> DList a -> b) -> DList a -> b
list nill consit dl =
case toList dl of
[] -> nill
(x : xs) -> consit x (fromList xs)
with:
fromList :: [a] -> DList a
fromList = DL . (++)
toList :: DList a -> [a]
toList = ($[]) . unDL
That is, list
takes O(n) time to produce all elements. As pointed out by JohnL, just producing the first element may be done in constant time, though.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With