I am trying to construct a lazy data structure that holds an infinite bitmap. I would like to support the following operations:
true :: InfBitMap
Returns an infinite bitmap of True, i.e. all positions should have value True.
falsify :: InfBitMap -> [Int] -> InfBitMap
Set all positions in the list to False. The list is possible infinite. For example, falsify true [0,2..] will return a list where all (and only) odd positions are True.
check :: InfBitMap -> Int -> Bool
Check the value of the index.
Here is what I could do so far.
-- InfBitMap will look like [(@), (@, @), (@, @, @, @)..]
type InfBitMap = [Seq Bool]
true :: InfBitMap
true = iterate (\x -> x >< x) $ singleton True
-- O(L * log N) where N is the biggest index in the list checked for later
-- and L is the length of the index list. It is assumed that the list is
-- sorted and unique.
falsify :: InfBitMap -> [Int] -> InfBitMap
falsify ls is = map (falsify' is) ls
where
-- Update each sequence with all indices within its length
-- Basically composes a list of (update pos False) for all positions
-- within the length of the sequence and then applies it.
falsify' is l = foldl' (.) id
(map ((flip update) False)
(takeWhile (< length l) is))
$ l
-- O(log N) where N is the index.
check :: InfBitMap -> Int -> Bool
check ls i = index (fromJust $ find ((> i) . length) ls) i
I am wondering if there is some Haskellish concept/data-structure that I am missing that would make my code more elegant / more efficient (constants do not matter to me, just order). I tried looking at Zippers and Lenses but they do not seem to help. I would like to keep the complexities of updates and checks logarithmic (maybe just amortized logarithmic).
Note: before someone suspects it, no this is not a homework problem!
Update:
It just occurred to me that check can be improved to:
-- O(log N) where N is the index.
-- Returns "collapsed" bitmap for later more efficient checks.
check :: InfBitMap -> Int -> (Bool, InfBitMap)
check ls i = (index l i, ls')
where
ls'@(l:_) = dropWhile ((<= i) . length) ls
Which can be turned into a Monad for code cleanliness.
A slight variation on the well-known integer trie seems to be applicable here.
{-# LANGUAGE DeriveFunctor #-}
data Trie a = Trie a (Trie a) (Trie a) deriving (Functor)
true :: Trie Bool
true = Trie True true true
-- O(log(index))
check :: Trie a -> Int -> a
check t i | i < 0 = error "negative index"
check t i = go t (i + 1) where
go (Trie a _ _) 1 = a
go (Trie _ l r) i = go (if even i then l else r) (div i 2)
--O(log(index))
modify :: Trie a -> Int -> (a -> a) -> Trie a
modify t i f | i < 0 = error "negative index"
modify t i f = go t (i + 1) where
go (Trie a l r) 1 = Trie (f a) l r
go (Trie a l r) i | even i = Trie a (go l (div i 2)) r
go (Trie a l r) i = Trie a l (go r (div i 2))
Unfortunately we can't use modify
to implement falsify
because we can't handle infinite lists of indices that way (all modifications have to be performed before an element of the trie can be inspected). Instead, we should do something more like a merge:
ascIndexModify :: Trie a -> [(Int, a -> a)] -> Trie a
ascIndexModify t is = go 1 t is where
go _ t [] = t
go i t@(Trie a l r) ((i', f):is) = case compare i (i' + 1) of
LT -> Trie a (go (2*i) l ((i', f):is)) (go (2*i+1) r ((i', f):is))
GT -> go i t is
EQ -> Trie (f a) (go (2*i) l is) (go (2*i+1) r is)
falsify :: Trie Bool -> [Int] -> Trie Bool
falsify t is = ascIndexModify t [(i, const False) | i <- is]
We assume strictly ascending indices in is
, since otherwise we would skip places in the trie or even get non-termination, for example in check (falsify t (repeat 0)) 1
.
The time complexities are a bit complicated by laziness. In check (falsify t is) index
, we pay an additional cost of a constant log 2 index
number of comparisons, and a further length (filter (<index) is)
number of comparisons (i. e. the cost of stepping over all indices smaller than what we're looking up). You could say it's O(max(log(index), length(filter (<index) is))
. Anyway, it's definitely better than the O(length is * log (index))
that we would get for a falsify
implemented for finite is
-es using modify
.
We must keep in mind that tree nodes are evaluated once, and subsequent check
-s for the same index after the first check
are not paying any extra cost for falsify
. Again, laziness makes this a bit complicated.
This falsify
is also pretty well-behaved when we want to traverse a prefix of a trie. Take this toList
function:
trieToList :: Trie a -> [a]
trieToList t = go [t] where
go ts = [a | Trie a _ _ <- ts]
++ go (do {Trie _ l r <- ts; [l, r]})
It's a standard breadth-first traversal, in linear time. The traversal time remains linear when we compute take n $ trieToList (falsify t is)
, since falsify
incurs at most n + length (filter (<n) is)
extra comparisons, which is at most 2 * n
, assuming strictly increasing is
.
(side note: the space requirement of breadth-first traversal is rather painful, but I can't see a simple way to help it, since iterative deepening is even worse here, because there the whole tree must be held in memory, while bfs only has to remember the bottom level of the tree).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With