(Inspired by my answer to this question.)
Consider this code (it's supposed to find the largest element that's less than or equal to a given input):
data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)
closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Nothing where
precise :: Maybe (Integer, v) -> TreeMap v -> Maybe (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise closestSoFar l
EQ -> Just (k, v)
GT -> precise (Just (k, v)) r
This isn't very lazy. Once the GT
case is entered, we know for sure that the final return value will be Just
something rather than Nothing
, but the Just
still isn't available until the end. I'd like to make this lazier so that the Just
is available as soon as the GT
case is entered. My test case for this is that I want Data.Maybe.isJust $ closestLess 5 (Node 3 () Leaf undefined)
to evaluate to True
rather than bottoming. Here's one way I can think to do this:
data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)
closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess _ Leaf = Nothing
closestLess i (Node k v l r) = case i `compare` k of
LT -> closestLess i l
EQ -> Just (k, v)
GT -> Just (precise (k, v) r)
where
precise :: (Integer, v) -> TreeMap v -> (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise closestSoFar l
EQ -> (k, v)
GT -> precise (k, v) r
However, I'm now repeating myself: the core logic is now in both closestLess
and in precise
. How can I write this so that it's lazy but without repeating myself?
Rather than use explicit wrappers, you can leverage the type system. Note that the version of precise
that uses Maybe
for your first code snippet:
precise :: Maybe (Integer, v) -> TreeMap v -> Maybe (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise closestSoFar l
EQ -> Just (k, v)
GT -> precise (Just (k, v)) r
is almost exactly the same algorithm as the version of precise
without Maybe
from your second code snippet, which could be written in the Identity
functor as:
precise :: Identity (Integer, v) -> TreeMap v -> Identity (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise closestSoFar l
EQ -> Identity (k, v)
GT -> precise (Identity (k, v)) r
These can be unified into a version polymorphic in the Applicative
:
precise :: (Applicative f) => f (Integer, v) -> TreeMap v -> f (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise closestSoFar l
EQ -> pure (k, v)
GT -> precise (pure (k, v)) r
By itself, that doesn't accomplish much, but if we know that the GT
branch will always return a value, we can force it to run in the Identity
functor, regardless of the starting functor. That is, we can start in the Maybe
functor but recurse into the Identity
functor in the GT
branch:
closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Nothing
where
precise :: (Applicative t) => t (Integer, v) -> TreeMap v -> t (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise closestSoFar l
EQ -> pure (k, v)
GT -> pure . runIdentity $ precise (Identity (k, v)) r
This works fine with your test case:
> isJust $ closestLess 5 (Node 3 () Leaf undefined)
True
and is a nice example of polymorphic recursion.
Another nice thing about this approach from a performance point of view is that the -ddump-simpl
shows that there are no wrappers or dictionaries. It's all been erased at the type level with specialized functions for the two functors:
closestLess
= \ @ v i eta ->
letrec {
$sprecise
$sprecise
= \ @ v1 closestSoFar ds ->
case ds of {
Leaf -> closestSoFar;
Node k v2 l r ->
case compareInteger i k of {
LT -> $sprecise closestSoFar l;
EQ -> (k, v2) `cast` <Co:5>;
GT -> $sprecise ((k, v2) `cast` <Co:5>) r
}
}; } in
letrec {
$sprecise1
$sprecise1
= \ @ v1 closestSoFar ds ->
case ds of {
Leaf -> closestSoFar;
Node k v2 l r ->
case compareInteger i k of {
LT -> $sprecise1 closestSoFar l;
EQ -> Just (k, v2);
GT -> Just (($sprecise ((k, v2) `cast` <Co:5>) r) `cast` <Co:4>)
}
}; } in
$sprecise1 Nothing eta
I think the CPS version you answered with yourself is the best but for completeness here are a few more ideas. (EDIT: Buhr's answer is now the most performant.)
The first idea is to get rid of the "closestSoFar
" accumulator, and instead let the GT
case handle all the logic of picking the rightmost value smallest than the argument. In this form, the GT
case can directly return a Just
:
closestLess1 :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess1 _ Leaf = Nothing
closestLess1 i (Node k v l r) =
case i `compare` k of
LT -> closestLess1 i l
EQ -> Just (k, v)
GT -> Just (fromMaybe (k, v) (closestLess1 i r))
This is simpler, but takes a bit more space on the stack when you hit a lot of GT
cases. Technically you could even use that fromMaybe
in the accumulator form (i.e., replacing the fromJust
implicit in luqui's answer), but that would be a redundant, unreachable branch.
The other idea that there's really two "phases" of the algorithm, one before and one after you hit a GT
, so you parameterize it by a boolean to represent these two phases, and use dependent types to encode the invariant that there will always be a result in the second phase.
data SBool (b :: Bool) where
STrue :: SBool 'True
SFalse :: SBool 'False
type family MaybeUnless (b :: Bool) a where
MaybeUnless 'True a = a
MaybeUnless 'False a = Maybe a
ret :: SBool b -> a -> MaybeUnless b a
ret SFalse = Just
ret STrue = id
closestLess2 :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess2 i = precise SFalse Nothing where
precise :: SBool b -> MaybeUnless b (Integer, v) -> TreeMap v -> MaybeUnless b (Integer, v)
precise _ closestSoFar Leaf = closestSoFar
precise b closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise b closestSoFar l
EQ -> ret b (k, v)
GT -> ret b (precise STrue (k, v) r)
Starting from my non-lazy implementation, I first refactored precise
to receive Just
as an argument, and generalized its type accordingly:
data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)
closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Just Nothing where
precise :: ((Integer, v) -> t) -> t -> TreeMap v -> t
precise _ closestSoFar Leaf = closestSoFar
precise wrap closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise wrap closestSoFar l
EQ -> wrap (k, v)
GT -> precise wrap (wrap (k, v)) r
Then, I changed it to do wrap
early and call itself with id
in the GT
case:
data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)
closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Just Nothing where
precise :: ((Integer, v) -> t) -> t -> TreeMap v -> t
precise _ closestSoFar Leaf = closestSoFar
precise wrap closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise wrap closestSoFar l
EQ -> wrap (k, v)
GT -> wrap (precise id (k, v) r)
This still works exactly as before, except for the benefit of the added laziness.
How about
GT -> let Just v = precise (Just (k,v) r) in Just v
?
Not only do we always know Just
, after its first discovery, we also always know Nothing
until then. That's actually two different "logics".
So, we go left first of all, so make that explicit:
data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v)
deriving (Show, Read, Eq, Ord)
closestLess :: Integer
-> TreeMap v
-> Maybe (Integer, v)
closestLess i = goLeft
where
goLeft :: TreeMap v -> Maybe (Integer, v)
goLeft n@(Node k v l _) = case i `compare` k of
LT -> goLeft l
_ -> Just (precise (k, v) n)
goLeft Leaf = Nothing
-- no more maybe if we're here
precise :: (Integer, v) -> TreeMap v -> (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
LT -> precise closestSoFar l
EQ -> (k, v)
GT -> precise (k, v) r
The price is we repeat at most one step at most once.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With