Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast filtering over Bounded Enum type

Let's say I have a function that takes some predicate f and applies it to the whole domain:

someFilter :: (Enum a, Bounded a) => (a -> Bool) -> [a]
someFilter f = filter f [minBound ..]

I also know that this predicate f returns True only for a relatively small number of arguments, e.g.

isA c = (c == 'A')

Is it possible to somehow make someFilter fast, i.e. not dependent on the number of possible values of type a, without changing its signature?

like image 350
mariusz Avatar asked May 20 '26 08:05

mariusz


2 Answers

You can't really do this in general, since all you've given the function is that a is Enum and Bounded, which means you'd never be able to tell if the predicate even used == (not that you can do this anyway). Instead, you can write a custom data type that can be inspected before performing the filter:

import Control.Applicative

infixr 2 `Or`
infixr 3 `And`
data Pred a
    = LessThan a
    | GreaterThan a
    | EqualTo a
    | Generic (a -> Bool)
    | And (Pred a) (Pred a)
    | Or  (Pred a) (Pred a)

someFilter :: (Enum a, Bounded a, Ord a) => Pred a -> [a]
someFilter p = go p [minBound ..]
    where
        go :: (Enum a, Bounded a, Ord a) => Pred a -> [a] -> [a]
        go (LessThan x)    = takeWhile (< x)
        go (GreaterThan x) = dropWhile (<= x)
        go (EqualTo x)     = const [x]
        go (Generic f)     = filter f
        go (And p q)       = go q . go p
        go (Or  p q)       = (++) <$> go p <*> go q

naiveFilter :: (Enum a, Bounded a) => (a -> Bool) -> [a]
naiveFilter f = filter f [minBound ..]

This gives you most of the tools you need (not all of them, this is just a rough outline) to recreate basic boolean combinations of ordering predicates with support for generic predicates. For certain cases someFilter will perform much, much better than naiveFilter, such as for someFilter (LessThan 128) :: [Word32] versus naiveFilter (< 128) :: [Word32]. On my computer the former finishes in essentially 0s, while the latter I didn't even let run to completion. You can also run queries like someFilter (And (LessThan 128) (GreaterThan 256)) :: [Word32] that finish immediately, but would take forever with naiveFilter. I've used Word32 instead of Int here to avoid the problem of negative numbers.

If you want to combine a bunch of predicates together you can do

allPs, anyPs :: [Pred a] -> Pred a
allPs = foldr1 And
anyPs = foldr1 Or

Or you could try building a balanced tree out of it, but this will suffice for simplicity. It does let you express rather complicated queries fairly easy:

> :set +s
> someFilter $ EqualTo 1 `Or` EqualTo 100 `Or` LessThan 1000 `And` GreaterThan 975 :: [Word32]
[1,100,976,977,978,979,980,981,982,983,984,985,986,987,988,989,990,991,992,993,994,995,996,997,998,999]
(0.00 secs, 0 bytes)

Be warned that order of operations matters here because of takeWhile and dropWhile, and you can also end up with duplicates (try making a greater-than-or-equals and less-than-or-equals, you'll see what I mean). These fixes will start to eat at the performance of this algorithm, but in general you should see no worse performance than naiveFilter.

like image 94
bheklilr Avatar answered May 22 '26 23:05

bheklilr


Consider a being Word160 and f being (== w) . sha1 for a fixed word w :: Word160, where sha1 :: Word160 -> Word160 is some implementation of SHA-1. Then the predicate is true only for a very small number of inputs (most likely 1). If you had a fast way how to find matching values, you'd be constructing an inverse function to SHA-1, which would allow you to crack digital signatures and all sorts of similar stuff. So no, there is no efficient way how to do this, unless you restrict the predicate somehow.

like image 30
Petr Avatar answered May 23 '26 00:05

Petr



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!