I'm using QuickCheck to generate arbitrary functions, and I'd like to generate arbitrary injective functions (i.e. f a == f b
if and only if a == b
).
I thought I had it figured out:
newtype Injective = Injective (Fun Word Char) deriving Show
instance Arbitrary Injective where
arbitrary = fmap Injective fun
where
fun :: Gen (Fun Word Char)
fun = do
a <- arbitrary
b <- arbitrary
arbitrary `suchThat` \(Fn f) ->
(f a /= f b) || (a == b)
But I'm seeing cases where the generated function maps distinct inputs to the same output.
What I want:
What I think I have:
How can I fix this?
You've correctly identified the problem: what you're generating is functions with the property ∃ a≠b. f a≠f b
(which is readily true for most random functions anyway), whereas what you want is ∀ a≠b. f a≠f b
. That is a much more difficult property to ensure, because you need to know about all the other function values for generating each individual one.
I don't think this is possible to ensure for general input types, however for word specifically what you can do is “fake” a function by precomputing all the output values sequentially, making sure that you don't repeat one that has already been done, and then just reading off from that predetermined chart. It requires a bit of laziness fu to actually get this working:
import qualified Data.Set as Set
newtype Injective = Injective ([Char] {- simply a list without duplicates -})
deriving Show
instance Arbitrary Injective where
arbitrary = Injective . lazyNub <$> arbitrary
lazyNub :: Ord a => [a] -> [a]
lazyNub = go Set.empty
where go _ [] = []
go forbidden (x:xs)
| x `Set.member` forbidden = go forbidden xs
| otherwise = x : go (Set.insert x forbidden) xs
This is not very efficient, and may well not be ok for your application, but it's probably the best you can do.
In practice, to actually use Injective
as a function, you'll want to wrap the values in a suitable structure that has only O (log n) lookup time. Unfortunately, Data.Map.Lazy
is not lazy enough, you may need to hand-bake something like a list of exponentially-growing maps.
There's also the concern that for some insufficiently big result types, it is just not possible to generate injective functions because there aren't enough values available. In fact as Joseph remarked, this is the case here. The lazyNub
function will go into an infinite loop in this case. I'd say for a QuickCheck this is probably ok though.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With