Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

QuickCheck giving up investigating a recursive data structure (rose tree.)

Given an arbitrary tree, I can construct a subtype relation over that tree, using Schubert numbering:

constructH :: Tree a -> Tree (Type a)

where Type nests the original label, and additionally provides the data needed to perform child/parent (or subtype) checks. With Schubert Numbering, the two Int parameters are sufficient for that.

data Type a where !Int -> !Int -> a -> Type a

This leads to the binary predicate

subtypeOf :: Type a -> Type a -> Bool

I now want to test with QuickCheck that this does indeed do what I want it to do. The following property, however, does not work, because QuickCheck just gives up:

subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity Node { rootLabel = t, subForest = f } =
  let subtypes = concatMap flatten f
  in (not $ null subtypes) ==> conjoin
     (forAll (elements subtypes) (\x → x `subtypeOf` t):(map subtypeSanity f))

If I leave out the recursive call to subtypeSanity, i.e. the tail of the list I'm passing to conjoin, the property runs fine, but tests just the root node of the tree! How can I descend into my data structure recursively without QuickCheck giving up on generating new test cases?

If needed, I could provide the code to construct the Schubert Hierarchy, and the Arbitrary instance for Tree (Type a), to provide a complete runnable example, but that would be quite a bit of code. I'm convinced that I'm just not "getting" QuickCheck, and using it in the wrong way here.

EDIT: unfortunately, the sized function does not seem to eliminate the problem here. It ends up with the same result (see comment to J. Abrahamson's answer.)

EDIT II: I ended up "fixing" my problem by avoiding the recursive step, and avoiding conjoin. We just make a list of all nodes in the tree, then test the single-node property (which worked fine from the beginning) on those.

allNodes ∷ Tree a → [Tree a]
allNodes n@(Node { subForest = f }) = n:(concatMap allNodes f)

subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity tree = forAll (elements $ allNodes tree)
  (\(Node { rootLabel = t, subForest = f }) →
    let subtypes = concatMap flatten f
    in (not $ null subtypes) ==> forAll (elements subtypes) (\x → x `subtypeOf` t))

Tweaking the Arbitrary instance for trees did not work. Here is the arbitrary instance I'm still using:

instance (Arbitrary a, Eq a) ⇒ Arbitrary (Tree (Type a)) where
  arbitrary = liftM (constructH) $ sized arbTree

arbTree ∷ Arbitrary a ⇒ Int → Gen (Tree a)
arbTree n = do
  m ← choose (0,n)
  if m == 0
    then Node <$> arbitrary <*> (return [])
    else do part ← randomPartition n m
            Node <$> arbitrary <*> mapM arbTree part

-- this is a crude way to find a sufficiently random x1,..,xm,
-- such that x1 + .. + xm = n, for any n, m, with 0 < m.
randomPartition ∷ Int → Int → Gen [Int]
randomPartition n m' = do
  let m = m' - 1
  seed ← liftM ((++[n]) . sort) $ replicateM m (choose (0,n))
  return $ zipWith (-) seed (0:seed)

I consider the problem "solved for now," but if someone could explain to me why the recursive step and/or conjoin made QuickCheck give up (after passing "only" 0 tests,) I would be more than grateful.

like image 459
Aleksandar Dimitrov Avatar asked Dec 03 '13 13:12

Aleksandar Dimitrov


1 Answers

When generating Arbitrary recursive structures, QuickCheck is often a bit too eager and generates sprawling, enormous random examples. These are undesirable as they usually don't better check the properties of interest and can be very slow. Two solutions are

  1. Use things like the size parameter (sized function) and frequency function to bias the generator toward small trees.

  2. Use a small-type oriented generator like those in smallcheck. These try to exhaustively generate all "small" examples and thus help to keep the size of the tree down.

To clarify the sized and frequency method of controlling generation size, here's an example RoseTree

data Rose a = It a | Rose [Rose a]

instance Arbitrary a => Arbitrary (Rose a) where
  arbitrary = frequency 
    [ (3, It <$> arbitrary)                   -- The 3-to-1 ratio is chosen, ah,
                                              -- arbitrarily...
                                              -- you'll want to tune it
    , (1, Rose <$> children)
    ]
    where children = sized $ \n -> vectorOf n arbitrary

It can be done even more simply with a different Rose formation by very carefully controlling the size of the child list

data Rose a = Rose a [Rose a]

instance Arbitrary a => Arbitrary (Rose a) where
  arbitrary = Rose <$> arbitrary <*> sized (\n -> vectorOf (tuneUp n) arbitrary)
    where tuneUp n = round $ fromIntegral n / 4.0

You could do this without referencing sized, but that gives the user of your Arbitrary instance a knob to ask for larger trees if needed.

like image 73
J. Abrahamson Avatar answered Sep 28 '22 14:09

J. Abrahamson