Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do some Data.List.Split functions use `Int` and others `Integral a`?

Tags:

haskell

Here are the type signatures of chunksOf and splitPlaces from Data.List.Split:

chunksOf :: Int -> [e] -> [[e]]
splitPlaces :: Integral a => [a] -> [e] -> [[e]]

Why do some functions (like chunksOf) use Int while others (like splitPlaces) use the more generic Integral a?

like image 316
Paul Manta Avatar asked Aug 25 '13 09:08

Paul Manta


Video Answer


1 Answers

In this answer, I try to look at the historical reasons for the inconsistent interface. Summary: It seems that Brent made some functions more generic on the fly to help formulating QuickCheck properties.

Why is splitPlaces generic?

It looks as if Brent generalized the type of splitPlaces to make it easier to introduce QuickCheck properties for that function. The QuickCheck properties use newtype wrappers to control test case generation, and with the Integral a constraint, splitPlaces could look through this newtype wrapper for the arithmetic operations. See also:

  • The patch that generalizes the type of splitPlaces and introduces the QuickCheck properties.

  • The QuickCheck documentation about these newtype wrappers.

However, here is one of the properties about splitPlaces:

prop_splitPlaces_preserve :: [NonNegative Integer] -> [Elt] -> Bool
prop_splitPlaces_preserve ps l = concat (splitPlaces ps' l) == genericTake (sum ps') l
  where ps' = map unNN ps

Note that QuickCheck automatically generates the list ps which is converted by map unNN ps before being passed to splitPlaces. The unNN function removes the NonNegative wrapper, so splitPlaces does not have to deal with the NonNegative wrapper itself. However, it receives an argument of type [Integer] instead of [Int], so it still needs to be generic in the numeric type.

What's the point of using [NonNegative Integer] instead of [NonNegative Int]?

I suspect that the property is false for [Int] because of arithmetic overflows when computing the sum. The property might even be falsifiable by QuickCheck because the Arbitrary [NonNegative Integer] instance will ultimately delegate to arbitrarySizedBoundedIntegral which can generate very large values.

I guess that using [NonNegative Integer] instead circumvents this problem in two ways:

  1. With Integer, no overflows can occur.
  2. The Arbitrary Integer instance delegates to arbitrarySizedIntegral which only generates small values.

So I guess that the reason for allowing arbitrary Integral types is that the QuickCheck property would fail for Int but succeeds for Integer.

Why is chunksOf not generic?

The properties for chunksOf use pattern matching to remove the newtype wrappers. See also:

  • The patch that introduces properties for splitEvery.

  • The patch that renames splitEvery to chunksOf.

Here is one of the properties about chunksOf:

prop_chunksOf_all_n :: Positive Int -> NonEmptyList Elt -> Bool
prop_chunksOf_all_n (Positive n) (NonEmpty l) = all ((==n) . length) (init $ chunksOf n l)

Note that this property matches on the arguments that are automatically generated by QuickCheck and passes them to chunksOf without the newtype wrapper. For the arguments necessary for testing chunksOf, this is easy to do, because the numbers are not nested in other types. Compare with prop_splitPlaces_preserve above, where converting [NonNegative Integer] to [Integer] or [Int] requires something more complicated than just pattern matching.

The difference between Arbitrary Int and Arbitrary Integer doesn't matter here, because the property does not involve any operations that can trigger an arithmetic overflow.

like image 136
Toxaris Avatar answered Sep 19 '22 21:09

Toxaris