In the module Text.ParserCombinators.ReadP
, munch
(and munch1
) documentation says:
Parses the first zero or more characters satisfying the predicate. Always succeds, exactly once having consumed all the characters Hence NOT the same as
(many (satisfy p))
.
How are they different?
First, let's find a counter-example and let's use Haskell tools to do it automatically. The QuickCheck library can give us such an counter-example very quickly:
import Data.Function (on)
import Text.ParserCombinators.ReadP
import Test.QuickCheck
prop_ParseTest :: String -> Property
prop_ParseTest input = on (===) runParser (munch pred) (many (satisfy pred))
where
runParser :: ReadP a -> [(a, String)]
runParser = flip readP_to_S input -- run a given parser on a given input
pred :: Char -> Bool
pred = (> 'A')
main :: IO ()
main = quickCheck prop_ParseTest
We ask it to test, whether the two parsers much pred
and many (satisfy pred)
are the same. QuickCheck immediately finds that they're different and tries to produce as short counter-example as possible:
*** Failed! Falsifiable (after 5 tests and 2 shrinks):
[("a","")] /= [("","a"),("a","")]
So munch pred
always consumes 'a'
unconditionally, while many (satisfy pred)
gives a nondeterministic answer - it might or might not not consume 'a'
.
For example, consider running the following two parsers on string "abcd"
:
test1 = munch (> 'A') >> string "cd"
test2 = many (satisfy (> 'A')) >> string "cd"
The first fails, because munch
consumes the whole string and then it's not possible to match "cd"
. The second one succeeds, because many (satisfy ...)
creates all possible branches
[("","abcd"),("a","bcd"),("ab","cd"),("abc","d"),("abcd","")]
and string "cd"
succeeds on the branch that consumed "ab"
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With