Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to deal with false positive on non-exhaustive pattern matches?

Say you have an elaborate set of functions, and knowing your design, you can tell that some combination of function and parameters never happen. That is something that is actually inferable by the compiler if it wanted to.

For the sake of clarity, take this example (don't tell me use map, it's an example):

processAll :: [Int] -> [Int]
processAll [] = []
processAll a = let (x, xs) = processOne a in x:processAll xs
    where
        processOne (x:xs) = (x+1,xs)

In this example, it is very obvious that processOne can never be called with an empty list. Compiling with ghc and adding -Wall warns that:

Pattern match(es) are non-exhaustive
In an equation for `processOne': Patterns not matched: []

Of course, I wouldn't want to disable such warnings in general because I may have actually missed a pattern match somewhere else. However, I would have expected ghc to be able to infer that this pattern list in fact is exhaustive in its domain.

The alternative solution to disabling warnings would be:

processAll :: [Int] -> [Int]
processAll [] = []
processAll a = let (x, xs) = processOne a in x:processAll xs
    where
        processOne (x:xs) = (x+1,xs)
        processOne _ = error "processor overheat - explosion imminent"

which is both redundant (because processOne [] would have resulted in error anyway) and tedious.

How should one generally handle this situation? Go ahead and add error messages on every impossible case?


In this particular example, I know there are better ways to handle this, for example by having the caller match on the pattern. So if you want here is another example that is a very simplified extract from a lexer I'm writing which you can run, too:

import Data.Char (isNumber, isAlpha)
import Control.Monad

data TokenType = ParenOpen          -- (
                | ParenClose        -- )
                | Plus              -- +
                | Number String     -- A number
                | Variable String   -- Anything else
                | End               -- End of the stream
               deriving (Show, Eq)

-- content is the content of a file from a line and column on
type Content = (String, Int, Int)

-- a token is a token and its position as matched by the lexer
type Token = (TokenType, Int, Int)

lexer :: String -> [Token]
lexer = lexAll . (\s -> (s, 1, 1))
    where
        -- make a maybe value based on a Bool
        makeMaybe :: Bool -> a -> Maybe a
        makeMaybe p x = if p then return x else Nothing

        -- advance the content by one, taking care of line and column numbers
        advance :: Content -> Content
        advance (x:xs, l, c) = (xs, l', c')
                            where
                                l' = if x == '\n' then l + 1 else l
                                c' = if x == '\n' then 1 else c + 1

        -- advance the content by n
        advance' n content = iterate advance content !! n

        -- match a single character
        matchExact :: Char -> Content -> Maybe Content
        matchExact y content@(x:_, _, _) = makeMaybe (x == y) $ advance content

        -- match while pattern holds for characters
        matchPattern :: (Char -> Bool) -> Content -> Maybe (String, Content)
        matchPattern p content@(xs, _, _) = makeMaybe (len > 0) (pfx, advance' len content)
                                    where
                                        pfx = takeWhile p xs
                                        len = length pfx

        matchParenOpen = matchExact '(' >=> (\c -> return (ParenOpen, c))
        matchParenClose = matchExact ')' >=> (\c -> return (ParenClose, c))
        matchPlus = matchExact '+' >=> (\c -> return (Plus, c))
        matchNumber = matchPattern isNumber >=> (\(s, c) -> return (Number s, c))
        matchVariable = matchPattern isAlpha >=> (\(s, c) -> return (Variable s, c))

        lexOne :: Content -> (Token, Content)
        lexOne cur@([], l, c) = ((End, l, c), cur)
        lexOne cur@(_, l, c) = let tokenMatchers = [matchParenOpen,
                                                      matchParenClose,
                                                      matchPlus,
                                                      matchNumber,
                                                      matchVariable
                                                     ] in
                                case msum $ map ($ cur) tokenMatchers of
                                    -- if nothing could be matched, generate an error and skip the character
                                    Nothing -> lexOne $ advance cur
                                    -- otherwise, this is an interesting token
                                    Just (t, cnt) -> ((t, l, c), cnt)

        lexAll :: Content -> [Token]
        lexAll ([], _, _) = []
        lexAll content = token:lexAll rest
                    where
                        (token, rest) = lexOne content

main :: IO ()
main = getContents >>= putStrLn . unlines . map (\(t, l, c) -> show l ++ ":" ++ show c ++ ": " ++ show t) . lexer

In the example above, lexOne ensures that none of the match* functions, and consequently the advance* functions are given a Content with an empty string. ghc warns that:

Pattern match(es) are non-exhaustive
In an equation for `advance': Patterns not matched: ([], _, _)

Pattern match(es) are non-exhaustive
In an equation for `matchExact': Patterns not matched: _ ([], _, _)

which I can tell for sure would never happen. What is the correct way of handling this issue?

like image 379
Shahbaz Avatar asked Jan 18 '26 21:01

Shahbaz


2 Answers

Why not just add a type for NonEmptyContent?

module SO24967745 where
import Control.Monad
import Data.Char

data TokenType = ParenOpen          -- (
                | ParenClose        -- )
                | Plus              -- +
                | Number String     -- A number
                | Variable String   -- Anything else
                | End               -- End of the stream
               deriving (Show, Eq)

-- content is the content of a file from a line and column on
type Content = (String, Int, Int)
type NonEmptyContent = (Char, String, Int, Int)

-- a token is a token and its position as matched by the lexer
type Token = (TokenType, Int, Int)

lexer :: String -> [Token]
lexer = lexAll . (\s -> (s, 1, 1))
    where
        -- make a maybe value based on a Bool
        makeMaybe :: Bool -> a -> Maybe a
        makeMaybe p x = if p then return x else Nothing

        toNonEmptyContent :: Content -> Maybe NonEmptyContent
        toNonEmptyContent ([], _, _) = Nothing
        toNonEmptyContent (x:xs,l,c) = Just (x,xs,l,c)

        toContent :: NonEmptyContent -> Content
        toContent (x, xs, l, c) = (x:xs, l, c)

        -- advance the content by one, taking care of line and column numbers
        advance :: NonEmptyContent -> Content
        advance (x, xs, l, c) = (xs, l', c')
                            where
                                l' = if x == '\n' then l + 1 else l
                                c' = if x == '\n' then 1 else c + 1

        -- advance the content by n
        advance' :: Int -> NonEmptyContent -> Maybe Content
        advance' n = foldr (>=>) Just (replicate n (fmap advance . toNonEmptyContent)) . toContent

        -- match a single character
        matchExact :: Char -> NonEmptyContent -> Maybe Content
        matchExact y content@(x,_, _, _) = makeMaybe (x == y) $ advance content

        -- match while pattern holds for characters
        matchPattern :: (Char -> Bool) -> NonEmptyContent -> Maybe (String, Content)
        matchPattern p content@(x,xs, _, _) = do
          let pfx = takeWhile p (x:xs)
              len = length pfx
          guard (len > 0) 
          content' <- advance' len content
          return (pfx, content')

        matchParenOpen = matchExact '(' >=> (\c -> return (ParenOpen, c))
        matchParenClose = matchExact ')' >=> (\c -> return (ParenClose, c))
        matchPlus = matchExact '+' >=> (\c -> return (Plus, c))
        matchNumber = matchPattern isNumber >=> (\(s, c) -> return (Number s, c))
        matchVariable = matchPattern isAlpha >=> (\(s, c) -> return (Variable s, c))

        lexOne :: Content -> (Token, Content)
        lexOne cur@([], l, c) = ((End, l, c), cur)
        lexOne (x:xs, l, c)   = let cur = (x,xs,l,c)
                                    tokenMatchers = [matchParenOpen,
                                                      matchParenClose,
                                                      matchPlus,
                                                      matchNumber,
                                                      matchVariable
                                                     ] in
                                case msum $ map ($ cur) tokenMatchers of
                                    -- if nothing could be matched, generate an error and skip the character
                                    Nothing -> lexOne $ advance cur
                                    -- otherwise, this is an interesting token
                                    Just (t, cnt) -> ((t, l, c), cnt)

        lexAll :: Content -> [Token]
        lexAll ([], _, _) = []
        lexAll content = token:lexAll rest
                    where
                        (token, rest) = lexOne content

main :: IO ()
main = getContents >>= putStrLn . unlines . map (\(t, l, c) -> show l ++ ":" ++ show c ++ ": " ++ show t) . lexer
like image 195
rampion Avatar answered Jan 20 '26 19:01

rampion


Even if the warning is indeed a false positive, you could take it as a hint thta your code is not entirely clear and take it as an opportunity to write more clear code. For example:

processAll :: [Int] -> [Int]
processAll [] = []
processAll (a:as) = let (x, xs) = processOne a as in x:processAll xs
where
    processOne x xs = (x+1,xs)

Benefit: You have a canonical, complete set of list patterns in the outer function. And the inner one reflects the fact that at least one value of type a is required.

Looking at the types, the type of the inner function is now

 a -> b -> (a,b)

instead of

 [a] -> (a, [a])

Clearly, this latter type alone shows that your previous version was non total.

like image 30
Ingo Avatar answered Jan 20 '26 19:01

Ingo



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!