I am currently in the process of writing a parser in Haskell. I have the following code.
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Data.Text
newtype Parser a = Parser { runParser :: Text -> Either Text (Text, a) }
char1 :: Char -> Parser Char
char1 c = Parser $ \case
(x:xs) | x == c -> Right (xs, x)
_ -> Left "Unexpected character"
It fails to compile with these two errors.
test.hs:12:6: error:
• Couldn't match expected type ‘Text’ with actual type ‘[Char]’
• In the pattern: x : xs
In a case alternative: (x : xs) | x == c -> Right (xs, x)
In the second argument of ‘($)’, namely
‘\case
(x : xs) | x == c -> Right (xs, x)
_ -> Left "Unexpected character"’
|
12 | (x:xs) | x == c -> Right (xs, x)
| ^^^^
test.hs:12:24: error:
• Couldn't match type ‘[Char]’ with ‘Text’
Expected type: Either Text (Text, Char)
Actual type: Either Text ([Char], Char)
• In the expression: Right (xs, x)
In a case alternative: (x : xs) | x == c -> Right (xs, x)
In the second argument of ‘($)’, namely
‘\case
(x : xs) | x == c -> Right (xs, x)
_ -> Left "Unexpected character"’
|
12 | (x:xs) | x == c -> Right (xs, x)
| ^^^^^^^^^^^^^
I can fix the error by replacing the Text
data type with String
but I would prefer to use the Text
data type.
Is there a way to pattern match with the Data.Text
type without first explicitly converting it to a string first? Perhaps there is a GHC extension that would allow me to do this?
Thanks in advance.
You do that by putting a name and an @ in front of a pattern. For instance, the pattern xs@(x:y:ys). This pattern will match exactly the same thing as x:y:ys but you can easily get the whole list via xs instead of repeating yourself by typing out x:y:ys in the function body again.
Pattern matching is the process of checking whether a specific sequence of characters/tokens/data exists among the given data. Regular programming languages make use of regular expressions (regex) for pattern matching.
It's not equivalent to _ , it's equivalent to any other identifier. That is if an identifier is used as a pattern in Haskell, the pattern always matches and the matched value is bound to that identifier (unlike _ where it also always matches, but the matched value is discarded).
A a refinement to @DanielWagner's answer, you can combine view patterns and pattern synonyms to do this. You'll need a new constructor in place of :
, but it might look like:
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}
import Data.Text
pattern x :> xs <- (uncons -> Just (x, xs))
pattern Empty <- (uncons -> Nothing)
findInText :: (Char -> Bool) -> Text -> Maybe Char
findInText _ Empty = Nothing
findInText p (x :> xs) | p x = Just x
| otherwise = findInText p xs
The idea here is that a pattern x :> xs
is a synonym for the pattern uncons -> Just (x, xs)
which is a view pattern that operates by applying uncons
to the scrutinee and pattern-matching the result with Just (x, xs)
to population x
and xs
for the parent pattern.
As per the comment, there might be some concern about whether this usage ends up calling uncons
more than once. With optimization entirely shut off (-O0
), the generated core does have multiple uncons
calls:
-- unoptimized -O0
findInText
= \ ds ds1 ->
case uncons ds1 of {
Nothing -> Nothing;
Just ipv ->
case uncons ds1 of {
Nothing -> ...
With optimization on (-O
or -O2
), everything gets inlined and the generated core is incredibly complicated because of the Unicode processing going on. However, if you also define:
findInText' :: (Char -> Bool) -> Text -> Maybe Char
findInText' p txt = case uncons txt of
Nothing -> Nothing
Just (x, xs) | p x -> Just x
| otherwise -> findInText' p xs
it turns out that GHC compiles findInText'
to:
findInText' = findInText
so it looks like in this case at least, GHC doesn't do any extra work as a result of the view patterns.
You can match on a call to uncons
.
case uncons text of
Just (x, xs) -> ...
Nothing -> ...
View patterns let you do this within the pattern instead of within the scrutinee, but require you to say uncons
once for each pattern.
case text of
(uncons -> Just (x, xs)) -> ...
(uncons -> Nothing) -> ...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With