Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I pattern match with Data.Text in Haskell?

I am currently in the process of writing a parser in Haskell. I have the following code.

{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Data.Text

newtype Parser a = Parser { runParser :: Text -> Either Text (Text, a) }

char1 :: Char -> Parser Char
char1 c = Parser $ \case
    (x:xs) | x == c -> Right (xs, x)
    _               -> Left "Unexpected character"

It fails to compile with these two errors.

test.hs:12:6: error:
    • Couldn't match expected type ‘Text’ with actual type ‘[Char]’
    • In the pattern: x : xs
      In a case alternative: (x : xs) | x == c -> Right (xs, x)
      In the second argument of ‘($)’, namely
        ‘\case
           (x : xs) | x == c -> Right (xs, x)
           _ -> Left "Unexpected character"’
   |
12 |     (x:xs) | x == c -> Right (xs, x)
   |      ^^^^

test.hs:12:24: error:
    • Couldn't match type ‘[Char]’ with ‘Text’
      Expected type: Either Text (Text, Char)
        Actual type: Either Text ([Char], Char)
    • In the expression: Right (xs, x)
      In a case alternative: (x : xs) | x == c -> Right (xs, x)
      In the second argument of ‘($)’, namely
        ‘\case
           (x : xs) | x == c -> Right (xs, x)
           _ -> Left "Unexpected character"’
   |
12 |     (x:xs) | x == c -> Right (xs, x)
   |                        ^^^^^^^^^^^^^

I can fix the error by replacing the Text data type with String but I would prefer to use the Text data type.

Is there a way to pattern match with the Data.Text type without first explicitly converting it to a string first? Perhaps there is a GHC extension that would allow me to do this?

Thanks in advance.

like image 958
joe Avatar asked Feb 11 '21 19:02

joe


People also ask

How do you do pattern matching in Haskell?

You do that by putting a name and an @ in front of a pattern. For instance, the pattern xs@(x:y:ys). This pattern will match exactly the same thing as x:y:ys but you can easily get the whole list via xs instead of repeating yourself by typing out x:y:ys in the function body again.

What is text pattern matching?

Pattern matching is the process of checking whether a specific sequence of characters/tokens/data exists among the given data. Regular programming languages make use of regular expressions (regex) for pattern matching.

What is otherwise in Haskell?

It's not equivalent to _ , it's equivalent to any other identifier. That is if an identifier is used as a pattern in Haskell, the pattern always matches and the matched value is bound to that identifier (unlike _ where it also always matches, but the matched value is discarded).


2 Answers

A a refinement to @DanielWagner's answer, you can combine view patterns and pattern synonyms to do this. You'll need a new constructor in place of :, but it might look like:

{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}

import Data.Text

pattern x :> xs <- (uncons -> Just (x, xs))
pattern Empty <- (uncons -> Nothing)

findInText :: (Char -> Bool) -> Text -> Maybe Char
findInText _ Empty = Nothing
findInText p (x :> xs) | p x = Just x
                       | otherwise = findInText p xs

The idea here is that a pattern x :> xs is a synonym for the pattern uncons -> Just (x, xs) which is a view pattern that operates by applying uncons to the scrutinee and pattern-matching the result with Just (x, xs) to population x and xs for the parent pattern.

As per the comment, there might be some concern about whether this usage ends up calling uncons more than once. With optimization entirely shut off (-O0), the generated core does have multiple uncons calls:

-- unoptimized -O0
findInText
  = \ ds ds1 ->
      case uncons ds1 of {
        Nothing -> Nothing;
        Just ipv ->
          case uncons ds1 of {
            Nothing -> ...

With optimization on (-O or -O2), everything gets inlined and the generated core is incredibly complicated because of the Unicode processing going on. However, if you also define:

findInText' :: (Char -> Bool) -> Text -> Maybe Char
findInText' p txt = case uncons txt of
  Nothing -> Nothing
  Just (x, xs) | p x -> Just x
               | otherwise -> findInText' p xs

it turns out that GHC compiles findInText' to:

findInText' = findInText

so it looks like in this case at least, GHC doesn't do any extra work as a result of the view patterns.

like image 99
K. A. Buhr Avatar answered Oct 20 '22 19:10

K. A. Buhr


You can match on a call to uncons.

case uncons text of
    Just (x, xs) -> ...
    Nothing -> ...

View patterns let you do this within the pattern instead of within the scrutinee, but require you to say uncons once for each pattern.

case text of
    (uncons -> Just (x, xs)) -> ...
    (uncons -> Nothing) -> ...
like image 43
Daniel Wagner Avatar answered Oct 20 '22 20:10

Daniel Wagner