
How can I use a Parsec parser which has a different stream type than another Parsec parser?

Tags: haskell, parsec

I have a parser written with Text as the stream type, whereas the Text.Parsec.String module uses String as the stream type.

How can I use the custom written parser (Parsec Text b c) in the context of Parsec String b c?

Essentially, it seems I would need a function like this:

f :: Parsec Text b c -> Parsec String b c
f = undefined

Although it sounds possible, it seems like it might be quite complex to do.

Chris Stryczynski asked Aug 08 '20 15:08


1 Answer

It's gruesome, but relatively straightforward. The idea is to use the low-level functions runParsecT and mkPT to deconstruct and reconstruct the parser, bracketing it with adapters to modify the stream type of the incoming and outgoing state:

import Text.Parsec
import Data.Text (Text)
import qualified Data.Text as Text

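-- Adapt a parser over Text so that it can run inside a parser over String.
stringParser :: (Monad m) => ParsecT Text u m a -> ParsecT String u m a
stringParser p = mkPT $ \st -> (fmap . fmap . fmap) outReply $ runParsecT p (inState st)
  where -- Incoming state: pack the remaining String input into a Text.
        inState :: State String u -> State Text u
        inState  (State i pos u) = State (Text.pack i) pos u
        -- Outgoing reply: unpack the leftover Text input back into a String.
        outReply :: Reply Text u a -> Reply String u a
        outReply (Ok a (State i pos u) e) = Ok a (State (Text.unpack i) pos u) e
        outReply (Error e) = Error e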

It seems to work okay:

myTextParser :: Parsec Text () String
myTextParser = (:) <$> oneOf "abc" <*> many letter

myStringParser :: Parsec String () (String, String)
myStringParser = (,) <$> p <* spaces <*> p
  where p = stringParser myTextParser

main :: IO ()
main = do
  -- parseTest already prints the result (or the error), so no extra print is needed
  parseTest myStringParser "avocado butter"
  parseTest myStringParser "apple error"

giving:

λ> main
("avocado","butter")
parse error at (line 1, column 7):
unexpected "e"
expecting space

However, unless this is being used in a small, toy parser, there are likely to be some serious performance problems here. Each pack call takes the entire incoming stream and converts it to a Text value. If you are parsing from a lazy String (e.g., from a lazy I/O call), the first use of a converted parser will read the entire string into memory as a Text and pump it back out as a String; further calls to the same parser will re-pack the remaining stream as Text each time. Switching to lazy Text won't really help, since pack still packs the whole input into the "lazy" Text value.

You'll need to run some tests/benchmarks to see if this performance hit is acceptable in your application. Generally speaking, rewriting the Text parser (or seeing if it will compile with an abstract stream type) will be a better approach.
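As a rough illustration of the "abstract stream type" route, here is a minimal sketch. It assumes the Text parser only uses combinators that work on any stream of Char (such as oneOf, letter and spaces); the names word, wordFromText and pair are made up for this example and are not from the question. With the stream type left as a type variable, the same definition can be used from a String parser directly, with no pack/unpack adapter:

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Text.Parsec
import Data.Text (Text)
import qualified Data.Text as Text

-- Hypothetical rewrite of myTextParser: the stream type s is left abstract,
-- constrained only to produce Char tokens.
word :: Stream s m Char => ParsecT s u m String
word = (:) <$> oneOf "abc" <*> many letter

-- The same definition instantiated at Text ...
wordFromText :: Parsec Text () String
wordFromText = word

-- ... and used inside a String parser without any conversion.
pair :: Parsec String () (String, String)
pair = (,) <$> word <* spaces <*> word

main :: IO ()
main = do
  print (parse pair "" "avocado butter")              -- Right ("avocado","butter")
  print (parse wordFromText "" (Text.pack "cherry"))  -- Right "cherry"
```

Whether this is feasible depends on the original parser: if it calls Text-specific functions on the input, those parts would still need rewriting.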

Full code example:

{-# OPTIONS_GHC -Wall #-}

import Text.Parsec
import Data.Text (Text)
import qualified Data.Text as Text

stringParser :: (Monad m) => ParsecT Text u m a -> ParsecT String u m a
stringParser p = mkPT $ \st -> (fmap . fmap . fmap) outReply $ runParsecT p (inState st)
  where inState :: State String u -> State Text u
        inState  (State i pos u) = State (Text.pack i) pos u
        outReply :: Reply Text u a -> Reply String u a
        outReply (Ok a (State i pos u) e) = Ok a (State (Text.unpack i) pos u) e
        outReply (Error e) = Error e

myTextParser :: Parsec Text () String
myTextParser = (:) <$> oneOf "abc" <*> many letter

myStringParser :: Parsec String () (String, String)
myStringParser = (,) <$> p <* spaces <*> p
  where p = stringParser myTextParser

main :: IO ()
main = do
  -- parseTest already prints the result (or the error), so no extra print is needed
  parseTest myStringParser "avocado butter"
  parseTest myStringParser "apple error"

K. A. Buhr answered Oct 03 '22 21:10