I have a parser written with Text as the stream type, whereas the Text.Parsec.String module uses String as the stream type.
How can I use my custom parser (a Parsec Text b c) inside a parser of type Parsec String b c?
Essentially it seems I would need such a function:
f :: Parsec Text b c -> Parsec String b c
f = undefined
Although it sounds possible, it seems like it might be quite complex to do.
It's gruesome, but relatively straightforward. The idea is to use the low-level functions runParsecT and mkPT to deconstruct and reconstruct the parser, bracketing it with adapters to modify the stream type of the incoming and outgoing state:
import Text.Parsec
import Data.Text (Text)
import qualified Data.Text as Text

stringParser :: (Monad m) => ParsecT Text u m a -> ParsecT String u m a
stringParser p = mkPT $ \st -> (fmap . fmap . fmap) outReply $ runParsecT p (inState st)
  where inState :: State String u -> State Text u
        inState (State i pos u) = State (Text.pack i) pos u
        outReply :: Reply Text u a -> Reply String u a
        outReply (Ok a (State i pos u) e) = Ok a (State (Text.unpack i) pos u) e
        outReply (Error e) = Error e
It seems to work okay:
myTextParser :: Parsec Text () String
myTextParser = (:) <$> oneOf "abc" <*> many letter

myStringParser :: Parsec String () (String, String)
myStringParser = (,) <$> p <* spaces <*> p
  where p = stringParser myTextParser

-- parseTest prints the result itself; print then shows the () it returns.
main :: IO ()
main = do
  print =<< parseTest myStringParser "avocado butter"
  print =<< parseTest myStringParser "apple error"
giving:
λ> main
("avocado","butter")
()
parse error at (line 1, column 7):
unexpected "e"
expecting space
()
HOWEVER, there are likely to be some serious performance problems here, unless this is being used in a small, toy parser. The pack calls will take the entire incoming stream and convert it to a Text value. If you are parsing from a lazy String (e.g., from a lazy I/O call), the first use of a converted parser will read the entire string into memory as a Text and pump it back out as a String; further calls to the same parser will re-pack the remaining stream as Text each time. Switching to lazy Text won't really help, since pack still packs the whole input into the "lazy" Text value.
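To make the strictness issue concrete, here is a small sketch that feeds both a native String parser and the adapted Text parser an input whose tail throws if it is ever forced. The names textIdent, stringIdent, poisoned and survives are hypothetical helpers introduced only for this illustration; stringParser is the adapter from above, repeated so the block compiles on its own.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.Text (Text)
import qualified Data.Text as Text
import Text.Parsec

-- The adapter from the answer, repeated so this block is self-contained.
stringParser :: Monad m => ParsecT Text u m a -> ParsecT String u m a
stringParser p = mkPT $ \st -> (fmap . fmap . fmap) outReply $ runParsecT p (inState st)
  where
    inState (State i pos u) = State (Text.pack i) pos u
    outReply (Ok a (State i pos u) e) = Ok a (State (Text.unpack i) pos u) e
    outReply (Error e) = Error e

-- The same grammar written twice, once per stream type (hypothetical names).
textIdent :: Parsec Text () String
textIdent = (:) <$> oneOf "abc" <*> many letter

stringIdent :: Parsec String () String
stringIdent = (:) <$> oneOf "abc" <*> many letter

-- Hypothetical input: four readable characters, then a tail that throws
-- as soon as anything forces it.
poisoned :: String
poisoned = "abc " ++ error "input forced past the first token"

-- True if the parse completed without forcing the poisoned tail.
survives :: Parsec String () String -> IO Bool
survives p = do
  r <- try (evaluate (parse p "" poisoned))
         :: IO (Either SomeException (Either ParseError String))
  return (either (const False) (const True) r)

main :: IO ()
main = do
  survives stringIdent >>= print               -- native String parser
  survives (stringParser textIdent) >>= print  -- adapted Text parser
```

With strict Text, the first check should report True (the String parser stops at the space and never looks further) and the second False (Text.pack walks the whole input before the Text parser sees its first character). This only demonstrates the forcing behaviour; it says nothing about how large the cost is on real input.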
You'll need to run some tests/benchmarks to see if this performance hit is acceptable in your application. Generally speaking, rewriting the Text parser (or seeing if it will compile with an abstract stream type) will be a better approach.
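As a rough sketch of the "abstract stream type" option, assuming the parser only uses combinators that work over any character stream: give it a Stream s m Char constraint instead of fixing the stream to Text. The names identP and pairP are hypothetical stand-ins for myTextParser and myStringParser; whether your own parser generalises this easily depends on what it does with its input.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import qualified Data.Text as Text
import Text.Parsec

-- Same grammar as myTextParser, but polymorphic in the stream type s.
identP :: Stream s m Char => ParsecT s u m String
identP = (:) <$> oneOf "abc" <*> many letter

-- Two identifiers separated by optional whitespace, for any character stream.
pairP :: Stream s m Char => ParsecT s u m (String, String)
pairP = (,) <$> identP <* spaces <*> identP

main :: IO ()
main = do
  -- The same parser runs directly on a String ...
  print (parse pairP "" ("avocado butter" :: String))
  -- ... and on a Text, with no packing or unpacking in between.
  print (parse pairP "" (Text.pack "avocado butter"))
```

Both calls should print Right ("avocado","butter"). Because no conversion happens at the boundary, the repeated pack and unpack cost described above does not arise.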
Full code example:
{-# OPTIONS_GHC -Wall #-}

import Text.Parsec
import Data.Text (Text)
import qualified Data.Text as Text

-- Adapt a Text-stream parser so it can run inside a String-stream parser.
stringParser :: (Monad m) => ParsecT Text u m a -> ParsecT String u m a
stringParser p = mkPT $ \st -> (fmap . fmap . fmap) outReply $ runParsecT p (inState st)
  where inState :: State String u -> State Text u
        inState (State i pos u) = State (Text.pack i) pos u
        outReply :: Reply Text u a -> Reply String u a
        outReply (Ok a (State i pos u) e) = Ok a (State (Text.unpack i) pos u) e
        outReply (Error e) = Error e

myTextParser :: Parsec Text () String
myTextParser = (:) <$> oneOf "abc" <*> many letter

myStringParser :: Parsec String () (String, String)
myStringParser = (,) <$> p <* spaces <*> p
  where p = stringParser myTextParser

-- parseTest prints the result itself; print then shows the () it returns.
main :: IO ()
main = do
  print =<< parseTest myStringParser "avocado butter"
  print =<< parseTest myStringParser "apple error"