
Using Parsec with Data.Text

Tags:

haskell

parsec

Using Parsec 3.1, it is possible to parse several types of inputs:

  • [Char] with Text.Parsec.String
  • Data.ByteString with Text.Parsec.ByteString
  • Data.ByteString.Lazy with Text.Parsec.ByteString.Lazy

I don't see anything for the Data.Text module. I want to parse Unicode content without suffering from the String inefficiencies. So I've created the following module based on the Text.Parsec.ByteString module:

    {-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
    {-# OPTIONS_GHC -fno-warn-orphans #-}

    module Text.Parsec.Text
        ( Parser, GenParser
        ) where

    import Text.Parsec.Prim

    import qualified Data.Text as T

    instance (Monad m) => Stream T.Text m Char where
        uncons = return . T.uncons

    type Parser = Parsec T.Text ()
    type GenParser t st = Parsec T.Text st

  1. Does it make sense to do so?
  2. Is this compatible with the rest of the Parsec API?

Additional comments:

I had to add the {-# LANGUAGE NoMonomorphismRestriction #-} pragma to my parser modules to make it work.
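
A possible alternative to the pragma, offered only as a sketch: giving each top-level parser an explicit type signature fixes the stream type, so the monomorphism restriction no longer gets in the way. The names `xs` and `runXs` are made up for illustration, and the sketch assumes a parsec version (3.1.2 or later) that ships its own Text.Parsec.Text rather than the module defined above.

```haskell
import qualified Data.Text as T
import Text.Parsec (ParseError, char, many1, parse)
import Text.Parsec.Text (Parser) -- assumption: parsec >= 3.1.2

-- Hypothetical parser. The signature pins the stream to strict Text,
-- so no NoMonomorphismRestriction pragma is needed in this module.
xs :: Parser String
xs = many1 (char 'x')

-- Hypothetical helper that runs the parser on a Text input.
runXs :: T.Text -> Either ParseError String
runXs = parse xs "sketch"
```

Loaded in GHCi, `runXs (T.pack "xxxy")` evaluates to `Right "xxx"`.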

Parsing Text is one thing; building an AST with Text is another. I will also need to pack each String before returning it:

    module TestText where

    import Data.Text as T

    import Text.Parsec
    import Text.Parsec.Prim
    import Text.Parsec.Text

    input = T.pack "xxxxxxxxxxxxxxyyyyxxxxxxxxxp"

    parser = do
      x1 <- many1 (char 'x')
      y <- many1 (char 'y')
      x2 <- many1 (char 'x')
      return (T.pack x1, T.pack y, T.pack x2)

    test = runParser parser () "test" input

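
One way to keep the packing out of the parser body, sketched under assumptions: wrap the String-producing parsers in a small helper and build the AST node applicatively. The `Runs` record, the `packed` helper, and `parseRuns` are hypothetical names, and the import of Text.Parsec.Text assumes parsec 3.1.2 or later.

```haskell
import qualified Data.Text as T
import Text.Parsec (ParseError, char, many1, parse)
import Text.Parsec.Text (Parser) -- assumption: parsec >= 3.1.2

-- Hypothetical AST node whose fields are Text rather than String.
data Runs = Runs
  { firstXs :: T.Text
  , middleYs :: T.Text
  , lastXs :: T.Text
  } deriving (Eq, Show)

-- Hypothetical helper: run a String parser and pack its result into Text.
packed :: Parser String -> Parser T.Text
packed = fmap T.pack

-- Same shape as the parser above: x's, then y's, then x's.
runs :: Parser Runs
runs =
  Runs
    <$> packed (many1 (char 'x'))
    <*> packed (many1 (char 'y'))
    <*> packed (many1 (char 'x'))

parseRuns :: T.Text -> Either ParseError Runs
parseRuns = parse runs "runs"
```

The intermediate String values still exist while parsing; the helper only hides the conversion, it does not remove its cost.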
Asked by gawi, Oct 31 '10 18:10


2 Answers

Since Parsec 3.1.2, support for Data.Text is built in. See http://hackage.haskell.org/package/parsec-3.1.2

If you are stuck with an older version, the code snippets in the other answers are helpful, too.
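
As a rough sketch of what the built-in support allows (assuming parsec 3.1.2 or later, where the Stream instances for strict and lazy Text come with the library), a parser written against the `Stream` class can be run on both Text flavours without any orphan instance. The names `digits`, `digitsStrict`, and `digitsLazy` are made up for this example.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import Text.Parsec (ParseError, ParsecT, Stream, digit, many1, parse)

-- Hypothetical parser, polymorphic in the input stream type.
digits :: Stream s m Char => ParsecT s u m String
digits = many1 digit

-- Run it on strict Text.
digitsStrict :: T.Text -> Either ParseError String
digitsStrict = parse digits "strict"

-- Run the same parser on lazy Text.
digitsLazy :: TL.Text -> Either ParseError String
digitsLazy = parse digits "lazy"
```

If a fixed stream type is preferable, the `Parser` synonym from Text.Parsec.Text (or Text.Parsec.Text.Lazy) can replace the polymorphic signature.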

Answered by Zouppen, Oct 10 '22 11:10


That looks like exactly what you need to do.

It should be compatible with the rest of Parsec, including the Text.Parsec.Char parsers.
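
To illustrate the point about the character parsers, here is a small sketch that runs `letter`, `digit`, `spaces`, `char`, and `eof` over a Text input. It is not the asker's module: it assumes a parsec version (3.1.2 or later) that provides the Text instance itself, and the "key = number" format and the names `setting` and `parseSetting` are invented for the example.

```haskell
import qualified Data.Text as T
import Text.Parsec (ParseError, char, digit, eof, letter, many1, parse, spaces)
import Text.Parsec.Text (Parser) -- assumption: parsec >= 3.1.2

-- Hypothetical format: a run of letters, '=', and a run of digits,
-- with optional whitespace around the '='.
setting :: Parser (T.Text, Int)
setting = do
  key <- many1 letter
  spaces
  _ <- char '='
  spaces
  value <- many1 digit
  eof
  -- 'read' is safe here because 'value' holds at least one ASCII digit.
  return (T.pack key, read value)

parseSetting :: T.Text -> Either ParseError (T.Text, Int)
parseSetting = parse setting "setting"
```

Every combinator used here has a type of the form `Stream s m Char => ParsecT s u m a`, which is why they work unchanged once a `Stream` instance for Text is in scope.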

If you're using Cabal to build your program, please put an upper bound of parsec-3.1 in your package description, in case the maintainer decides to include that instance in a future version of Parsec.

Answered by Antoine Latter, Oct 10 '22 11:10