I'm trying to parse a string that can contain escaped characters. Here's an example:
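import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, anyChar, char, satisfy)
import qualified Data.Text as T

exampleParser :: Parser T.Text
exampleParser = T.pack <$> many (char '\\' *> escaped <|> anyChar)
  where escaped = satisfy (\c -> c `elem` ['\\', '"', '[', ']'])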
The parser above creates a String and then packs it into Text. Is there any way to parse a string with escapes like the above using the functions for efficient string handling that attoparsec provides, such as string, scan, runScanner, takeWhile, ...?
Parsing something like "one \"two\" \[three\]" would produce one "two" [three].
Update:
Thanks to @epsilonhalbe I was able to come up with a generalized solution that is perfect for my needs. Note that the following function doesn't look for matching escaped characters like [..], "..", (..), etc. Also, if it finds an escaped character that is not valid, it treats \ as a literal character.
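import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, char, satisfy)
import qualified Data.Attoparsec.Text as Atto
import Data.Text (Text)
import qualified Data.Text as T

-- isEscapable: characters that may follow a backslash; while: which plain characters to accept.
takeEscapedWhile :: (Char -> Bool) -> (Char -> Bool) -> Parser Text
takeEscapedWhile isEscapable while = do
  x <- normal
  xs <- many escaped
  return $ T.concat (x:xs)
  where
    normal = Atto.takeWhile (\c -> c /= '\\' && while c)
    escaped = do
      x <- (char '\\' *> satisfy isEscapable) <|> char '\\'
      xs <- normal
      return $ T.cons x xs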
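To make the behaviour described in the update concrete, here is a minimal usage sketch. The definition is repeated so the block compiles on its own; the names unescapeAll and csvField are hypothetical helpers introduced only for illustration, and the choice of escapable characters is an assumption taken from the question.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, char, parseOnly, satisfy)
import qualified Data.Attoparsec.Text as Atto
import Data.Text (Text)
import qualified Data.Text as T

-- Same definition as in the update above.
takeEscapedWhile :: (Char -> Bool) -> (Char -> Bool) -> Parser Text
takeEscapedWhile isEscapable while = do
  x <- normal
  xs <- many escaped
  return $ T.concat (x:xs)
  where
    normal = Atto.takeWhile (\c -> c /= '\\' && while c)
    escaped = do
      x <- (char '\\' *> satisfy isEscapable) <|> char '\\'
      xs <- normal
      return $ T.cons x xs

-- Hypothetical helper: accept every character, unescape \\ \" \[ and \].
unescapeAll :: Text -> Either String Text
unescapeAll = parseOnly (takeEscapedWhile (`elem` ['\\', '"', '[', ']']) (const True))

-- Hypothetical helper: read one comma-terminated field where \, is a literal comma.
csvField :: Text -> Either String Text
csvField = parseOnly (takeEscapedWhile (== ',') (/= ','))
```

Under these assumptions, unescapeAll applied to the text one \"two\" \[three\] should give one "two" [three], a backslash before a non-escapable character such as n is kept as a literal backslash, and csvField stops at the first unescaped comma. Note that the while predicate is only checked for plain characters, not for the character that follows a backslash.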
This is possible by writing some escaping code with attoparsec and text. Altogether it is pretty straightforward, seeing that you have already worked with parsers:
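{-# LANGUAGE OverloadedStrings #-}
module MyModule where

import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text as AT
import qualified Data.Text as T
import Data.Text (Text)

normal, escaped, quoted, brackted :: Parser Text

-- A run of characters up to (not including) the next backslash.
normal = AT.takeWhile (/= '\\')

-- Plain text, then any number of escaped groups, each followed by more plain text.
escaped = do r <- normal
             rs <- many escaped'
             return $ T.concat $ r:rs
  where escaped' = do r1 <- quoted <|> brackted
                      r2 <- normal
                      return $ r1 <> r2

-- \"...\" becomes "..."
quoted = do string "\\\""
            res <- normal
            string "\\\""
            return $ "\"" <> res <> "\""

-- \[...\] becomes [...]
brackted = do string "\\["
              res <- normal
              string "\\]"
              return $ "[" <> res <> "]"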
Then you can use it to parse the following test cases. parseOnly returns an Either String Text, so the result is unwrapped with either before printing:
Prelude> :l MyModule
Prelude MyModule> import Data.Attoparsec.Text as AT
Prelude MyModule AT> import Data.Text.IO as TIO
Prelude MyModule AT TIO> :set -XOverloadedStrings
Prelude MyModule AT TIO> either error TIO.putStrLn $ parseOnly escaped "test"
test
Prelude MyModule AT TIO> either error TIO.putStrLn $ parseOnly escaped "\\\"test\\\""
"test"
Prelude MyModule AT TIO> either error TIO.putStrLn $ parseOnly escaped "\\[test\\]"
[test]
Prelude MyModule AT TIO> either error TIO.putStrLn $ parseOnly escaped "test \\\"test\\\" \\[test\\]"
test "test" [test]
Note that you have to escape the escapes inside a Haskell string literal. That's why you see \\\" instead of \".
Also, if you just parse without printing through putStrLn, GHCi will show the Text values escaped, like
Right "test \"test\" [test]"
for the last example.
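The difference between the shown and the printed form can be checked with a small sketch that needs only the text package. The names input, parsed and report are hypothetical, and parsed simply stands in for the value the last test case returns.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- The literal below is written with doubled backslashes, but the Text it
-- denotes contains single ones: \[test\]
input :: Text
input = "\\[test\\]"

-- Stand-in for a parse result (assumed here, not produced by a parser).
parsed :: Either String Text
parsed = Right "test \"test\" [test]"

report :: IO ()
report = do
  print (T.length input)               -- counts characters of the value, not of the literal
  print parsed                         -- Show re-escapes the inner quotes
  either putStrLn TIO.putStrLn parsed  -- prints the raw characters
```

print goes through Show, which renders the value as a Haskell literal again, while TIO.putStrLn writes the characters themselves.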
If you parse a file, you simply write the escaped text in the file, without the extra level of escaping.
test.txt
I \[like\] \"Haskell\"
Then you can read and parse it:
Prelude MyModule AT TIO> file <- TIO.readFile "test.txt"
Prelude MyModule AT TIO> either error TIO.putStrLn $ parseOnly escaped file
I [like] "Haskell"
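Since the question also mentions scan, here is a hedged sketch of a two-pass variant: scan slices out the raw span up to the first unescaped double quote, and a pure function removes the backslashes afterwards. This is an alternative to the combinator-based parsers above, not a replacement for them; rawUntilQuote, unescape and escapedString are hypothetical names, and the set of escapable characters is assumed from the question.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text (Parser, parseOnly, scan)
import Data.Text (Text)
import qualified Data.Text as T

-- Consume input up to (not including) the first unescaped double quote.
-- The Bool state records whether the previous character was a backslash.
rawUntilQuote :: Parser Text
rawUntilQuote = scan False step
  where
    step True  _    = Just False  -- character after a backslash: always accept
    step False '\\' = Just True   -- start of an escape sequence
    step False '"'  = Nothing     -- unescaped quote: stop scanning
    step False _    = Just False

-- Drop the backslash in front of each escapable character; a backslash
-- before anything else (or at the very end) is kept as a literal.
unescape :: (Char -> Bool) -> Text -> Text
unescape isEscapable = go
  where
    go t = case T.uncons rest of
        Nothing -> plain
        Just (_, afterSlash) -> case T.uncons afterSlash of
          Just (c, more) | isEscapable c -> plain <> T.cons c (go more)
          _ -> plain <> T.cons '\\' (go afterSlash)
      where (plain, rest) = T.break (== '\\') t

-- Both passes combined, with the escapable set assumed from the question.
escapedString :: Parser Text
escapedString = unescape (`elem` ['\\', '"', '[', ']']) <$> rawUntilQuote
```

The scan pass never fails and returns a slice of the input, so the only allocation happens in unescape. Whether this is faster than the takeWhile-based version for a given input would need measuring.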