Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to parse arbitrary lists with Haskell parsers?

Is it possible to use one of the parsing libraries (e.g. Parsec) for parsing something different than a String? And how would I do this?

For the sake of simplicity, let's assume the input is a list of ints [Int]. The task could be

  • drop leading zeros
  • parse the rest into the pattern (S+L+)*, where S is a number less than 10, and L is a number larger or equal to ten.
  • return a list of tuples (Int,Int), where fst is the product of the S and snd is the product of the L integers

It would be great if someone could show how to write such a parser (or something similar).

like image 539
ziggystar Avatar asked Jan 10 '23 09:01

ziggystar


1 Answers

Yes, as user5402 points out, Parsec can parse any instance of Stream, including arbitrary lists. As there are no predefined token parsers (as there are for text) you have to roll your own, (myToken below) using e.g. tokenPrim

The only thing I find a bit awkward is the handling of "source positions". SourcePos is an abstract type (rather than a type class) and forces me to use its "filename/line/column" format, which feels a bit unnatural here.

Anyway, here is the code (without the skipping of leading zeroes, for brevity)

import Text.Parsec

myToken ::  (Show a) => (a -> Bool) -> Parsec [a] () a
myToken test = tokenPrim show incPos $ justIf test where
  incPos pos _ _ = incSourceColumn pos 1
  justIf test x = if (test x) then Just x else Nothing

small = myToken  (< 10)
large = myToken  (>= 10)

smallLargePattern = do
  smallints <- many1 small
  largeints <- many1 large
  let prod = foldl1 (*)
  return (prod smallints, prod largeints)

myIntListParser :: Parsec [Int] () [(Int,Int)]
myIntListParser = many smallLargePattern

testMe :: [Int] -> [(Int, Int)]
testMe xs = case parse myIntListParser "your list" xs of
  Left err -> error $ show err
  Right result -> result

Trying it all out:

*Main> testMe [1,2,55,33,3,5,99]
[(2,1815),(15,99)]
*Main> testMe [1,2,55,33,3,5,99,1]
*** Exception: "your list" (line 1, column 9):
unexpected end of input

Note the awkward line/column format in the error message

Of course one could write a function sanitiseSourcePos :: SourcePos -> MyListPosition

like image 185
Hans Lub Avatar answered Jan 18 '23 22:01

Hans Lub