I have recently started learning Haskell and have been trying my hand at Parsec. However, for the past couple of days I have been stuck with a problem that I have been unable to find the solution to. So what I am trying to do is write a parser that can parse a string like this:
<"apple", "pear", "pineapple", "orange">
The code that I wrote to do that is:
collection :: Parser [String]
collection = (char '<') *> (string `sepBy` char ',')) <* (char '>')
string :: Parser String
string = char '"' *> (many (noneOf ['\"', '\r', '\n', '"'])) <* char '"'
This works fine for me as it is able to parse the string that I have defined above. Nevertheless, I would now like to enforce the rule that every element in this collection must be unique and that is where I am having trouble. One of the first results I found when searching on the internet was this one, which suggest the usage of the nub
function. Although the problem stated in that question is not the same, it would in theory solve my problem. But what I don't understand is how I can apply this function within a Parser. I have tried adding the nub function to several parts of the code above without any success. Later I also tried doing it the following way:
collection :: Parser [String]
collection = do
char '<'
value <- (string `sepBy` char ','))
char '>'
return nub value
But this does not work as the type does not match what nub
is expecting, which I believe is one of the problems I am struggling with. I am also not entirely sure whether nub
is the right way to go. My fear is that I am going in the wrong direction and that I won't be able to solve my problem like this. Is there perhaps something I am missing? Any advice or help anyone could provide would be greatly appreciated.
The Parsec Parser
type is an instance of MonadPlus
which means that we can always fail (ie cause a parse error) whenever we want. A handy function for this is guard
:
guard :: MonadPlus m => Bool -> m ()
This function takes a boolean. If it's true, it return ()
and the whole computation (a parse in this case) does not fail. If it's false, the whole thing fails.
So, as long as you don't care about efficiency, here's a reasonable approach: parse the whole list, check for whether all the elements are unique and fail if they aren't.
To do this, the first thing we have to do is write a predicate that checks if every element of a list is unique. nub
does not quite do the right thing: it return a list with all the duplicates taken out. But if we don't care much about performance, we can use it to check:
allUnique ls = length (nub ls) == length ls
With this predicate in hand, we can write a function unique
that wraps any parser that produces a list and ensures that list is unique:
unique parser = do res <- parser
guard (allUnique res)
return res
Again, if guard
is give True
, it doesn't affect the rest of the parse. But if it's given False
, it will cause an error.
Here's how we could use it:
λ> parse (unique collection) "<interactive>" "<\"apple\",\"pear\",\"pineapple\",\"orange\">"
Right ["apple","pear","pineapple","orange"]
λ> parse (unique collection) "<interactive>" "<\"apple\",\"pear\",\"pineapple\",\"orange\",\"apple\">"
Left "<interactive>" (line 1, column 46):unknown parse error
This does what you want. However, there's a problem: there is no error message supplied. That's not very user friendly! Happily, we can fix this using <?>
. This is an operator provided by Parsec that lets us set the error message of a parser.
unique parser = do res <- parser
guard (allUnique res) <?> "unique elements"
return res
Ahhh, much better:
λ> parse (unique collection) "<interactive>" "<\"apple\",\"pear\",\"pineapple\",\"orange\",\"apple\">"
Left "<interactive>" (line 1, column 46):
expecting unique elements
All this works but, again, it's worth noting that it isn't efficient. It parses the whole list before realizing elements aren't unique, and nub
takes quadratic time. However, this works and it's probably more than good enough for parsing small to medium-sized files: ie most things written by hand rather than autogenerated.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With