I have some code here that works for parsing URI paths into a list of Strings. For example, /user/home would become ["user", "home"].

pathPiece :: Parser String
pathPiece = do
    char '/'
    path <- many1 urlBaseChar
    return path

uriPath :: Parser [String]
uriPath = do
    pieces <- many pathPiece
    try $ char '/'
    return pieces

parseUriPath :: String -> [String]
parseUriPath input = case parse uriPath "(unknown)" input of
    Left _   -> []
    Right xs -> xs

However, if the path ends with another /, such as /user/home/, which should be a legitimate path, the parser fails. This is because pathPiece consumes the last / and then fails, since no urlBaseChars follow it. How do I parse with many until it fails and, when it fails, undo the character consumption?
Try this:

pathPiece :: Parser String
pathPiece = try $ do
    char '/'
    many1 urlBaseChar

uriPath :: Parser [String]
uriPath = do
    pieces <- many pathPiece
    optional (char '/')
    return pieces

You need to add a try to pathPiece. Otherwise, parsing the final / will make Parsec
think that a new pathPiece has started, and without try, there's no backtracking. Also,
unless you actually want to require a final /, you need to make it optional. The
function try does not do that.
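
For anyone who wants to try this out, here is a self-contained sketch of the version above. The question never shows urlBaseChar, so the definition below is only an assumed stand-in (letters, digits, and a few punctuation characters); replace it with your own.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumption: urlBaseChar is not shown in the question. This stand-in
-- accepts letters, digits, and the characters '-', '.', '_' and '~'.
urlBaseChar :: Parser Char
urlBaseChar = alphaNum <|> oneOf "-._~"

-- try rewinds the input if the '/' is consumed but no characters follow,
-- so many can stop cleanly instead of failing.
pathPiece :: Parser String
pathPiece = try $ do
    _ <- char '/'
    many1 urlBaseChar

-- The trailing '/' is optional, so both "/user/home" and "/user/home/" parse.
uriPath :: Parser [String]
uriPath = do
    pieces <- many pathPiece
    optional (char '/')
    return pieces

parseUriPath :: String -> [String]
parseUriPath input = case parse uriPath "(unknown)" input of
    Left _   -> []
    Right xs -> xs

main :: IO ()
main = do
    print (parseUriPath "/user/home")   -- ["user","home"]
    print (parseUriPath "/user/home/")  -- ["user","home"]
```

Note that uriPath does not end with eof, so any input left after the last piece is silently ignored; add eof if you want to reject it.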
I think you can use many1 urlBaseChar `sepEndBy` char '/' here. See sepEndBy in Parsec.
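
A rough sketch of how that could look. The names uriPathSep and parseUriPathSep are made up for this example, and urlBaseChar is again an assumed stand-in because the question does not show it. Since sepEndBy only deals with separators between and after the pieces, the leading / is consumed separately here, which means this variant requires the path to start with /.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumption: stand-in for the urlBaseChar that the question does not show.
urlBaseChar :: Parser Char
urlBaseChar = alphaNum <|> oneOf "-._~"

-- Hypothetical name. Consume the leading '/', then parse pieces separated
-- by '/' and optionally ended by one.
uriPathSep :: Parser [String]
uriPathSep = char '/' *> (many1 urlBaseChar `sepEndBy` char '/')

-- Hypothetical wrapper mirroring parseUriPath from the question.
parseUriPathSep :: String -> [String]
parseUriPathSep input = case parse uriPathSep "(unknown)" input of
    Left _   -> []
    Right xs -> xs

main :: IO ()
main = do
    print (parseUriPathSep "/user/home")   -- ["user","home"]
    print (parseUriPathSep "/user/home/")  -- ["user","home"]
```

No try is needed in this version because many1 urlBaseChar fails without consuming any input when nothing follows the final /.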