Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Attoparsec hangs

I'm currently solving AOC 4th task where there is following input format, a line of numbers separated by a comma and then 5x5 matrices:

27,14,70,7,85,66,65

31 23 52 26  8
27 89 37 80 46
97 19 63 34 79
13 59 45 12 73
42 25 22  6 39

27 71 24  3  0
79 42 32 72 62
99 52 11 92 33
38 22 16 44 39
35 26 76 49 58

27 71 24 ...

I have the following parser:

{-# LANGUAGE TupleSections #-}

module Lib4Parse (parseAll) where

import Control.Applicative
import Control.Monad
import Data.Attoparsec.ByteString.Char8 (Parser, Result, char, count, decimal, eitherResult, endOfInput, endOfLine, isDigit, isEndOfLine, isSpace, many', many1, option, parse, parseOnly, parseTest, scientific, sepBy, signed, skipMany, skipMany1, skipSpace, skipWhile, space, takeTill)
import qualified Data.ByteString as B

newtype BoardEntry = MkBoardEntry (Bool, Int) deriving (Show, Eq)

newtype Board = MkBoard [[BoardEntry]] deriving (Show, Eq)

parseAll :: B.ByteString -> Maybe ([Int], [Board])
parseAll input =
  either (const Nothing) Just $
    parseOnly ((,) <$> inputNumbers <*> boards) input

inputNumbers :: Parser [Int]
inputNumbers = decimal `sepBy` char ','

boardEntry :: Parser [BoardEntry]
boardEntry = map (\x -> MkBoardEntry (False, x)) <$> (decimal `sepBy` many1 (char ' '))

board :: Parser Board
board =
  MkBoard <$> count 5 (boardEntry <* (skipWhile isSpace <|> endOfInput))

boards :: Parser [Board]
boards = takeTill isDigit >> many board

The problem is that when I use many in the boards function it hangs (when calling the parseAll function). But, in case the boards function looks like this it works:

boards :: Parser [Board]
boards = takeTill isDigit >> count 5 board

However, I want to parse all boards not just specified number of them.

Questions

What is the problem that this hangs, how could this be solved?

How to debug such parsers? - I'm unable to debug this since I don't know what the parser is doing. Any tips and tricks for debugging such parsers?

like image 932
xbalaj Avatar asked Sep 15 '25 06:09

xbalaj


1 Answers

As per @Carl's comment, the problem is that board can match empty input, so many board tries to parse infinitely many empty boards when it hits the end of input. Specifically, a board matches five boardEntrys, but boardEntry can match zero numbers (i.e., an empty string). You could either require that it match five numbers, if all rows should be that long, or you can upgrade your sepBy to sepBy1 to ensure at least one number is matched:

boardEntry :: Parser [BoardEntry]
boardEntry = map (\x -> MkBoardEntry (False, x)) <$> (decimal `sepBy1` many1 (char ' '))

My usual approach for debugging parsers is to run small pieces in isolation:

> :set -XOverloadedStrings
> parseOnly boardEntry "1 2 3 4 5\n"
Right [MkBoardEntry (False,1),...]

In this case, that didn't help that much. I happened to spot the issue in boardEntry before I got too far.

like image 155
K. A. Buhr Avatar answered Sep 17 '25 07:09

K. A. Buhr