
Grammar.parse seems to loop forever and use 100% CPU

Tags: grammar, raku

Reposted from the #perl6 IRC channel, by jkramer, with permission

I'm playing with grammars and trying to parse an ini-style file but somehow Grammar.parse seems to loop forever and use 100% CPU. Any ideas what's wrong here?

grammar Format {
  token TOP {
    [
      <comment>*
      [
        <section>
        [ <line> | <comment> ]*
      ]*
    ]*
  }

  rule section {
    '[' <identifier> <subsection>? ']'
  }

  rule subsection {
    '"' <identifier> '"'
  }

  rule identifier {
    <[A..Za..z]> <[A..Za..z0..9_-]>+
  }

  rule comment {
    <[";]> .*? $$
  }

  rule line {
    <key> '=' <value>
  }

  rule key {
    <identifier>
  }

  rule value {
    .*? $$
  }
}

Format.parse('lol.conf'.IO.slurp)
jjmerelo asked Apr 12 '18 15:04



1 Answer

Token TOP applies the * quantifier to a group that can match the empty string, because both <comment> and the group that contains <section> carry a * quantifier of their own.

If the inner subregex matches the empty string, it can do so infinitely many times without advancing the cursor. Currently, Perl 6 has no protection against this kind of error.
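One way to spot this kind of problem before running the full grammar is to test whether the body of the quantified group can match an empty string. The snippet below is only a sketch: `group-body` and the single-letter literals are hypothetical stand-ins for the original subrules, and only the quantifier structure mirrors the body of the outer group in TOP.

```raku
# Hypothetical stand-ins: 'c' plays the role of <comment>, 's' of <section>,
# and 'l' of <line>. Only the quantifier structure is taken from TOP.
my regex group-body { 'c'* [ 's' [ 'l' | 'c' ]* ]* }

# Every part of the body is optional, so it can succeed without consuming
# any input. Wrapping such a body in another * is what lets the cursor stall.
my $matches-empty = so '' ~~ / ^ <&group-body> $ /;
say $matches-empty;
```

If the check reports a match on the empty string, the group should not be wrapped in another `*` or `+` as it stands.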

It looks to me like you could simplify your code to

token TOP {
  <comment>*
  [
    <section>
    [ <line> | <comment> ]*
  ]*
}

There is no need for the outer [...]* group, because the last <comment> also matches comments that appear before later sections.
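For reference, here is a sketch of how the whole grammar might look with the simplified TOP. The TOP structure follows the suggestion above; everything else is my own assumption rather than part of the original question: the helper rules are rewritten as tokens with explicit `\h*` and `\s*` whitespace, `\N*` replaces `.*? $$`, identifiers may be a single character, and the sample input is made up.

```raku
grammar Format {
    token TOP {
        \s*
        [ <comment> \s* ]*
        [
            <section> \s*
            [ [ <line> | <comment> ] \s* ]*
        ]*
    }

    # Every quantified group above consumes at least one character per
    # iteration, so no iteration can succeed on an empty match.
    token section    { '[' \h* <identifier> \h* <subsection>? \h* ']' }
    token subsection { '"' <identifier> '"' }
    token identifier { <[A..Za..z]> <[A..Za..z0..9_\-]>* }
    token comment    { <[";]> \N* }
    token line       { <key> \h* '=' \h* <value> }
    token key        { <identifier> }
    token value      { \N* }   # rest of the line, up to the newline
}

# Made-up sample input, used here instead of slurping lol.conf.
my $sample = q:to/END/;
; global comment
[core]
name = value one
; a comment
[remote "origin"]
url = example
END

my $match = Format.parse($sample);
say $match ?? 'parsed' !! 'failed';
say $match<section>.elems;
```

Tokens are used here so that whitespace handling is explicit; with `rule`, the implicit whitespace matching would also apply between the two character classes in `identifier`.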

moritz answered Nov 15 '22 09:11
