
Priority in grammar using Lark

I have a priority problem in my grammar, and I have run out of ideas for fixing it.

I'm using Lark.

Here is the problem, simplified as much as I can:

from lark import Lark

parser = Lark(r"""
    start: set | set_mul

    set_mul: [nb] set
    set: [nb] "foo"
    nb: INT "x"

   %import common.INT
   %import common.WS
   %ignore WS

   """, start='start')

input = "3xfoo"
p = parser.parse(input)
print(p.pretty())

The output is:

start
  set_mul
    set
      nb    3

But what I want is:

start
  set_mul
    nb    3
    set

I tried adding priorities to my rules, but that did not work.

Do you have any idea what I would need to change to make it work?

Thanks

Kypaz asked Apr 05 '18 11:04



2 Answers

A simple solution might be to re-write your grammar to remove the ambiguity.

from lark import Lark

parser = Lark(r"""
    start: set | set_mul

    set_mul: nb | nb set | nb nb_set
    set: "foo"
    nb_set: nb set
    nb: INT "x"

   %import common.INT
   %import common.WS
   %ignore WS

   """, start='start')

This way, each of the following inputs has only one possible interpretation:

input = "3xfoo"
p = parser.parse(input)
print(p.pretty())

input = "3x4xfoo"
p = parser.parse(input)
print(p.pretty())         

Result:

start
  set_mul
    nb  3
    set

start
  set_mul
    nb  3
    nb_set
      nb    4
      set
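
One way to see why the rewritten grammar leaves no choice is to note that the number of `nb` prefixes and the presence of "foo" together select exactly one alternative. The following is only a hand-written sketch of that idea in plain Python, not Lark's algorithm; the names `PATTERN` and `parse` and the tuple-based tree shape are assumptions made for illustration.

```python
import re

# Whole-input shape accepted by the rewritten grammar: up to two "INT x"
# prefixes followed by an optional "foo" (whitespace is ignored, as with %ignore WS).
PATTERN = re.compile(r"\s*(?:(\d+)\s*x\s*)?(?:(\d+)\s*x\s*)?(foo)?\s*")


def parse(text):
    """Return the single tree (as nested tuples) that the rewritten grammar allows."""
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    first, second, foo = match.groups()

    if first is None and foo:                      # start: set
        return ("start", ("set",))
    if first and second is None and not foo:       # set_mul: nb
        return ("start", ("set_mul", ("nb", first)))
    if first and second is None and foo:           # set_mul: nb set
        return ("start", ("set_mul", ("nb", first), ("set",)))
    if first and second and foo:                   # set_mul: nb nb_set
        return ("start", ("set_mul", ("nb", first),
                          ("nb_set", ("nb", second), ("set",))))
    # Anything else (empty input, or two prefixes without "foo") is not in the language.
    raise ValueError(f"cannot parse {text!r}")


print(parse("3xfoo"))
print(parse("3x4xfoo"))
```

Each branch corresponds to one alternative of the grammar, and no input satisfies two branches at once, which is what removes the ambiguity.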
Erez answered Sep 21 '22 19:09



This is not a full answer, but I hope it gets you part of the way. Your grammar is ambiguous, and the example you use hits that ambiguity head-on. Lark chooses to disambiguate for you, which gives the result you see.
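
To make the ambiguity concrete without relying on Lark, here is a small hand-written enumerator for the question's grammar. It is only a sketch in plain Python: the function names, the token format, and the tuple-based trees are assumptions for illustration and have nothing to do with how Lark works internally.

```python
import re

# One token is either INT "x" (an nb) or the literal "foo"; whitespace is skipped.
TOKEN = re.compile(r"\s*(?:(?P<nb>\d+)\s*x|(?P<foo>foo))\s*")


def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(("nb", match.group("nb")) if match.group("nb") else ("foo",))
        pos = match.end()
    return tokens


def parse_set(tokens, i):
    """set: [nb] "foo"  -> every (tree, next_index) that matches at index i."""
    results = []
    if i < len(tokens) and tokens[i] == ("foo",):              # optional nb absent
        results.append((("set",), i + 1))
    if (i + 1 < len(tokens) and tokens[i][0] == "nb"
            and tokens[i + 1] == ("foo",)):                    # optional nb present
        results.append((("set", tokens[i]), i + 2))
    return results


def parse_set_mul(tokens, i):
    """set_mul: [nb] set"""
    results = [(("set_mul", sub), j) for sub, j in parse_set(tokens, i)]
    if i < len(tokens) and tokens[i][0] == "nb":
        results += [(("set_mul", tokens[i], sub), j)
                    for sub, j in parse_set(tokens, i + 1)]
    return results


def parse_start(tokens):
    """start: set | set_mul, keeping only derivations that consume every token."""
    candidates = parse_set(tokens, 0) + parse_set_mul(tokens, 0)
    return [("start", tree) for tree, j in candidates if j == len(tokens)]


trees = parse_start(tokenize("3xfoo"))
for tree in trees:
    print(tree)
```

For "3xfoo" this enumerates three distinct derivations: the `nb` can belong to `set` directly under `start`, to `set` inside `set_mul`, or to `set_mul` itself. Only the last one is the tree the question asks for.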

You can stop Lark from disambiguating by adding ambiguity='explicit':

import lark

parser = lark.Lark(r"""
    start: set | set_mul

    set_mul: [nb] set
    set: [nb] "foo"
    nb: INT "x"

   %import common.INT
   %import common.WS
   %ignore WS

   """, start='start',ambiguity='explicit')

input = "3xfoo"
p = parser.parse(input)
print(p.pretty())

and you get this output, which includes the tree you want:

_ambig
  start
    set
      nb        3
  start
    set_mul
      set
        nb      3
  start
    set_mul
      nb        3
      set

How can you encourage Lark to disambiguate to your preferred output? Good question.

DisappointedByUnaccountableMod answered Sep 24 '22 19:09
