When I use only Optional or ZeroOrMore, pyparsing seems to enter an infinite loop. The following code works, but the part marked "# Should work with pp.Optional()" should really be Optional, not OneOrMore. Should I add some sort of stopOn in this case?
The syntax is shown below, where [expr] means an optional expr (Optional) and [expr]... means an optional expr that can repeat (ZeroOrMore):
[PINS numPins ;
[ - pinName + NET netName
[+ SPECIAL]
[+ DIRECTION {INPUT | OUTPUT | INOUT | FEEDTHRU}]
[+ NETEXPR "netExprPropName defaultNetName"]
[+ SUPPLYSENSITIVITY powerPinName]
[+ GROUNDSENSITIVITY groundPinName]
[+ USE {SIGNAL | POWER | GROUND | CLOCK | TIEOFF | ANALOG | SCAN | RESET}]
[+ ANTENNAPINPARTIALMETALAREA value [LAYER layerName]] ...
[+ ANTENNAPINPARTIALMETALSIDEAREA value [LAYER layerName]] ...
[+ ANTENNAPINPARTIALCUTAREA value [LAYER layerName]] ...
[+ ANTENNAPINDIFFAREA value [LAYER layerName]] ...
[+ ANTENNAMODEL {OXIDE1 | OXIDE2 | OXIDE3 | OXIDE4}] ...
[+ ANTENNAPINGATEAREA value [LAYER layerName]] ...
[+ ANTENNAPINMAXAREACAR value LAYER layerName] ...
[+ ANTENNAPINMAXSIDEAREACAR value LAYER layerName] ...
[+ ANTENNAPINMAXCUTCAR value LAYER layerName] ...
[ # The code shows only this section
[+ PORT]
[+ LAYER layerName
[MASK maskNum]
[SPACING minSpacing | DESIGNRULEWIDTH effectiveWidth] pt pt
|+ POLYGON layerName
[MASK maskNum]
[SPACING minSpacing | DESIGNRULEWIDTH effectiveWidth] pt pt pt ...
|+ VIA viaName
[MASK viaMaskNum] pt
] ...
[+ COVER pt orient | FIXED pt orient | PLACED pt orient] # This must be Optional
]...
; ] ...
END PINS]
And this is the parser (only the PLACEMENT_PINS part is shown):
# PLACEMENT_PINS
PORT = (ws_pin
+ pp.Keyword('PORT')('PORT')
)
MASK = pp.Group(pp.Keyword('MASK')
+ number('maskNum')
).setResultsName('MASK')
SPACING = pp.Group(pp.Keyword('SPACING')
+ number('minSpacing')
).setResultsName('SPACING')
DESIGNRULEWIDTH = pp.Group(pp.Keyword('DESIGNRULEWIDTH')
+ number('effectiveWidth')
).setResultsName('DESIGNRULEWIDTH')
LAYER = pp.Group(ws_pin
+ pp.Suppress(pp.Keyword('LAYER')) + identifier('layerName')
+ pp.Optional(MASK)
+ pp.Optional(SPACING | DESIGNRULEWIDTH)
+ pp.OneOrMore(pp.Group(pt))('coord')
).setResultsName('LAYER')
POLYGON = pp.Group(ws_pin
+ pp.Suppress(pp.Keyword('POLYGON')) + identifier('layerName')
+ pp.Optional(MASK)
+ pp.Optional(SPACING | DESIGNRULEWIDTH)
+ pp.OneOrMore(pp.Group(pt))('coord')
).setResultsName('POLYGON')
VIA = pp.Group(ws_pin
+ pp.Suppress(pp.Keyword('VIA')) + identifier('viaName')
+ pp.Optional(MASK)
+ pp.Group(pt)('coord')
).setResultsName('VIA')
COVER = pp.Group(ws_pin
+ pp.Keyword('COVER')
+ pp.Group(pt)('coord')
+ ORIENT('orient')
).setResultsName('COVER')
FIXED = pp.Group(ws_pin
+ pp.Keyword('FIXED')
+ pp.Group(pt)('coord')
+ ORIENT('orient')
).setResultsName('FIXED')
PLACED = pp.Group(ws_pin
+ pp.Keyword('PLACED')
+ pp.Group(pt)('coord')
+ ORIENT('orient')
).setResultsName('PLACED')
PLACEMENT_PINS = pp.Group(pp.Optional(PORT)
+ pp.ZeroOrMore(LAYER | POLYGON | VIA)
+ pp.OneOrMore(COVER | FIXED | PLACED) # Should work with pp.Optional(), but it doesn't.
)
pin = pp.Group(pp.Suppress(begin_pin)
+ pinName
+ pp.Optional(SPECIAL)
+ pp.Optional(DIRECTION)
+ pp.Optional(NETEXPR)
+ pp.Optional(SUPPLYSENSITIVITY)
+ pp.Optional(GROUNDSENSITIVITY)
+ pp.Optional(USE)
+ pp.ZeroOrMore(ANTENNAPINPARTIALMETALAREA)
+ pp.ZeroOrMore(ANTENNAPINPARTIALMETALSIDEAREA)
+ pp.ZeroOrMore(ANTENNAPINPARTIALCUTAREA)
+ pp.ZeroOrMore(ANTENNAPINDIFFAREA)
+ pp.ZeroOrMore(ANTENNAMODEL)
+ pp.ZeroOrMore(ANTENNAPINGATEAREA)
+ pp.ZeroOrMore(ANTENNAPINMAXAREACAR)
+ pp.ZeroOrMore(ANTENNAPINMAXSIDEAREACAR)
+ pp.ZeroOrMore(ANTENNAPINMAXCUTCAR)
+ pp.ZeroOrMore(PLACEMENT_PINS).setResultsName('PLACEMENT')
+ pp.Suppress(linebreak)
).setResultsName('pin', listAllMatches=True)
pins = pp.Group(pp.Suppress(pins_id) + number('numPins') + pp.Suppress(linebreak)
+ pp.ZeroOrMore(pin)
+ pp.Suppress(end_pins_id)
).setResultsName('PINS')
And here is an example of the text to be parsed:
PINS 165 ;
- clk + NET clk + DIRECTION INPUT + USE SIGNAL
+ LAYER M2 ( -25 0 ) ( 25 220 )
+ PLACED ( 0 81500 ) E ;
- rst + NET rst + DIRECTION INPUT + USE SIGNAL
+ LAYER M5 ( -25 0 ) ( 25 220 )
+ PLACED ( 96300 140000 ) S ;
- im_rsc_CSN + NET im_rsc_CSN + DIRECTION OUTPUT + USE SIGNAL
+ LAYER M3 ( -25 0 ) ( 25 220 )
+ PLACED ( 80300 140000 ) S ;
END PINS
In this example, if the "+ PLACED" lines are removed, parsing fails, because the grammar uses "pp.OneOrMore(COVER | FIXED | PLACED)" rather than "pp.Optional(COVER | FIXED | PLACED)".
Another section to be parsed is UNITS. All of its expressions are optional; for example, the file may or may not contain "TIME NANOSECONDS 1000".
[UNITS
[TIME NANOSECONDS convertFactor ;]
[CAPACITANCE PICOFARADS convertFactor ;]
[RESISTANCE OHMS convertFactor ;]
[POWER MILLIWATTS convertFactor ;]
[CURRENT MILLIAMPS convertFactor ;]
[VOLTAGE VOLTS convertFactor ;]
[DATABASE MICRONS LEFconvertFactor ;]
[FREQUENCY MEGAHERTZ convertFactor ;]
END UNITS]
Here is the parser that hangs because all expressions are optional:
# DATABASE_MICRONS
DATABASE_MICRONS = (pp.Keyword('DATABASE MICRONS')
+ number('convertFactor')
+ linebreak
)
unit = pp.Group(pp.Optional(TIME_NANOSECONDS)
+ pp.Optional(CAPACITANCE_PICOFARADS)
+ pp.Optional(RESISTANCE_OHMS)
+ pp.Optional(POWER_MILLIWATTS)
+ pp.Optional(CURRENT_MILLIAMPS)
+ pp.Optional(VOLTAGE_VOLTS)
+ pp.Optional(DATABASE_MICRONS)
+ pp.Optional(FREQUENCY_MEGAHERTZ)
).setResultsName('unit', listAllMatches=True)
units = pp.Group(pp.Suppress(units_id)
+ pp.OneOrMore(unit)
+ pp.Suppress(end_units_id)
).setResultsName('UNITS')
However, if I replace one of the lines, for example "+ pp.Optional(DATABASE_MICRONS)", with "+ pp.OneOrMore(DATABASE_MICRONS)" (so the file must now contain this expression), then it works.
Example of UNITS section:
UNITS
DATABASE MICRONS 1000 ;
END UNITS
So, how should I deal with grammars in which all expressions are optional?
If all the elements in PLACEMENT_PINS are optional, then it will match the empty string. Matching ZeroOrMore of an expression that can match the empty string will loop forever, because every repetition succeeds without consuming any input.
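One possible way to keep PLACEMENT_PINS from matching the empty string is to spell out alternatives that each begin with a mandatory element, so the trailing COVER/FIXED/PLACED part can stay Optional. The sketch below is only an illustration under simplifying assumptions: `plus`, `integer`, `pt`, `identifier` and `orient` are stand-ins for the question's helpers, only LAYER and PLACED are modelled, and the orientation set is reduced to N/S/E/W.

```python
import pyparsing as pp

# Simplified stand-ins for the question's helpers (assumptions, not the original definitions).
plus = pp.Suppress('+')
integer = pp.Regex(r'-?\d+').setParseAction(lambda t: int(t[0]))
pt = pp.Group(pp.Suppress('(') + integer + integer + pp.Suppress(')'))
identifier = pp.Word(pp.alphas, pp.alphanums + '_')
orient = pp.oneOf('N S E W')

PORT = plus + pp.Keyword('PORT')
LAYER = pp.Group(plus + pp.Keyword('LAYER') + identifier + pp.OneOrMore(pt))
PLACED = pp.Group(plus + pp.Keyword('PLACED') + pt + orient)

# Every alternative starts with a mandatory element, so the group
# cannot succeed without consuming input.
PLACEMENT_PINS = pp.Group(
    (PORT + pp.ZeroOrMore(LAYER) + pp.Optional(PLACED))
    | (pp.OneOrMore(LAYER) + pp.Optional(PLACED))
    | PLACED
)

# Repeating a group that always consumes something is safe.
placements = pp.ZeroOrMore(PLACEMENT_PINS)

with_placed = placements.parseString(
    '+ LAYER M2 ( -25 0 ) ( 25 220 ) + PLACED ( 0 81500 ) E', parseAll=True)
without_placed = placements.parseString(
    '+ LAYER M2 ( -25 0 ) ( 25 220 )', parseAll=True)

print(with_placed.asList())
print(without_placed.asList())
```

The same idea should carry over to the full grammar by replacing `LAYER` with `(LAYER | POLYGON | VIA)` and `PLACED` with `(COVER | FIXED | PLACED)`.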
Are all the ZeroOrMore's there because you don't know what the order will be? If so, consider using the '&' operator instead of '+'. a_expr & b_expr & c_expr will match the three expressions, but in any order.
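A small sketch of the '&' operator, using three placeholder keywords (`A`, `B`, `C` are made up for the illustration): all three must be present, but the order is free.

```python
import pyparsing as pp

a_expr = pp.Keyword('A')
b_expr = pp.Keyword('B')
c_expr = pp.Keyword('C')

# '&' builds a pp.Each: every operand must match, in any order.
any_order = a_expr & b_expr & c_expr

results = {
    text: any_order.parseString(text, parseAll=True).asList()
    for text in ('A B C', 'C A B', 'B C A')
}
for text, tokens in results.items():
    print(text, '->', tokens)

# A missing operand is an error, because none of them is Optional here.
try:
    any_order.parseString('A B', parseAll=True)
    missing_is_error = False
except pp.ParseException:
    missing_is_error = True
print(missing_is_error)
```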
EDIT:
I understand that they are all optional, but because you have lumped them together into their own unit expression with everything Optional (and so matchable to the empty string) and are then OneOrMore-ing them, this is another endless loop.
When you say "they are all optional", I understand that they are all optional from the standpoint of defining a UNITS section. But the OneOrMore in units is already taking care of repetition. If an empty UNITS section is valid, then use ZeroOrMore.
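To make the endless loop concrete, here is a toy model in plain Python. It is not pyparsing's implementation; `required`, `optional`, `repeat` and the `cap` argument are hypothetical names invented for the illustration. A repetition loop that stops only when its inner matcher fails can never stop if that matcher is able to succeed on nothing.

```python
def required(word):
    """Matcher that fails (returns None) when `word` is not found at `pos`."""
    def match(text, pos):
        return pos + len(word) if text.startswith(word, pos) else None
    return match


def optional(word):
    """Matcher that never fails: without `word` at `pos` it succeeds and consumes nothing."""
    def match(text, pos):
        return pos + len(word) if text.startswith(word, pos) else pos
    return match


def repeat(matcher, text, pos=0, cap=1000):
    """Naive repetition: loop until the matcher fails.

    `cap` exists only to keep this demo finite.
    """
    rounds = 0
    while rounds < cap:
        new_pos = matcher(text, pos)
        if new_pos is None:
            break
        pos, rounds = new_pos, rounds + 1
    return pos, rounds


text = 'A;A;END'
stops = repeat(required('A;'), text)   # ends when 'A;' no longer matches
spins = repeat(optional('A;'), text)   # only the cap ends this loop
print(stops, spins)
```

Both calls end at the same position, but the second one keeps "succeeding" at that position until the cap is reached, which is the behaviour that shows up as a hang in the real parser.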
These look like "unit phrases" to me: each is some multi-word qualifier on units, any or all of which might be present, in any number.
Instead of adding them all as Optionals, define them as a single MatchFirst: "a unit phrase is one of the specific phrases".
The outer OneOrMore will take care of the repetition and optionality:
unit_phrase = pp.Group(TIME_NANOSECONDS
| CAPACITANCE_PICOFARADS
| RESISTANCE_OHMS
| POWER_MILLIWATTS
| CURRENT_MILLIAMPS
| VOLTAGE_VOLTS
| DATABASE_MICRONS
| FREQUENCY_MEGAHERTZ)
units = pp.Group(pp.Suppress(units_id)
+ pp.OneOrMore(unit_phrase)('unit')
+ pp.Suppress(end_units_id)
).setResultsName('UNITS')
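For reference, a self-contained version of the MatchFirst approach that can be run as is. The definitions of `number`, `linebreak`, `units_id` and `end_units_id` are assumptions standing in for the question's own, `unit_statement` is a hypothetical helper, and only three of the eight statements are spelled out.

```python
import pyparsing as pp

# Assumed stand-ins for the question's number, linebreak, units_id and end_units_id.
number = pp.Word(pp.nums).setParseAction(lambda t: int(t[0]))
linebreak = pp.Suppress(';')
units_id = pp.Keyword('UNITS')
end_units_id = pp.Keyword('END') + pp.Keyword('UNITS')


def unit_statement(quantity, unit_name):
    """Hypothetical helper: '<quantity> <unit_name> <convertFactor> ;'."""
    return (pp.Keyword(quantity) + pp.Keyword(unit_name)
            + number('convertFactor') + linebreak)


# A unit phrase is exactly one of the known statements; it never matches empty.
unit_phrase = pp.Group(unit_statement('TIME', 'NANOSECONDS')
                       | unit_statement('DATABASE', 'MICRONS')
                       | unit_statement('FREQUENCY', 'MEGAHERTZ'))

units = pp.Group(pp.Suppress(units_id)
                 + pp.OneOrMore(unit_phrase)('unit')
                 + pp.Suppress(end_units_id)
                 ).setResultsName('UNITS')

sample = """
UNITS
  DATABASE MICRONS 1000 ;
END UNITS
"""
parsed = units.parseString(sample, parseAll=True)
print(parsed.asList())
```

With OneOrMore, an empty "UNITS END UNITS" section is rejected; switching to ZeroOrMore would accept it.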
If in fact these can all be optional but must occur only once, then defining an Each of Optionals is what you want, with no repetition:
unit = pp.Group(pp.Optional(TIME_NANOSECONDS)
& pp.Optional(CAPACITANCE_PICOFARADS)
& pp.Optional(RESISTANCE_OHMS)
& pp.Optional(POWER_MILLIWATTS)
& pp.Optional(CURRENT_MILLIAMPS)
& pp.Optional(VOLTAGE_VOLTS)
& pp.Optional(DATABASE_MICRONS)
& pp.Optional(FREQUENCY_MEGAHERTZ)
)
units = pp.Group(pp.Suppress(units_id)
+ unit.setResultsName('unit') # <-- no OneOrMore repetition now, let Each do the orderless matching
+ pp.Suppress(end_units_id)
).setResultsName('UNITS')
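A runnable sketch of the Each-of-Optionals variant follows, again with assumed stand-ins for `number` and `linebreak`, a hypothetical `unit_statement` helper, a hypothetical `accepts` checker, and only three statements. It is meant to show that any order is accepted, that an empty section is accepted, and that a repeated statement is not.

```python
import pyparsing as pp

# Assumed stand-ins for the question's number and linebreak.
number = pp.Word(pp.nums).setParseAction(lambda t: int(t[0]))
linebreak = pp.Suppress(';')


def unit_statement(quantity, unit_name):
    """Hypothetical helper: one grouped '<quantity> <unit_name> <convertFactor> ;'."""
    return pp.Group(pp.Keyword(quantity) + pp.Keyword(unit_name)
                    + number('convertFactor') + linebreak)


# Each ('&') of Optionals: any subset, any order, each statement at most once.
unit = pp.Group(pp.Optional(unit_statement('TIME', 'NANOSECONDS'))
                & pp.Optional(unit_statement('DATABASE', 'MICRONS'))
                & pp.Optional(unit_statement('FREQUENCY', 'MEGAHERTZ')))

units = pp.Group(pp.Suppress(pp.Keyword('UNITS'))
                 + unit.setResultsName('unit')  # no repetition wrapper around the Each
                 + pp.Suppress(pp.Keyword('END') + pp.Keyword('UNITS'))
                 ).setResultsName('UNITS')


def accepts(text):
    """Return True when `text` parses completely as a UNITS section."""
    try:
        units.parseString(text, parseAll=True)
        return True
    except pp.ParseException:
        return False


empty_section = accepts('UNITS END UNITS')
reordered = accepts('UNITS DATABASE MICRONS 1000 ; TIME NANOSECONDS 100 ; END UNITS')
duplicated = accepts('UNITS DATABASE MICRONS 1000 ; DATABASE MICRONS 2000 ; END UNITS')
print(empty_section, reordered, duplicated)
```

Because the Each is matched exactly once, the all-optional expression is harmless here: nothing repeats it, so there is no loop to get stuck in.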