At first glance, the shunting yard algorithm seems applicable to POSIX regular expression parsing, but since I don't have much experience (or theoretical background) in writing parsers, I'd like to ask SO before jumping in and writing something only to get stuck halfway.
Perhaps a more sophisticated version of the question is: What is a good formal statement of the class of problems the shunting yard algorithm can be applied to?
Clarification: This question is about whether you can parse POSIX re syntax into an abstract syntax tree using the basic principles of the shunting algorithm, not whether you can use regular expressions to implement the shunting algorithm. Sorry I wasn't clear enough stating that to begin with!
I'm fairly sure it can. If you look at Henry Spencer's regular expression package:
regexp.shar.Z
which was the basis for Perl's regular expressions, you will notice that he describes the program as being in "railroad normal form".
I reckon you'd have some problems because different characters have different meanings in different contexts e.g.
^[^a-z][asd-]
The ^
has two different meanings and so does the -
. I think I'd choose a recursive descent parser.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With