Is there an existing POSIX sh grammar available or do I have to figure it out from the specification directly?
Note I'm not so much interested in a pure sh; an extended but conformant sh is also more than fine for my purposes.
The POSIX shell is a command-line shell for computer operating systems, specified by the IEEE Computer Society. POSIX stands for Portable Operating System Interface. The shell is based on the standard defined in Portable Operating System Interface (POSIX), IEEE Std 1003.
Some popular shells are largely POSIX-compliant (Bash, Korn shell), but even they offer additional non-POSIX features that will not always work in other shells. The command test expression is equivalent to [ expression ] (the spaces inside the brackets are required). In fact, many sources recommend using the brackets for better readability.
Although most commands behave the same as in sh, Bash is not strictly POSIX-compliant by default. It is a dialect of the POSIX shell language with its own extensions. Bash runs in a text window and interprets the commands the user types to carry out various tasks.
Arrays are not POSIX, except for the arguments array ($@ and $*), which is. Taking subset arrays from $@ and $* is not POSIX either. Tip: use set -- to repurpose the arguments array. Writing for various versions of Bash, though, is quite doable.
I've made multiple attempts at writing my own full-blown Bash interpreter over the past year, and at some point I also found the same book appendix referenced in the marked answer (#2). However, it's not completely correct or up to date: for example, it doesn't define production rules for the 'coproc' reserved word, and it has a duplicate production rule for a redirection using '<&'. There might be more problems, but those are the ones I've noticed.
http://ftp.gnu.org/gnu/bash/
The regex I used was:
(\{(\s+.*?)+\})\s+([;|])
It non-greedily matches any text (.*?), including spaces and newlines (\s+), between curly braces, up to the last closing brace before a ; or | character. I then replaced each match with \3 (that is, the third capturing group, which is either ; or |).
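The substitution described above can be sketched in Python with the standard re module. This is only an illustration: the sample grammar fragment, the make_pipe name inside it, and the strip_actions helper are made up for the example, and the input is assumed to be a Yacc file whose C actions sit between a rule body and the following ; or |. Note that the nested quantifier in this regex may backtrack heavily on large inputs that contain unmatched braces, so treat it as a one-off extraction aid rather than a robust parser.

```python
import re

# The regex from the post, compiled unchanged. Without re.DOTALL, "." stops at
# newlines, so line breaks inside an action are consumed by the \s+ parts.
ACTION_RE = re.compile(r"(\{(\s+.*?)+\})\s+([;|])")

# Hypothetical Yacc fragment, invented for this example.
SAMPLE = """\
pipeline: pipeline '|' command
        { $$ = make_pipe($1, $3); }
    | command
        { $$ = $1; }
    ;
"""


def strip_actions(yacc_text):
    """Replace each '{ action }' plus the following ; or | with the bare ; or |."""
    # \3 is the third capturing group: the ; or | that follows the action.
    return ACTION_RE.sub(r"\3", yacc_text)


if __name__ == "__main__":
    print(strip_actions(SAMPLE))
```

The regex requires whitespace after the opening brace and between the closing brace and the separator, so actions formatted differently would be left in place and need a manual pass.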
Here's the grammar definition that I managed to extract at the time of posting: https://pastebin.com/qpsK4TF6
The POSIX standard defines the grammar for the POSIX shell. The definition includes an annotated Yacc grammar. As such, it can be converted to EBNF more or less mechanically.
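To give a rough idea of what "more or less mechanically" could look like, here is a minimal Python sketch that rewrites action-free Yacc rules into an ISO-style EBNF notation (= for definition, a comma for concatenation, | for alternatives). The yacc_to_ebnf function and the sample rules are assumptions made for illustration. The sketch keeps left recursion as it is instead of turning it into repetition, and it does not handle the %token declarations or the annotations in the standard's grammar.

```python
import re

# Quoted literals come first so that '|' and ';' inside quotes stay single tokens.
TOKEN_RE = re.compile(r"'(?:\\.|[^'\\])+'|[A-Za-z_][A-Za-z0-9_]*|[:|;]")
COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)

# Illustrative rules written in the style of a Yacc grammar.
SAMPLE = """
pipeline      : pipe_sequence
              | Bang pipe_sequence
              ;
pipe_sequence : command
              | pipe_sequence '|' linebreak command
              ;
"""


def yacc_to_ebnf(rules_text):
    """Convert 'name : a b | c ;' rules into 'name = a, b | c ;' lines."""
    tokens = TOKEN_RE.findall(COMMENT_RE.sub(" ", rules_text))
    lines = []
    i = 0
    while i < len(tokens):
        name = tokens[i]
        if i + 1 >= len(tokens) or tokens[i + 1] != ":":
            raise ValueError(f"expected ':' after rule name {name!r}")
        i += 2
        alternatives = [[]]
        while i < len(tokens) and tokens[i] != ";":
            if tokens[i] == "|":
                alternatives.append([])  # start the next alternative
            else:
                alternatives[-1].append(tokens[i])
            i += 1
        i += 1  # skip the terminating ';'
        rendered = [", ".join(alt) if alt else "(* empty *)" for alt in alternatives]
        lines.append(f"{name} = " + " | ".join(rendered) + " ;")
    return "\n".join(lines)


if __name__ == "__main__":
    print(yacc_to_ebnf(SAMPLE))
```

Tokenizing before splitting on | and ; matters here because the shell grammar uses those same characters as quoted terminals.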
If you want a 'real' grammar, then you have to look harder. Choose your 'real shell' and find the source and work out what the grammar is from that.
Note that EBNF is not used widely. It is of limited practical value, not least because there are essentially no tools that support it. Therefore, you are unlikely to find an EBNF grammar (of almost anything) off-the-shelf.