I'm searching for some widely extended dialect (like this one https://github.com/vmeurisse/wildmatch + globstar **) described with BFN rules.
In any format or language. OMeta or PEG would be great.
BNF Grammars The input language for sh61command lines is described in terms of a BNF grammar, where BNF stands for Backus–Naur Form, or Backus Normal Form. BNF is a declarative notation for describing a language, which in this context just means a set of strings.
BNF is a declarative notation for describing a language, which in this context just means a set of strings. BNF notation consists of three pieces: Terminals, such as "x", are strings of characters that must exactly match characters in the input.
** is a recursive wildcard which can only be used with paths, not file names. Multiple recursive expressions within the path are not supported. You can have up to 32 nested symbolic links within a path expression.
It’s useful to be able to read BNF grammars because they’re very common in computer science and programming. Many specification documents use BNF to define languages, including programming languages, configuration files, and even network packet formats. These exercises show features of BNF grammars using simple examples.
I'm not sure to understand your question since the grammar for file path wildcard can be reduced to a simple regular expression. This grammar is defined by the Unix Shell.
You can find the BNF for Bash here: http://my.safaribooksonline.com/book/operating-systems-and-server-administration/unix/1565923472/syntax/lbs.appd.div.3
In Python programming language, a definition of the glob.glob()
function is available in the documentation. This function use the fnmatch.fnmatch()
function to perform the pattern matching. The documentation is available here: https://docs.python.org/2/library/fnmatch.html#fnmatch.fnmatch.
The fnmatch.fnmatch
function translate a file path wildcard pattern to a classic regular expression, like this:
def translate(pat):
"""Translate a shell PATTERN to a regular expression.
There is no way to quote meta-characters.
"""
i, n = 0, len(pat)
res = ''
while i < n:
c = pat[i]
i = i+1
if c == '*':
res = res + '.*'
elif c == '?':
res = res + '.'
elif c == '[':
j = i
if j < n and pat[j] == '!':
j = j+1
if j < n and pat[j] == ']':
j = j+1
while j < n and pat[j] != ']':
j = j+1
if j >= n:
res = res + '\\['
else:
stuff = pat[i:j].replace('\\','\\\\')
i = j+1
if stuff[0] == '!':
stuff = '^' + stuff[1:]
elif stuff[0] == '^':
stuff = '\\' + stuff
res = '%s[%s]' % (res, stuff)
else:
res = res + re.escape(c)
return res + '\Z(?ms)'
That can help you to write de BNF grammar...
EDIT
Here is a very simple grammar:
wildcard : expr
| expr wildcard
expr : WORD
| ASTERIX
| QUESTION
| neg_bracket_expr
| pos_bracket_expr
pos_bracket_expr : LBRACKET WORD RBRACKET
neg_bracket_expr : LBRACKET EXCLAMATION WORD RBRACKET
A list of popular grammars parsed by the famous ANTLR tool is available here: http://www.antlr3.org/grammar/list.html.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With