Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

BNF grammar definition for file path wildcard (glob)

Tags:

glob

bnf

peg

ometa

I'm searching for some widely extended dialect (like this one https://github.com/vmeurisse/wildmatch + globstar **) described with BFN rules.

In any format or language. OMeta or PEG would be great.

like image 322
Ev_genus Avatar asked Sep 07 '14 14:09

Ev_genus


People also ask

What is BNF grammar in SH61?

BNF Grammars The input language for sh61command lines is described in terms of a BNF grammar, where BNF stands for Backus–Naur Form, or Backus Normal Form. BNF is a declarative notation for describing a language, which in this context just means a set of strings.

What is BNF notation in programming?

BNF is a declarative notation for describing a language, which in this context just means a set of strings. BNF notation consists of three pieces: Terminals, such as "x", are strings of characters that must exactly match characters in the input.

What is a **** wildcard?

** is a recursive wildcard which can only be used with paths, not file names. Multiple recursive expressions within the path are not supported. You can have up to 32 nested symbolic links within a path expression.

Why is it important to read BNF grammars?

It’s useful to be able to read BNF grammars because they’re very common in computer science and programming. Many specification documents use BNF to define languages, including programming languages, configuration files, and even network packet formats. These exercises show features of BNF grammars using simple examples.


1 Answers

I'm not sure to understand your question since the grammar for file path wildcard can be reduced to a simple regular expression. This grammar is defined by the Unix Shell.

You can find the BNF for Bash here: http://my.safaribooksonline.com/book/operating-systems-and-server-administration/unix/1565923472/syntax/lbs.appd.div.3

In Python programming language, a definition of the glob.glob() function is available in the documentation. This function use the fnmatch.fnmatch() function to perform the pattern matching. The documentation is available here: https://docs.python.org/2/library/fnmatch.html#fnmatch.fnmatch.

The fnmatch.fnmatch function translate a file path wildcard pattern to a classic regular expression, like this:

def translate(pat):
    """Translate a shell PATTERN to a regular expression.

    There is no way to quote meta-characters.
    """

    i, n = 0, len(pat)
    res = ''
    while i < n:
        c = pat[i]
        i = i+1
        if c == '*':
            res = res + '.*'
        elif c == '?':
            res = res + '.'
        elif c == '[':
            j = i
            if j < n and pat[j] == '!':
                j = j+1
            if j < n and pat[j] == ']':
                j = j+1
            while j < n and pat[j] != ']':
                j = j+1
            if j >= n:
                res = res + '\\['
            else:
                stuff = pat[i:j].replace('\\','\\\\')
                i = j+1
                if stuff[0] == '!':
                    stuff = '^' + stuff[1:]
                elif stuff[0] == '^':
                    stuff = '\\' + stuff
                res = '%s[%s]' % (res, stuff)
        else:
            res = res + re.escape(c)
    return res + '\Z(?ms)'

That can help you to write de BNF grammar...

EDIT

Here is a very simple grammar:

wildcard : expr
         | expr wildcard

expr : WORD
     | ASTERIX
     | QUESTION
     | neg_bracket_expr
     | pos_bracket_expr

pos_bracket_expr : LBRACKET WORD RBRACKET

neg_bracket_expr : LBRACKET EXCLAMATION WORD RBRACKET

A list of popular grammars parsed by the famous ANTLR tool is available here: http://www.antlr3.org/grammar/list.html.

like image 65
Laurent LAPORTE Avatar answered Sep 30 '22 00:09

Laurent LAPORTE