
If we know a CFG generates only a regular language, can we get the corresponding regular expression?

As we know, given a regular grammar, there is an algorithm to obtain its regular expression.

But what if the given grammar is a context-free grammar that happens to generate a regular language, like

  • S->aAb
  • A->bB
  • B->cB|d

Is there any existing algorithm that can get the regular expression in general?

Thanks!

    JackWM asked May 16 '12 02:05



    1 Answer

    In the most general sense, there is no solution. The problem of determining whether a CFG generates a regular language is undecidable (Greibach's Theorem, last 3 pages of http://www.cis.upenn.edu/~jean/gbooks/PCPh04.pdf ). If we had an algorithm that converted CFGs to regular expressions, we could run it on any grammar and use its success or failure to determine whether the language is regular, which would contradict that result.

    So instead, when a CFG is known to produce a regular language, either its language is already known (and therefore directly convertible to a RegEx), or there's some property of the grammar to exploit. Each property has its own algorithm for converting to a RegEx.

    For example, if the grammar is right-linear, every production is of the form A->bC or A->a. This can be converted to an NFA where:

    1) There is a state for every non-terminal, plus an accept state.

    2) The start symbol S is the start state.

    3) A->bC is a transition from A to C on input b.

    4) A->a is a transition from A to the accept state on input a.
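
The four rules above can be sketched in Python. This is only an illustration under assumptions: the names `grammar_to_nfa`, `accepts`, and `ACCEPT` are hypothetical, and each production is encoded as a `(head, terminal, target)` tuple with `target` set to `None` for the A->a form. The example uses the productions A->bB and B->cB|d from the question, with A as the start state.

```python
from collections import defaultdict

ACCEPT = "#accept"  # hypothetical name for the single extra accept state


def grammar_to_nfa(productions):
    """Build NFA transitions from right-linear productions.

    Each production is (head, terminal, target); target is None for A->a.
    """
    delta = defaultdict(set)
    for head, terminal, target in productions:
        # A->bC moves to state C; A->a moves to the accept state.
        delta[(head, terminal)].add(ACCEPT if target is None else target)
    return delta


def accepts(delta, start, word):
    """Simulate the NFA by tracking the set of currently reachable states."""
    current = {start}
    for symbol in word:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return ACCEPT in current


# A->bB, B->cB, B->d (taken from the question's grammar)
PRODUCTIONS = [("A", "b", "B"), ("B", "c", "B"), ("B", "d", None)]
NFA = grammar_to_nfa(PRODUCTIONS)

print(accepts(NFA, "A", "bccd"))  # True
print(accepts(NFA, "A", "bc"))    # False
```

Keeping a set of target states per (state, symbol) pair is what makes this an NFA rather than a DFA: two productions such as A->bB and A->bC would simply both land in the same set.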

    This NFA can then be converted to a regular expression via state elimination (pages 5-8 of http://www.math.uaa.alaska.edu/~afkjm/cs351/handouts/regular-expressions.pdf ). An analogous process for left-linear grammars would have start and accept states exchanged.
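
A minimal sketch of state elimination in Python follows. It is not taken from the linked handout; the function names are hypothetical, and it assumes that the start state has no incoming edges, that the accept state has no outgoing edges, and that every initial edge label is a single escaped terminal (so plain string concatenation is safe). The output uses Python `re` syntax.

```python
import re


def union(r1, r2):
    """Regex for r1|r2, where None stands for 'no edge'."""
    if r1 is None:
        return r2
    if r2 is None:
        return r1
    return f"(?:{r1}|{r2})"


def star(r):
    """Regex for r*, where None or '' contributes nothing."""
    return "" if not r else f"(?:{r})*"


def eliminate_states(edges, start, accept):
    """edges maps (source, target) to a regex; returns a regex for start -> accept."""
    edges = dict(edges)
    inner = {s for pair in edges for s in pair} - {start, accept}
    for r in sorted(inner):
        loop = star(edges.pop((r, r), None))
        incoming = {p: e for (p, q), e in edges.items() if q == r}
        outgoing = {q: e for (p, q), e in edges.items() if p == r}
        for p in incoming:
            del edges[(p, r)]
        for q in outgoing:
            del edges[(r, q)]
        # Reroute every path p -> r -> q around the removed state r.
        for p, e_in in incoming.items():
            for q, e_out in outgoing.items():
                edges[(p, q)] = union(edges.get((p, q)), e_in + loop + e_out)
    return edges.get((start, accept))


# NFA for A->bB, B->cB, B->d, with "#accept" as the accept state
EDGES = {("A", "B"): "b", ("B", "B"): "c", ("B", "#accept"): "d"}
REGEX = eliminate_states(EDGES, "A", "#accept")

print(bool(re.fullmatch(REGEX, "bccd")))  # True
print(bool(re.fullmatch(REGEX, "bc")))    # False
```

If the start state does have incoming edges, the usual fix is to add a fresh start state with an empty-string edge to the old one before eliminating.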

    Beyond that, one could exploit closure properties of regular languages. For example, the grammar in the question is neither right-linear nor left-linear (S->aAb has terminals on both sides of A), but it can be rewritten as S->S'b, S'->aA. Now the grammar for S' is right-linear, and the language of S is the concatenation of the language of S' with the single string b. Concatenate the two expressions to get the final expression. Similar logic applies for union.
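
As a sanity check on the concatenation step, the sketch below builds the final expression from the two parts and compares it against strings derived directly from the question's grammar, up to a length bound. The helper names are hypothetical, nonterminals are assumed to be single uppercase characters, and the enumeration assumes every production adds at least one terminal (true here), so it terminates.

```python
import re
from itertools import product

# The grammar from the question: S->aAb, A->bB, B->cB|d
GRAMMAR = {"S": ["aAb"], "A": ["bB"], "B": ["cB", "d"]}


def generate(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from start."""
    results, frontier = set(), [start]
    while frontier:
        form = frontier.pop()
        index = next((i for i, ch in enumerate(form) if ch in grammar), None)
        if index is None:
            results.add(form)  # no nonterminal left: a finished string
            continue
        for body in grammar[form[index]]:
            new_form = form[:index] + body + form[index + 1:]
            # Terminals never disappear, so overly long forms can be dropped.
            if sum(ch not in grammar for ch in new_form) <= max_len:
                frontier.append(new_form)
    return results


def concat(*parts):
    """Concatenate regexes, grouping each part so precedence is preserved."""
    return "".join(f"(?:{part})" for part in parts)


# Expression for S' (from S'->aA, A->bB, B->cB|d), followed by the trailing b
REGEX = concat("abc*d", "b")

MAX_LEN = 6
derived = generate(GRAMMAR, "S", MAX_LEN)
matched = {
    "".join(word)
    for n in range(MAX_LEN + 1)
    for word in product("abcd", repeat=n)
    if re.fullmatch(REGEX, "".join(word))
}
print(derived == matched)  # True
print(sorted(derived))     # ['abccdb', 'abcdb', 'abdb']
```

Agreement up to a bound is evidence rather than proof, but it catches most mistakes made while splitting a grammar into linear pieces.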

    AlwaysBTryin answered Sep 21 '22 15:09