The difference between Chomsky type 3 and Chomsky type 2 grammar

2 Answers

A Type II grammar is a Type III grammar with a stack

A Type II grammar is basically a Type III grammar with nesting.

Type III grammar (Regular):

Use Case - CSV (Comma Separated Values)

Characteristics:

can be read using an FSM (Finite State Machine)
requires no intermediate storage
can be read with Regular Expressions
usually expressed using a 1D or 2D data structure
is flat, meaning no nesting or recursive properties

Ex:

this,is,,"an "" example",\r\n
"of, a",type,"III\n",grammar\r\n

As long as you can figure out all of the rules and edge cases for the above text, you can parse CSV.
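To make the "no intermediate storage" point concrete, here is a minimal sketch of a CSV reader written as a finite state machine. The function name parse_csv and the four state names are made up for this illustration, and the handling of \r (simply skipped outside quotes) is an assumption rather than a rule taken from any CSV specification. The only thing the machine remembers between characters is which state it is in; the field buffer is just output being collected.

```python
def parse_csv(text):
    """Parse CSV text with a small finite state machine (illustrative sketch)."""
    FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED = range(4)
    state = FIELD_START
    rows, row, field = [], [], []

    def end_field():
        row.append("".join(field))
        field.clear()

    def end_row():
        end_field()
        rows.append(list(row))
        row.clear()

    for ch in text:
        if state == QUOTED:
            if ch == '"':
                state = QUOTE_IN_QUOTED   # either the closing quote or the first half of ""
            else:
                field.append(ch)          # commas and newlines are literal inside quotes
        elif state == QUOTE_IN_QUOTED and ch == '"':
            field.append('"')             # "" inside quotes is an escaped quote
            state = QUOTED
        elif ch == '"' and state == FIELD_START:
            state = QUOTED
        elif ch == ',':
            end_field()
            state = FIELD_START
        elif ch == '\n':
            end_row()
            state = FIELD_START
        elif ch == '\r':
            pass                          # assumption: ignore \r so that \r\n acts as one line break
        else:
            field.append(ch)
            state = UNQUOTED

    # Flush a final row that was not terminated by a newline.
    if field or row or state != FIELD_START:
        end_row()
    return rows


sample = 'this,is,,"an "" example",\r\n"of, a",type,"III\n",grammar\r\n'
for parsed_row in parse_csv(sample):
    print(parsed_row)
# ['this', 'is', '', 'an " example', '']
# ['of, a', 'type', 'III\n', 'grammar']
```

Note that there is no stack anywhere: a quoted field cannot contain another quoted field, so a fixed number of states is enough.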

Type II grammar (Context Free):

Use Case - HTML (Hyper Text Markup Language) or SGML in general

Characteristics:

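can be read using a PDA (Pushdown Automaton); for the deterministic context-free languages, which cover most practical formats, a DPDA (Deterministic Pushdown Automaton) is enough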
will require a stack for intermediate storage
may be expressed as an AST (Abstract Syntax Tree)
may contain nesting and/or recursive properties

Flat HTML with no nested elements could be expressed as a regular grammar:

<h1>Useless Example</h1>
<p>Some stuff written here</p>
<p>Isn't this fun</p>
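As a rough illustration, the flat snippet above can be read with a single regular expression, because no element contains another element. The tag list TAGS and the helper parse_flat are assumptions made for this sketch; the pattern is spelled out per tag instead of using a backreference, since backreferences go beyond what a regular language can express.

```python
import re

# Assumption for this sketch: only these tags occur, and they never nest.
TAGS = ("h1", "p")

# One alternative per tag: <tag>text</tag>, where text contains no '<'.
FLAT_ELEMENT = re.compile("|".join(rf"<({t})>([^<]*)</{t}>" for t in TAGS))


def parse_flat(html):
    """Return (tag, text) pairs for a sequence of non-nested elements."""
    pairs = []
    for match in FLAT_ELEMENT.finditer(html):
        # Only the groups of the alternative that matched are not None.
        tag, text = [g for g in match.groups() if g is not None]
        pairs.append((tag, text))
    return pairs


flat = """<h1>Useless Example</h1>
<p>Some stuff written here</p>
<p>Isn't this fun</p>"""
print(parse_flat(flat))
# [('h1', 'Useless Example'), ('p', 'Some stuff written here'), ('p', "Isn't this fun")]
```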

But try parsing this using an FSM:

<body>
  <div id=titlebar>
    <h1>XHTML 1.0</h1>
    <h2>W3C's failed attempt to enforce HTML as a context-free language</h2>
  </div>
  <p>Back when the web was still pretty boring, the W3C attempted to standardize away the quirkiness of HTML by introducing a strict specification.</p>
  <p>Unfortunately, everybody ignored it.</p>
</body>

See the difference? Imagine you were writing a parser. You could start on an opening tag and finish on a closing tag, but what happens when you encounter a second opening tag before reaching the closing tag?

It's simple: you push the first opening tag onto a stack and start parsing the second tag. Repeat this process for as many levels of nesting as exist and, if the syntax is well-structured, the stack can be unrolled one layer at a time in the reverse order from which it was built.
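The push-and-unroll process described above can be sketched in a few lines. The names TAG and check_nesting are invented for this example, and the sketch assumes well-formed, XHTML-like input without self-closing or void elements (a tag such as <br/> would be treated as an opening tag).

```python
import re

# Matches an opening or closing tag; attributes are skipped, not parsed.
TAG = re.compile(r"<(/?)(\w+)[^>]*>")


def check_nesting(markup):
    """Return the maximum nesting depth, or raise ValueError if tags do not balance."""
    stack = []
    max_depth = 0
    for match in TAG.finditer(markup):
        closing, name = match.group(1) == "/", match.group(2)
        if not closing:
            stack.append(name)                      # remember what is still open
            max_depth = max(max_depth, len(stack))
        elif not stack or stack.pop() != name:      # must close the most recent opener
            raise ValueError(f"unexpected </{name}>")
    if stack:
        raise ValueError(f"unclosed <{stack[-1]}>")
    return max_depth


nested = "<body><div id=titlebar><h1>XHTML 1.0</h1></div><p>text</p></body>"
print(check_nesting(nested))  # 3
```

The stack is exactly the piece a finite state machine lacks: the nesting depth is unbounded, so no fixed set of states can remember which tags are still open.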

Due to the strict nature of 'pure' context-free languages, they're relatively rare unless they're generated by a program. JSON is a prime example.

The benefit of context-free languages is that, while very expressive, they're still relatively simple to parse.

But wait, didn't I just say HTML is context-free? Yep, if it is well-formed (i.e. XHTML).

While XHTML may be considered context-free, the more loosely defined HTML would actually be considered Type I (i.e. context-sensitive). The reason is that when the parser reaches poorly structured code, it makes decisions about how to interpret the code based on the surrounding context. For example, if an element is missing its closing tag, the parser needs to determine where that element sits in the hierarchy before it can decide where the closing tag should be placed.
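A heavily simplified sketch of that kind of decision is shown below. It is not the real HTML parsing algorithm; the function close_implied and its two recovery rules (an open <p> is closed when another <p> starts, and anything left open inside an element is closed when that element closes) are assumptions chosen only to show how the repair depends on what is currently open.

```python
import re

# Matches an opening or closing tag; attributes are skipped, not parsed.
TAG = re.compile(r"<(/?)(\w+)[^>]*>")


def close_implied(markup):
    """Return the tag sequence with the missing closing tags filled in."""
    stack, events = [], []
    for match in TAG.finditer(markup):
        closing, name = match.group(1) == "/", match.group(2)
        if not closing:
            # Assumed rule: a <p> cannot contain another <p>, so close the open one first.
            if name == "p" and stack and stack[-1] == "p":
                events.append("</p>")
                stack.pop()
            stack.append(name)
            events.append(f"<{name}>")
        elif name in stack:
            # Assumed rule: close everything still open inside the element being closed.
            while stack[-1] != name:
                events.append(f"</{stack.pop()}>")
            stack.pop()
            events.append(f"</{name}>")
        # A stray closing tag with no matching opener is ignored.
    return events


print(close_implied("<div><p>one<p>two</div>"))
# ['<div>', '<p>', '</p>', '<p>', '</p>', '</div>']
```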

Other features that could make a context-free language context-sensitive include templates, imports, preprocessors, macros, etc.

In short, context-sensitive languages look a lot like context-free languages, but the elements of a context-sensitive language may be interpreted in different ways depending on the program state.

Disclaimer: I am not formally trained in CompSci, so this answer may contain errors or assumptions. If you ask me the difference between a terminal and a non-terminal, you'll earn yourself a blank stare. I learned this much by actually building a Type III (Regular) parser and by reading extensively about the rest.

answered Oct 21 '22 17:10

Evan Plaice

The Wikipedia page has a good picture and bullet points.

Roughly, the underlying machine that can describe a regular language does not need memory beyond its current state. It runs as a state machine (DFA/NFA) on the input. Regular languages can also be expressed with regular expressions.

A language with the "next" level of complexity added to it is a context free language. The underlying machine describing this kind of language will need some memory to be able to represent the languages that are context free and not regular. Note that adding memory to your machine makes it a little more powerful, so it can still express languages (e.g. regular languages) that didn't need the memory to begin with. The underlying machine is typically a push-down automaton.
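A small sketch of the two levels, using the textbook pair a*b* (regular) and a^n b^n (context-free but not regular). The function names are made up for this example. The first language needs no memory and is handled by a plain regular expression; the second needs to remember how many a's were seen, which is done here with an explicit stack in the spirit of a push-down automaton.

```python
import re


def is_a_star_b_star(s):
    """Regular: any number of a's followed by any number of b's."""
    return re.fullmatch(r"a*b*", s) is not None


def is_an_bn(s):
    """Context-free, not regular: n a's followed by exactly n b's."""
    stack = []
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:            # an 'a' after a 'b' breaks the a...ab...b shape
                return False
            stack.append(ch)      # push one symbol per 'a'
        elif ch == "b":
            seen_b = True
            if not stack:         # more b's than a's
                return False
            stack.pop()           # each 'b' cancels one 'a'
        else:
            return False
    return not stack              # accept only if every 'a' was cancelled


print(is_a_star_b_star("aaab"), is_an_bn("aaab"))  # True False
print(is_a_star_b_star("aabb"), is_an_bn("aabb"))  # True True
```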

answered Oct 21 '22 16:10

Derek E


Tags:

regular-language

chomsky-hierarchy

context-free-language

Jay Jenkins
