I am reading Rebol Wikipedia page.
"Parse expressions are written in the parse dialect, which, like the do dialect, is an expression-oriented sublanguage of the data exchange dialect. Unlike the do dialect, the parse dialect uses keywords representing operators and the most important nonterminals"
Can you explain what are terminals and nonterminals? I have read a lot about grammars, but did not understand what they mean. Here is another link where this words are used very often.
Terminal symbols are the elementary symbols of the language defined by a formal grammar. Nonterminal symbols (or syntactic variables) are replaced by groups of terminal symbols according to the production rules. The terminals and nonterminals of a particular grammar are two disjoint sets.
Terminal Symbol indicates the beginning or end of a flowchart. This symbol usually has the text “Start” or “End”.
Terminals are the basic symbols from which strings are formed. A set of productions (P). The productions of a grammar specify the manner in which the terminals and non-terminals can be combined to form strings.
A CFG consists of the following components: a set of terminal symbols, which are the characters of the alphabet that appear in the strings generated by the grammar. a set of nonterminal symbols, which are placeholders for patterns of terminal symbols that can be generated by the nonterminal symbols.
Definitions of terminal and non-terminal symbols are not Parse-specific, but are concerned with grammars in general. Things like this wiki page or intro in Grune's book explain them quite well. OTOH, if you're interested in how Red Parse works and yearn for simple examples and guidance, I suggest to drop by our dedicated chat room.
"parsing" has slightly different meanings, but the one I prefer is conversion of linear structure (string of symbols, in a broad sense) to a hierarchical structure (derivation tree) via a formal recipe (grammar), or checking if a given string has a tree-like structure specified by a grammar (i.e. if "string" belongs to a "language").
All symbols in a string are terminals, in a sense that tree derivation "terminates" on them (i.e. they are leaves in a tree). Non-terminals, in turn, are a form of abstraction that is used in grammar rules - they group terminals and non-terminals together (i.e. they are nodes in a tree).
For example, in the following Parse grammar:
greeting: ['hi | 'hello | 'howdy]
person: [name surname]
name: ['john | 'jane]
surname: ['doe | 'smith]
sentence: [greeting person]
greeting
, person
, name
, surname
and sentence
are non-terminals (because they never actually appear in the linear input sequence, only in grammar rules);hi
, hello
, howdy
with john
, jane
, doe
and smith
are terminals (because parser cannot "expand" them into a set of terminals and non-terminals as it does with non-terminals, hence it "terminates" by reaching the bottom).>> parse [hi jane doe] sentence
== true
>> parse [howdy john smith] sentence
== true
>> parse [wazzup bubba ?] sentence
== false
As you can see, terminal and non-terminal are disjoint sets, i.e. a symbol can be either in one of them, but not in both; moreso, inside grammar rules, only non-terminals can be written on the left side.
One grammar can match different strings, and one string can be matched by different grammars (in the example above, it could be [greeting name surname]
, or [exclamation 2 noun]
, or even [some noun]
, provided that exclamation
and noun
non-terminals are defined).
And, as usual, one picture is worth a thousand words:
Hope that helps.
think of it like that
a digit can be 1-9
now i will tell you to write down on a page a digit.
so you know that you can write down 1,2,3,4,5,6,7,8,9
basically the nonterminal symbol is "digit"
and the terminals symbols are the 1,2,3,4,5,6,7,8,9
when i told you to write down on a page a digit you wrote down 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9
you didn't wrote down the word "digit" you wrote down the 1 or 2 or 3....
do you see where i'm going ?
let's try to make our own "rules"
let's "create" a nonterminal symbol we will call it "Olaf"
Olaf can be a dog (NOTE: dog is terminal)
Olaf can be a cat (NOTE: cat is terminal)
Olaf can be a digit (NOTE: digit is nonterminal)
Now i'm telling you that you can write down on a page an Olaf.
so that's mean that you can write down "dog"
you can also write down "cat"
you can also write down a digit so that's mean you can write down 1 or 2 or 3...
because digit is nonterminal symbol you dont write down "digit" you write down the symbols that digit is referring to which is 1 or 2 or 3 etc...
in the end only terminals symbols are written on the "page"
one more thing i have to say is something that you may encounter one day, basically when you say "a nonterminal can be something".
there is a special term for that and that's basically called a "production rule"(can also be called a "production")
for example
Olaf can be "dog"
Olaf can be "cat"
Olaf can be digit
we got 3 productions here in other words we got here 3 definitions of Olaf
specifications of Programming languages use those ideas quite a lot when defining a syntax of a language
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With