How do you define a context-free grammar for a new imperative programming language that you want to design from scratch?
In other words: how do you proceed when you want to create a new programming language from scratch?
The grammar of a programming language is an important asset because it is used in developing many software engineering tools. Sometimes the grammar of a language is not available and has to be inferred from source code, especially in the case of programming language dialects.
Syntax is to code what grammar is to English or any other natural language. One big difference, though, is that computers are really exacting about how that syntax is structured. That strictness is part of why we call programming "coding", and it holds across all the different languages that are out there.
One step at a time.
No, seriously: start with expressions and operators, work upwards to statements, then to functions, classes and so on. Keep a list of what punctuation is used for what.
In parallel, define the syntax for referring to variables, arrays, hashes, number literals, string literals and any other built-in literals. Also in parallel, define your data naming model and scoping rules.
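As a rough illustration of pinning down the literal and variable level, here is a minimal tokenizer sketch in Python. The token classes, the regular expressions and the `tokenize` name are all made up for an invented toy language, so treat them as placeholders for whatever your own design calls for.

```python
import re

# Hypothetical token classes for an invented toy language (assumptions, not a standard).
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),           # number literals: 42, 3.14
    ("STRING", r'"(?:[^"\\]|\\.)*"'),       # double-quoted string literals with escapes
    ("IDENT",  r"[A-Za-z_][A-Za-z0-9_]*"),  # variable names
    ("PUNCT",  r"[\[\]{}(),:=+\-*/]"),      # punctuation for arrays, hashes, operators
    ("SKIP",   r"\s+"),                     # whitespace, dropped from the output
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split source text into (kind, text) pairs, failing on unknown characters."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize('xs[0] = "hi"'))
```

Writing the token classes down this early tends to expose overlaps quickly, for example whether a leading minus sign belongs to the number literal or to the operator level.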
To check whether your grammar makes sense, focus on one level at a time (literal/variable, operator, expression, statement, function etc.) and make sure that punctuation and tokens from other levels, whether interspersed, appended or prepended, will not cause an ambiguity.
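One low-tech way to do that check is to keep the punctuation list as data and flag every symbol that plays more than one role. The sketch below assumes a made-up inventory; the symbols, levels and roles are purely illustrative, and `shared_symbols` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical punctuation inventory for an invented language: (symbol, level, role).
PUNCTUATION = [
    ("{", "statement", "open block"),
    ("{", "literal", "open hash literal"),
    ("(", "expression", "grouping"),
    ("(", "expression", "call arguments"),
    ("[", "literal", "open array literal"),
    (";", "statement", "terminator"),
]


def shared_symbols(inventory):
    """Return the symbols that are used in more than one role, with those roles."""
    uses = defaultdict(list)
    for symbol, level, role in inventory:
        uses[symbol].append((level, role))
    return {symbol: roles for symbol, roles in uses.items() if len(roles) > 1}


if __name__ == "__main__":
    for symbol, roles in shared_symbols(PUNCTUATION).items():
        print(symbol, "is used for:", roles)
```

A symbol showing up here is not automatically ambiguous. It only marks the places to inspect by hand, such as a `{` at the start of a statement that could open either a block or a hash literal.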
Finally, write it all out in EBNF and run it through a parser generator such as ANTLR or similar.
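To give a feel for what "written out" looks like, here is a small sketch: an EBNF grammar for a made-up assignment-only language, plus a hand-written recursive-descent parser in Python that mirrors it rule for rule. This is a stand-in for what a parser generator would produce, not ANTLR itself, and the grammar, the node shapes and all names are assumptions for illustration.

```python
import re

# EBNF for an invented toy language (an assumption for illustration).
GRAMMAR = """
program    = { statement } ;
statement  = IDENT "=" expression ";" ;
expression = term { ( "+" | "-" ) term } ;
term       = factor { ( "*" | "/" ) factor } ;
factor     = NUMBER | IDENT | "(" expression ")" ;
"""

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(\S))")


def tokenize(source):
    """Turn source text into (kind, text) pairs."""
    tokens = []
    for number, ident, punct in TOKEN.findall(source):
        if number:
            tokens.append(("NUMBER", number))
        elif ident:
            tokens.append(("IDENT", ident))
        else:
            tokens.append(("PUNCT", punct))
    return tokens


class Parser:
    """One method per grammar rule; builds nested tuples as a syntax tree."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def take(self, kind, text=None):
        tok = self.peek()
        if tok[0] != kind or (text is not None and tok[1] != text):
            raise SyntaxError(f"expected {text or kind}, got {tok[1]!r}")
        self.pos += 1
        return tok[1]

    def program(self):
        statements = []
        while self.peek()[0] != "EOF":
            statements.append(self.statement())
        return statements

    def statement(self):
        name = self.take("IDENT")
        self.take("PUNCT", "=")
        value = self.expression()
        self.take("PUNCT", ";")
        return ("assign", name, value)

    def expression(self):
        node = self.term()
        while self.peek() in (("PUNCT", "+"), ("PUNCT", "-")):
            op = self.take("PUNCT")
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("PUNCT", "*"), ("PUNCT", "/")):
            op = self.take("PUNCT")
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, _ = self.peek()
        if kind == "NUMBER":
            return ("num", int(self.take("NUMBER")))
        if kind == "IDENT":
            return ("var", self.take("IDENT"))
        self.take("PUNCT", "(")
        node = self.expression()
        self.take("PUNCT", ")")
        return node


def parse(source):
    return Parser(tokenize(source)).program()


if __name__ == "__main__":
    print(parse("x = 1 + 2 * 3;"))
```

Splitting `expression`, `term` and `factor` into separate rules is what gives `*` and `/` higher precedence than `+` and `-`, and the loops make the operators left-associative. A parser generator would also report conflicts in the grammar for you, which this hand-written version does not.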
Also, it's best not to reinvent the wheel. I normally start off by choosing the sequences that start and end statement blocks and functions, and the mathematical operators, and these are usually fundamentally C-like, ECMAScript-like, Basic-like, command-list based or XML-based. This helps a lot because it is what people are used to working with.
Of course, you need a pretty compelling reason to write a new language at all rather than just sticking with C, ECMAScript or Basic, which are well tested and widely used.
I've often started defining a new language only to find that someone else has already implemented the feature I wanted in some existing language.
If your goal is speed of development for some specific project, you might be better off prototyping in something like Python, Lua or JavaScript (for example via the SpiderMonkey engine). That gets you up and running quickly and saves much of the typing that most compiled languages require.
You'll want to have a look at EBNF (Extended Backus-Naur Form).
(Assuming you want to write a context free grammar, that is.)