In my ongoing effort to quench my undying thirst for more programming knowledge I have come up with the idea of attempting to write a (at least for now) simple programming language that compiles into bytecode. The problem is I don't know the first thing about language design. Does anyone have any advice on a methodology to build a parser and what the basic features every language should have? What reading would you recommend for language design? How high level should I be shooting for? Is it unrealistic to hope to be able to include a feature to allow one to inline bytecode in a way similar to gcc allowing inline assembler? Seeing I primarily code in C and Java which would be better for compiler writing?
A program design methodology comprises of two basic modules: the design process and the program design representation using diagrams or flowcharts. The design methodology is frequently controlled by the given hardware specifications, programming language, data structures to be used and other components within a system.
IMPLEMENTATION METHODS 1. Compilation – Programs are translated into machine Language & System calls 2. Interpretation – Programs are interpreted by another program (an interpreter) 3. Hybrid – Programs translated into an intermediate language for easy interpretation 4.
You might want to read a book on compilers first.
For really understanding what's going on, you'll likely want to write your code in C.
Java wouldn't be a bad choice if you wanted to write an interpreted language, such as Jython. But since it sounds like you want to compile down to machine code, it might be easier in C.
There are so many ways...
You could look into stack languages and Forth. It's not very useful when it comes to designing other languages, but it's something that can be done very quickly.
You could look into functional languages. Most of them are based on a few simple concepts, and have simple parsing. And, yet, they are very powerful.
And, then, the traditional languages. They are the hardest. You'll need to learn about lexical analysers, parsers, LALR grammars, LL grammars, EBNF and regular languages just to get past the parsing.
Targeting a bytecode is not just a good idea – doing otherwise is just insane, and mostly useless, in a learning exercise.
Do yourself a favour, and look up books and tutorials about compilers.
Either C or Java will do. Java probably has an advantage, as object orientation is a good match for this type of task. My personal recommendation is Scala. It's a good language to do this type of thing, and it will teach you interesting things about language design along the way.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With