So I am making a little toy programming language interpreter, and I would like to try and optimise the code so that the bytecode is slightly smaller. I'm not looking to do very complex optimisations such as loop hoisting, but more simple ones such as constant folding.
My question is, is it better to first generate an AST, optimise that, and then convert to bytecode, or go straight to bytecode, and then try to optimise that?
If anyone has any examples or know of programming languages which do either of these methods it would be greatly appreciated.
Thanks in advance.
In general Python optimises virtually nothing. It won't even optimise trivial things like x = x . Python is so dynamic that doing so correctly would be extremely hard.
The source code is passed through a program called a compiler, which translates it into bytecode that the machine understands and can execute. In contrast, JavaScript has no compilation step. Instead, an interpreter in the browser reads over the JavaScript code, interprets each line, and runs it.
Both approaches are possible. tinycc
for example is a C compiler that started as a toy program for the OCCC. It generates executable code directly in one pass, no AST, but still performs on the fly optimisations at the code generator level.
Another example: wren is an elegant small scripting language with a direct byte code generator without an AST. It performs some optimisations on the byte code, mostly peephole optimisations.
More advanced optimisations are feasible at the byte code level, and I am currently working on a good example that should be published soon, but it seems easier to construct an AST to perform a higher level analysis of the code and generate even better code.
From a theoretical stand point, byte code and AST are 2 representations of the same information, but one seems more practical than the other.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With