Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

AST or bytecode. Which is easier to optimise?

So I am making a little toy programming language interpreter, and I would like to try and optimise the code so that the bytecode is slightly smaller. I'm not looking to do very complex optimisations such as loop hoisting, but more simple ones such as constant folding.

My question is, is it better to first generate an AST, optimise that, and then convert to bytecode, or go straight to bytecode, and then try to optimise that?

If anyone has any examples or know of programming languages which do either of these methods it would be greatly appreciated.

Thanks in advance.

like image 404
dangee1705 Avatar asked Aug 13 '18 17:08

dangee1705


People also ask

Is Python bytecode optimized?

In general Python optimises virtually nothing. It won't even optimise trivial things like x = x . Python is so dynamic that doing so correctly would be extremely hard.

Does JavaScript compile to bytecode?

The source code is passed through a program called a compiler, which translates it into bytecode that the machine understands and can execute. In contrast, JavaScript has no compilation step. Instead, an interpreter in the browser reads over the JavaScript code, interprets each line, and runs it.


1 Answers

Both approaches are possible. tinycc for example is a C compiler that started as a toy program for the OCCC. It generates executable code directly in one pass, no AST, but still performs on the fly optimisations at the code generator level.

Another example: wren is an elegant small scripting language with a direct byte code generator without an AST. It performs some optimisations on the byte code, mostly peephole optimisations.

More advanced optimisations are feasible at the byte code level, and I am currently working on a good example that should be published soon, but it seems easier to construct an AST to perform a higher level analysis of the code and generate even better code.

From a theoretical stand point, byte code and AST are 2 representations of the same information, but one seems more practical than the other.

like image 199
chqrlie Avatar answered Oct 05 '22 23:10

chqrlie