Known as the front end of the compiler, the analysis phase reads the source program, divides it into constituent parts, and checks for lexical, syntactic, and semantic errors.
1. By keeping the same front end and attaching different back ends, one can produce compilers for the same source language on different machines.
2. By keeping different front ends and the same back end, one can compile several different languages for the same machine.
Answer: There are two parts to compilation: analysis and synthesis. The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program. The synthesis part constructs the desired target program from the intermediate representation.
The front end deals with the language itself: scanning, parsing, the parse tree. The back end deals with the target system: object code formats, the machine code itself, ... The two don't have all that much to do with each other, and for a portable compiler it is highly desirable to use the same front end with multiple back ends, one per target.
You can take that further, as gcc does, and have a front-end/back-end interface that is language-independent, so you can use different language front ends with the same back end. In the old days this was called the MxN problem: you don't want to have to write MxN compilers when you have M languages and N target systems. The idea is to only have to write M front ends and N back ends: M+N pieces instead of MxN.
Solution to the MxN problem: a big problem intermediate code solves is that you don't need a big monolithic compiler handling both front-end language parsing and back-end architecture instructions. Instead of MxN combinations of architectures and languages in monolithic compilers, you get M+N components, where the M front ends handle language parsing and the N back ends handle the conversion from the single intermediate language to each target architecture's instructions.
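To make that concrete, here is a minimal C++ sketch of the M+N structure. All the names are hypothetical, invented for illustration: the point is that front ends and back ends never see each other, only the shared IR type.

#include <iostream>
#include <string>
#include <vector>

// The shared intermediate representation: a flat list of instructions
// that no particular front end or back end owns.
struct IRInstr {
    std::string op;   // e.g. "load", "add", "store"
    std::string arg;  // operand text, if any
};
using IR = std::vector<IRInstr>;

// One of M front ends: lowers "a = b + c;"-style source into IR.
// (The parsing is faked here; a real front end would scan and parse.)
IR cFrontEnd(const std::string& /*source*/) {
    return {{"load", "b"}, {"load", "c"}, {"add", "b + c"}, {"store", "a"}};
}

// One of N back ends: renders the IR as text for a made-up target.
// Porting to a new machine means writing only another function like this.
void stackBackEnd(const IR& ir) {
    for (const auto& i : ir)
        std::cout << i.op << (i.arg.empty() ? "" : " " + i.arg) << "\n";
}

int main() {
    // Any front end pairs with any back end through the IR, so M
    // languages and N targets cost M + N pieces, not M x N compilers.
    stackBackEnd(cFrontEnd("a = b + c;"));
}

Adding another language or another target then means writing one new function against IR, without touching the existing ones.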
If you're talking about the front end being the parser which tokenises the source code, and the back end being the bit which generates executable code from the tokenised code, then one very good reason is this: portability.
Separating the parser from the executable code generation makes it much easier to port a compiler from one processor architecture to another.
Because you want to use some sort of internal pseudo-code or tables/data structures. For example, if you have some line of code:
a = b + c;
You would want to take that and break it into an intermediate language, or IR (intermediate representation):
load b
load c
add b + c
store a
as an example -- there are many solutions. The intermediate language is better than going straight to assembly for a particular target for a number of reasons: an ADD instruction may be stack based, take one operand, take two operands, or even three operands, and at this higher level we don't need to know, or care, about the lower-level hardware implementation. I don't know enough about it, but I think the commonly used parser generators bison/flex also boil you down into some sort of intermediate code/instruction set, and then you write a back end for that.
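As a rough sketch of the lowering step itself, the toy function below turns one statement of the "a = b + c;" shape into exactly the four IR lines shown above. It assumes whitespace-separated tokens and a single +; a real front end would build a parse tree first.

#include <iostream>
#include <sstream>
#include <string>

// Toy lowering: split "dst = lhs + rhs ;" on whitespace and emit IR.
void lower(const std::string& stmt) {
    std::istringstream in(stmt);
    std::string dst, eq, lhs, plus, rhs;
    in >> dst >> eq >> lhs >> plus >> rhs;  // assumes "x = y + z" shape
    if (!rhs.empty() && rhs.back() == ';')  // strip the trailing ';'
        rhs.pop_back();
    std::cout << "load "  << lhs << "\n"
              << "load "  << rhs << "\n"
              << "add "   << lhs << " + " << rhs << "\n"
              << "store " << dst << "\n";
}

int main() {
    lower("a = b + c;");  // prints the four-line IR from the example
}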
You also benefit in that you can, for example, have C, C++, and other language front ends without affecting the back end.
You also benefit from breaking the compiler into logical modules: you can develop and test the front end independently of the back end. LLVM, for example, allows the intermediate language to be exported and imported (clang -S -emit-llvm emits the textual IR, and llc compiles it down to target assembly); you could, if you really wanted to, write code directly in the intermediate language and get the benefit of multiple targets on the back end.
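As a toy illustration of that export/import idea (this is the made-up IR from the earlier example, not LLVM's actual .ll format), the sketch below serializes the IR to text and parses it back, which is what lets the two halves of the compiler, or even separate tools, exchange programs:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct IRInstr { std::string op, arg; };

// Export: one instruction per line, as in the example above.
std::string exportIR(const std::vector<IRInstr>& ir) {
    std::ostringstream out;
    for (const auto& i : ir)
        out << i.op << (i.arg.empty() ? "" : " " + i.arg) << "\n";
    return out.str();
}

// Import: split each line into an opcode and (optional) operand text.
std::vector<IRInstr> importIR(const std::string& text) {
    std::vector<IRInstr> ir;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        auto space = line.find(' ');
        ir.push_back({line.substr(0, space),
                      space == std::string::npos ? "" : line.substr(space + 1)});
    }
    return ir;
}

int main() {
    std::vector<IRInstr> ir = {{"load", "b"}, {"load", "c"},
                               {"add", "b + c"}, {"store", "a"}};
    // Round-trip: any back end could pick up the imported IR.
    std::cout << exportIR(importIR(exportIR(ir)));
}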