How much time would it take to write a C++ compiler using flex/yacc?

How much time would it take to write a C++ compiler using lex/yacc?

Where can I get started with it?

asked Dec 25 '09 17:12 by Madhu



2 Answers

Many C++ constructs cannot be parsed by a bison/yacc parser from the grammar alone (for example, in some circumstances distinguishing between a declaration and a function call requires knowing whether a name denotes a type). Additionally, the interpretation of tokens sometimes requires input from the parser, particularly in C++0x. The handling of the character sequence >>, for example, depends crucially on parsing context.
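To make those two problems concrete, here is a small sketch. All names in it (`T`, `x`, `Box`, the namespaces, `check_all`) are made up for illustration, and it assumes a C++11 or later compiler. The statement `T(x);` is spelled identically in both namespaces, yet it is a declaration in one and a function call in the other, and `>>` is either two closing template brackets or a shift operator depending on where it appears.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace t_is_type {
    struct T { int v = 0; };
    inline int demo() {
        T(x);          // T names a type: this DECLARES x, same as `T x;`
        return x.v;    // default member initializer gives 0
    }
}

namespace t_is_function {
    inline void T(int& v) { ++v; }
    inline int demo() {
        int x = 1;
        T(x);          // T names a function: this CALLS T with x
        return x;      // x was incremented to 2
    }
}

// Hypothetical helper template used only to show `>>` inside a template argument.
template <int N> struct Box { static constexpr int value = N; };

inline std::size_t nested_size() {
    // Here `>>` closes two template argument lists (valid since C++11).
    std::vector<std::vector<int>> grid(3);
    return grid.size();
}

// Here `>>` is the right-shift operator, because it sits inside parentheses.
constexpr int shifted = Box<(16 >> 2)>::value;

inline void check_all() {
    assert(t_is_type::demo() == 0);
    assert(t_is_function::demo() == 2);
    assert(nested_size() == 3);
    static_assert(shifted == 4, "16 >> 2 is 4");
}
```

A lexer that runs ahead of the parser has no way to know which reading of `>>` applies, and a pure LALR(1) grammar has no way to know what `T` names, so both decisions need feedback from later compiler stages. Some compilers emit a warning for the redundant parentheses in `T(x);` when it is a declaration.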

Those two tools are very poor choices for parsing C++. To parse C++ correctly, you would have to add a lot of special cases that escape the basic framework those tools rely on. It would take you a long time, and even then your parser would likely have weird bugs.

yacc and bison (in its default mode) are LALR(1) parser generators, which are not sophisticated enough to handle C++ effectively. As other people have pointed out, most C++ compilers now use a recursive descent parser, and several other answers have pointed at good solutions for writing your own.
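For a feel of what recursive descent looks like, here is a minimal sketch for a toy arithmetic grammar. It is nowhere near C++, and the class name `ExprParser`, the function `evaluate`, and the grammar itself are assumptions chosen for illustration. Each grammar rule becomes one function, and the functions call each other the same way the rules refer to each other.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Toy grammar (hypothetical, for illustration only):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class ExprParser {
public:
    explicit ExprParser(std::string src) : src_(std::move(src)) {}

    // Parses the whole input and returns the computed value.
    long parse() {
        long v = expr();
        skip_ws();
        assert(pos_ <= src_.size());  // internal invariant: cursor stays in range
        if (pos_ != src_.size()) throw std::runtime_error("trailing input");
        return v;
    }

private:
    long expr() {
        long v = term();
        for (;;) {
            if (accept('+')) v += term();
            else if (accept('-')) v -= term();
            else return v;
        }
    }

    long term() {
        long v = factor();
        for (;;) {
            if (accept('*')) {
                v *= factor();
            } else if (accept('/')) {
                long d = factor();
                if (d == 0) throw std::runtime_error("division by zero");
                v /= d;
            } else {
                return v;
            }
        }
    }

    long factor() {
        if (accept('(')) {
            long v = expr();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return v;
        }
        skip_ws();
        if (!at_digit()) throw std::runtime_error("expected number");
        long v = 0;
        while (at_digit()) v = v * 10 + (src_[pos_++] - '0');
        return v;
    }

    bool at_digit() const {
        return pos_ < src_.size() &&
               std::isdigit(static_cast<unsigned char>(src_[pos_])) != 0;
    }

    // Consumes `c` (after optional whitespace) if it is the next character.
    bool accept(char c) {
        skip_ws();
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    void skip_ws() {
        while (pos_ < src_.size() &&
               std::isspace(static_cast<unsigned char>(src_[pos_])) != 0) ++pos_;
    }

    std::string src_;
    std::size_t pos_ = 0;
};

long evaluate(const std::string& s) { return ExprParser(s).parse(); }
```

With this sketch, `evaluate("1 + 2 * 3")` yields 7 and `evaluate("(1 + 2) * 3")` yields 9. Because each rule is ordinary code, a hand-written parser of this style can consult a symbol table or look further ahead wherever the grammar alone is not enough, which is the kind of escape hatch C++ needs.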

C++ templates are no good for handling strings, even constant ones (though this may be fixed in C++0x, I haven't researched carefully), but if they were, you could pretty easily write a recursive descent parser in the C++ template language. I find that rather amusing.

answered Sep 22 '22 21:09 by Omnifarious


It sounds like you're pretty new to parsing/compiler creation. If that's the case, I'd highly recommend not starting with C++. It's a monster of a language.

Either invent a trivial toy language of your own, or model something on a much smaller and simpler language. I saw a Lua parser where the grammar definition was about a page long. That'd be a much more reasonable starting point.
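As a rough idea of how small the first step for a toy language can be, here is a hand-written tokenizer sketch doing the job a lex/flex specification would normally do. The token kinds and the names `TokKind`, `Token`, and `tokenize` are invented for this example and do not come from any real language definition.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Token categories for a made-up toy language.
enum class TokKind { Number, Ident, Symbol };

struct Token {
    TokKind kind;
    std::string text;
};

static bool is_digit(char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}
static bool is_ident_start(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0 || c == '_';
}
static bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Splits the input into numbers, identifiers, and single-character symbols.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (is_space(src[i])) { ++i; continue; }
        const std::size_t start = i;
        TokKind kind;
        if (is_digit(src[i])) {
            kind = TokKind::Number;
            while (i < src.size() && is_digit(src[i])) ++i;
        } else if (is_ident_start(src[i])) {
            kind = TokKind::Ident;
            while (i < src.size() &&
                   (is_ident_start(src[i]) || is_digit(src[i]))) ++i;
        } else {
            kind = TokKind::Symbol;  // anything else is a one-character symbol
            ++i;
        }
        out.push_back(Token{kind, src.substr(start, i - start)});
    }
    return out;
}
```

For the input `x1 = 42 + y;` this produces six tokens: `x1`, `=`, `42`, `+`, `y`, and `;`. Note that every token here can be classified without asking the parser anything, which is exactly the property C++ lacks.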

answered Sep 22 '22 21:09 by kyoryu