
What is the language of compilers? Are they written with different languages?

Are compilers written in different languages?

the0roamer asked Apr 29 '10 21:04



2 Answers

Here are some examples:

  • the Rubinius Ruby compiler is written in Ruby,
  • the YARV Ruby compiler is written in C,
  • the XRuby Ruby compiler was written in Java,
  • the Ruby.NET Ruby compiler was written in C#,
  • the MacRuby Ruby compiler was written in Objective-C,
  • the IronJS ECMAScript compiler is written in F#,
  • the MS Visual F# compiler is written in F#,
  • the MS Visual C# compiler was originally written in C++ and is now written in C#,
  • the MS Visual Basic.NET compiler was originally written in C++ and is now written in Visual Basic.NET,
  • the GCC C compiler was originally written in C and is now written in C++,
  • the Clang C compiler is written in C++,
  • most Pascal compilers are written in Pascal,
  • most Oberon compilers are written in Oberon,
  • the gc (formerly 6g/8g) Go compiler was originally written in C and is now written in Go,
  • the gccgo Go compiler frontend is written in C++.

In general, compilers can be written in any language that is powerful enough to write a compiler in. This obviously includes any Turing-complete language. But it might even be possible to write a compiler in a non-Turing-complete language. (For example, I don't see any obvious reason why a compiler couldn't be a total function, and a language that can only express total functions is not Turing-complete.)
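To make the total-function remark a bit more concrete, here is a minimal sketch. It is not taken from any real compiler: the node types `Num`, `Add`, `Mul`, the function `compile_expr` and the `PUSH`/`ADD`/`MUL` instruction names are all made up for illustration. The compiler is plain structural recursion over a finite tree, so every call works on a strictly smaller subtree and the function terminates for every valid input. Python itself is Turing-complete, of course; the point is only that this style of compiler does not need that power.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical AST for a toy expression language (illustration only).
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add, Mul]


def compile_expr(expr: Expr) -> List[str]:
    """Translate an expression tree into instructions for an imaginary stack machine.

    Each recursive call receives a strict subtree of its argument, so the
    recursion bottoms out at a Num leaf: no loops, no unbounded search.
    """
    if isinstance(expr, Num):
        return [f"PUSH {expr.value}"]
    if isinstance(expr, Add):
        return compile_expr(expr.left) + compile_expr(expr.right) + ["ADD"]
    if isinstance(expr, Mul):
        return compile_expr(expr.left) + compile_expr(expr.right) + ["MUL"]
    raise TypeError(f"not an expression node: {expr!r}")


# 1 + 2 * 3, built by hand because this sketch has no parser.
program = compile_expr(Add(Num(1), Mul(Num(2), Num(3))))
print("\n".join(program))
```

In a language that only admits this kind of recursion (a total functional language), the same definition would be accepted as-is, which is why a non-Turing-complete implementation language does not look impossible in principle.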

In practice, however, compilers are mostly written in three specific classes of languages (plus one special case), each with different pros and cons:

  1. the same language that the compiler implements (pros: a larger contributor community, because anyone who knows the language can work on the compiler without having to learn a second language; cons: the bootstrap problem)
  2. the primary low-level systems programming language of the platform the compiler is supposed to run on, e.g. C on Unix, Java on the JVM, C# on the CLI (pros: very fast; cons: oftentimes those languages are simply not very good for writing compilers, and I don't actually believe that the performance benefits are real)
  3. a language that is very good for writing compilers, like ML, Haskell, Lisp or Scheme (pros: those compilers tend to be very easy to understand and hack on; cons: you still need to know both languages)
  4. special case of the above: a domain-specific language for writing compilers, like OMeta, or, for the parsing frontend, ANTLR or YACC (pros: same as above but even more so; cons: same as above)

All of these are essentially tradeoffs. Writing the compiler in the same language makes it easier to understand, because you don't have to learn another language. It can also make it harder to understand, because the language isn't actually very good at writing compilers. (Imagine, for example, writing a SQL compiler in SQL.) It might even be impossible to write the compiler in its own language at all: for a pretty loose definition of "language" and "compiler", it is impossible to write a CSS compiler in CSS or an HTML compiler in HTML.

On the opposite side: writing the compiler in a specialized compiler-writing language probably makes it easier to understand, but at the same time it requires you to learn a new language.

Note that the three classes are not disjoint: a compiler can fall into more than one class. For example, a compiler for a specialized compiler-writing language that is written in itself falls into both category 1 (written in itself) and category 3 (written in a language good at writing compilers).

In some cases, you are actually able to hit the sweet spot. For example, F# is a native language with native speed on the CLI, and it is very good at writing compilers. So, writing the F# compiler in F# gives you #1 (writing in itself), #2 (writing in a native, fast language) and #3 (writing in a language that is good for writing compilers). The same applies to Scala.

Jörg W Mittag answered Nov 15 '22 11:11



A compiler could probably be written in any language. In its most basic form, a compiler merely converts code from one language to another. When most people use the term "compiler" today, they mean something that takes source code in some higher-level language and converts it to either assembly or some low-level intermediate language (such as CIL).
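As a rough illustration of "converts code from one language to another", here is a small sketch of a translator from Brainfuck to C. Brainfuck is chosen only because each of its eight commands maps onto one C statement; the table `BF_TO_C`, the function `translate` and the 30000-cell tape size are assumptions made for this example, and nothing checks that the brackets are balanced.

```python
# One-to-one mapping from Brainfuck commands to C statements (illustrative).
BF_TO_C = {
    ">": "++p;",
    "<": "--p;",
    "+": "++*p;",
    "-": "--*p;",
    ".": "putchar(*p);",
    ",": "*p = getchar();",
    "[": "while (*p) {",
    "]": "}",
}


def translate(source: str) -> str:
    """Return C source text equivalent to the given Brainfuck program.

    No validation is done: unbalanced brackets produce invalid C.
    """
    lines = [
        "#include <stdio.h>",
        "int main(void) {",
        "    static unsigned char tape[30000];",  # assumed tape size
        "    unsigned char *p = tape;",
    ]
    depth = 1  # current indentation level inside main()
    for ch in source:
        stmt = BF_TO_C.get(ch)
        if stmt is None:
            continue  # every other character is a comment in Brainfuck
        if ch == "]":
            depth -= 1  # the closing brace is indented like its opening "while"
        lines.append("    " * depth + stmt)
        if ch == "[":
            depth += 1
    lines += ["    return 0;", "}"]
    return "\n".join(lines)


print(translate("+[-]."))
```

The output is ordinary C text that a C compiler can take the rest of the way, which is the sense in which even a translator this small counts as a compiler in the "most basic form" described above.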

gehsekky answered Nov 15 '22 12:11
