Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the *conceptually* smallest *compiler* that can compile itself?

In the spirit of this question, I'd like to ask a similar question, but about compilers, not about interpreters.

What is the conceptually smallest compiler that can compile its own code?

When I say "conceptually smallest" I mean that it uses only very basic concepts and builds up from there, rather than that it contains very short code. An example of why this is an important distinction is OTCC a very tiny C compiler which is small because it's obfuscated, not necessarily because it is conceptually simple (it may also be conceptually simple, but I don't know; it's obfuscated).

I would also like to add that the following also might be a very conceptually small program, but it doesn't actually tell us anything about what's going on, so it's not really what I'm looking for either:

(writefile argv[2] (generate (parse (readfile argv[1]))))

What I'm really looking for is a language that is:

  1. Turing complete.
  2. Capable of compiling itself.

I'm interested in this because

  1. it would be an interesting case study and
  2. it could be useful as a starting point for bootstrapping compilers.

If it doesn't exist, I may just write it myself. :)

like image 347
Imagist Avatar asked Oct 27 '09 12:10

Imagist


People also ask

Can a compiler compile itself?

A compiler which can compile its own sources is called a self-hosting compiler. Early compilers were written in another language. For example, the first C compiler was probably written in assembler. The whole trick in using a former lower level compiler is called bootstrapping.

What does it mean for a compiler to compile itself?

The idea of a compiler that compiles its own code is vaguely similar to a quine: source code that, when executed, produces as output the original source code.

What is compiler * Your answer?

A compiler is a special program that translates a programming language's source code into machine code, bytecode or another programming language. The source code is typically written in a high-level, human-readable language such as Java or C++.


2 Answers

I'm not really clear what you mean by 'conceptually smallest'. Presumably you're not interested in minimal Turing machines or representations in Lambda calculus? If you're talking about physical compiler implementations, then you're really talking about a compiler that generates machine code instructions. TCC, as mentioned by Anthony Mills' comment, is relevant. Another interesting discussion that should have practical application is this detailed description of a bootstrapping compiler written from scratch.

There was an interesting discussion a while back on the comp.compilers newsgroup that's worth checking out.

like image 170
ire_and_curses Avatar answered Oct 20 '22 11:10

ire_and_curses


You don't say what the target machine is or whether the compiler has to exist or just be imagined.

In the world of the imagination, I would say that an adaptation of John McCarthy's metacircular LISP interpreter would come pretty close. You might also want to look at John Reynold's paper Definitional Interpreters for Higher-Order Languages which although dense is a model of simplicity.

In the world of reality, I'd bet on Chez Scheme, but unfortunately the native-code compiler is proprietary and closed source. Still, you can learn from studying the interpreter. Another system worth studying is the Oberon compiler, which was designed to be built and understood by one person, and it is very clean.

like image 40
Norman Ramsey Avatar answered Oct 20 '22 09:10

Norman Ramsey