Compilation vs translation, "compiling" Java to bytecode?

My understanding rests on these definitions:

Translation - having code in some language, generating code in some other language.

Compilation - translation to machine code.

Machine code - direct instructions for the CPU.

Now, from docs.oracle.com:

javac - Java programming language compiler

Compiler...? I think it is a Java translator, because it generates code that is not machine code. Bytecode needs an interpreter (the JVM) to run, so it's definitely not machine code.

From Wikipedia:

Java applications are typically compiled to bytecode

Similarly, according to my definitions, I would say that Java is translated to bytecode. There are many more examples like this on the Internet, so I think either there is general confusion about the terms or I'm just missing something.

Could you please clarify this? What is the difference between translation and compilation?

Adam Stelmaszczyk asked May 18 '13 09:05


4 Answers

It's all a matter of definitions, and there's no single accepted definition of what "compilation" means. In your eyes, compilation is transforming source code in some language to native code, so a transformation process which doesn't generate machine code shouldn't be called "compilation". In my eyes (and apparently, the javac documentation writers' eyes as well), it should.

There are actually a lot of different terms: translation, compilation, decompilation, assembly, disassembly, and more.

Personally, I'd say it makes sense to group all of these terms under "compilation", because all these processes have a lot in common:

  • They transform code in one formal language to code in another formal language.
  • They try to preserve the semantics of the input code as much as possible.
  • They all have a very similar design, with a front-end, a back-end, and possibly an optimizer in the middle. I've seen the entrails of both javac and native compilers and they are relatively similar.
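
To make the front-end/back-end split concrete, here is a minimal sketch of a toy compiler. Everything in it is an assumption made up for illustration: the class name TinyCompiler, the tiny expression language (non-negative integers, + and *) and the PUSH/ADD/MUL target instructions. None of it is taken from javac or any real compiler, and there is no optimizer stage.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyCompiler {
    // Intermediate representation: a small expression tree (hypothetical, for illustration).
    interface Node {}

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
    }

    static final class Bin implements Node {
        final char op;
        final Node left, right;
        Bin(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
    }

    // Front end: recursive-descent parser for non-negative integers, '+' and '*'.
    static final class Parser {
        private final String src;
        private int pos = 0;

        Parser(String src) { this.src = src.replace(" ", ""); }

        Node parse() {
            Node n = expr();
            if (pos != src.length()) throw new IllegalArgumentException("unexpected input at " + pos);
            return n;
        }

        // expr := term ('+' term)*
        private Node expr() {
            Node left = term();
            while (peek() == '+') { pos++; left = new Bin('+', left, term()); }
            return left;
        }

        // term := number ('*' number)*
        private Node term() {
            Node left = number();
            while (peek() == '*') { pos++; left = new Bin('*', left, number()); }
            return left;
        }

        private Node number() {
            int start = pos;
            while (Character.isDigit(peek())) pos++;
            if (start == pos) throw new IllegalArgumentException("expected a number at " + pos);
            return new Num(Integer.parseInt(src.substring(start, pos)));
        }

        private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }
    }

    // Back end: walk the tree in post-order and emit instructions for a made-up stack machine.
    static void emit(Node n, List<String> out) {
        if (n instanceof Num) {
            out.add("PUSH " + ((Num) n).value);
        } else {
            Bin b = (Bin) n;
            emit(b.left, out);
            emit(b.right, out);
            out.add(b.op == '+' ? "ADD" : "MUL");
        }
    }

    public static List<String> compile(String source) {
        List<String> out = new ArrayList<>();
        emit(new Parser(source).parse(), out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(compile("1 + 2 * 3")); // [PUSH 1, PUSH 2, PUSH 3, MUL, ADD]
    }
}
```

Swapping the emit method for one that prints x86 assembly, or text in another high-level language, would leave the parser untouched. That is roughly why these tools all end up looking alike regardless of what they target.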

In addition, your definition of "produces native code" is problematic:

  • What about compilers that can generate assembly but don't bother transforming that to machine code, leaving this to an external program (commonly called "assembler")? Would you deny them this definition of "compilers" because of that last, insignificant-in-comparison step?
  • How do you even classify "machine code"? What if tomorrow a processor which can run Java Bytecode natively is created?

But these are just my opinions. I think that out there, the most accepted definitions are:

  • Compilation is transforming code in a higher-level language to a lower-level one. Examples: Java to Java Bytecode, or C to x86 machine code.
  • Decompilation is transforming code in a lower-level language to a higher-level one - in effect, the opposite of compilation. Examples: Java Bytecode to Java.
  • Translation or source-to-source compilation is transforming code in some language to another language of comparable "level". Examples: ARM to x86, or C to Java. When the two languages are actually different versions of the same language (e.g. JavaScript ES6 to ES5), the term transpiler is also used.
  • Assembly is transforming code in some assembly language to machine code.
  • Disassembly is either a synonym for decompilation or the opposite of assembly, depending on the context.

Under these definitions, javac can definitely be considered a compiler. But again, it's all in the definitions: from a technical standpoint, many of these actions have a lot in common.

Oak answered Oct 17 '22 22:10


The result of javac is machine code. The fact that the machine is virtual and not physical is not relevant (otherwise, you could argue that compiling code to x86 was translation if you were a Mac user, since the x86 code is not Mac machine code).
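
As a rough illustration of "the machine is virtual", here is a minimal sketch of a stack machine written in software. The class name MiniVm and the fixed 16-slot stack are assumptions for this example. The four opcode values (bipush, iadd, imul, ireturn) are the ones the JVM specification assigns, but a real JVM also deals with frames, local variables, the constant pool, verification and so on, none of which is modelled here.

```java
public class MiniVm {
    // Opcode values as assigned by the JVM specification.
    static final int BIPUSH = 0x10;   // push the following signed byte as an int
    static final int IADD = 0x60;     // pop two ints, push their sum
    static final int IMUL = 0x68;     // pop two ints, push their product
    static final int IRETURN = 0xAC;  // pop an int and return it

    public static int run(byte[] code) {
        int[] stack = new int[16]; // operand stack (size is an arbitrary assumption)
        int sp = 0;                // index of the next free stack slot
        int pc = 0;                // program counter
        while (pc < code.length) {
            int opcode = code[pc++] & 0xFF;
            switch (opcode) {
                case BIPUSH:
                    stack[sp++] = code[pc++]; // operand byte, sign-extended to int
                    break;
                case IADD: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case IMUL: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case IRETURN:
                    return stack[--sp];
                default:
                    throw new IllegalStateException("unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
        throw new IllegalStateException("fell off the end of the code");
    }

    public static void main(String[] args) {
        // Hand-assembled: bipush 1, bipush 2, bipush 3, imul, iadd, ireturn
        byte[] code = { 0x10, 1, 0x10, 2, 0x10, 3, 0x68, 0x60, (byte) 0xAC };
        System.out.println(run(code)); // 7
    }
}
```

The same bytes could in principle be executed by a fetch-decode-execute loop in hardware instead of this while loop, and that is the sense in which the code is "machine code" for some machine.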

SJuan76 answered Oct 17 '22 22:10


"A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code)."

http://en.wikipedia.org/wiki/Compiler

So no, compilation doesn't mean that the output is machine code. For example, early C++ compilers (such as Cfront) would generate a C program, which then needed to be compiled again into machine code. Of course any good compiler would hide these separate steps from the user, but they are still there.

Nowadays, I know of at least the nesC compiler, which follows the same procedure.

A machine that runs JVM bytecode can be built. In fact, some chapters of Structured Computer Organization by A. Tanenbaum describe how to do that.

http://www.amazon.com/Structured-Computer-Organization-5th-Edition/dp/0131485210

LtWorf answered Oct 17 '22 21:10


Compiler...? I think it is a Java translator, because it generates code that is not machine code. Bytecode needs an interpreter (the JVM) to run, so it's definitely not machine code.

The JVM is the Java Virtual Machine, which is a machine, and its machine code is called Java byte code. (It's called "byte" code because each JVM instruction's opcode is a single byte; some instructions are followed by operand bytes.)
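
To show what that encoding looks like, here is a minimal sketch of a disassembler for a handful of instructions. The class name BytecodeDump and the choice of five opcodes are assumptions for this example. The opcode values themselves come from the JVM specification, and the byte sequence in main is hand-written, not the output of javac.

```java
import java.util.ArrayList;
import java.util.List;

public class BytecodeDump {
    // Decodes a few JVM instructions; each starts with a one-byte opcode.
    public static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x04: lines.add(pc + ": iconst_1"); pc += 1; break;
                case 0x05: lines.add(pc + ": iconst_2"); pc += 1; break;
                case 0x10: // bipush takes one operand byte, so it is two bytes long
                    lines.add(pc + ": bipush " + code[pc + 1]);
                    pc += 2;
                    break;
                case 0x60: lines.add(pc + ": iadd"); pc += 1; break;
                case 0xAC: lines.add(pc + ": ireturn"); pc += 1; break;
                default:
                    throw new IllegalArgumentException("opcode not covered by this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] code = { 0x04, 0x10, 100, 0x60, (byte) 0xAC };
        for (String line : disassemble(code)) {
            System.out.println(line);
        }
        // 0: iconst_1
        // 1: bipush 100
        // 3: iadd
        // 4: ireturn
    }
}
```

The offsets jump from 1 to 3 because bipush occupies two bytes. For real class files, the javap -c tool that ships with the JDK prints this kind of listing.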

You can learn more by reading the JVM specification. Start here: https://docs.oracle.com/javase/specs/

Also, the JVM is not an interpreter; it is a definition of a machine. Some implementations of the JVM include just-in-time (JIT) compilers, ahead-of-time (AOT) compilers, adaptive compilers (e.g. HotSpot), and yes, even interpreters (although I haven't personally seen a Java interpreter in 20 years now).

What most people starting out in compilers don't realize is that hardly any compiler today emits bare "machine code" that can run on its own. They all generate some intermediate form (an object or executable file format defined by a particular OS, for example) that is then loaded and munged by the OS.

For the most part today, no compiled program is capable of being executed without a large, bloated OS to munge it and slice it and glue it into pieces that the OS owns and manages.

Not even C compiles code into actual machine executables nowadays without coloring outside the lines. (Write an actual, working boot loader from scratch, and get back to me.)

cpurdy answered Oct 17 '22 21:10