
Why are Virtual Machines necessary?

I was reading this question to find out the differences between the Java Virtual Machine and the .NET CLR and Benji's answer got me wondering why Virtual Machines are necessary in the first place.

From my understanding of Benji's explanation, the JIT compiler of a Virtual Machine translates the intermediate code into the actual machine code that runs on the CPU. The reason it has to do this is because CPUs often have different numbers of registers and, according to Benji, "some registers are special-purpose, and each instruction expects its operands in different registers." It makes sense, then, that there is a need for an intermediary like the Virtual Machine so that the same code can be run on any CPU.
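
(To make "intermediate code" concrete, here is a tiny Java method and, roughly, the bytecode that javap -c prints for it - the listing is approximate, and the JIT's job is to map these stack operations onto whatever registers and opcodes the actual CPU provides.)

    public class Add {
        // The Java compiler turns this into CPU-neutral bytecode;
        // the VM's JIT later turns that bytecode into native instructions.
        static int add(int a, int b) {
            return a + b;
        }

        public static void main(String[] args) {
            System.out.println(add(2, 3)); // prints 5
        }

        /* Approximate output of "javap -c Add" for add(int, int):
         *   iload_0   // push the first int argument onto the operand stack
         *   iload_1   // push the second int argument
         *   iadd      // pop both, push their sum
         *   ireturn   // return the int on top of the stack
         * There are no registers here at all; the JIT decides which real
         * registers to use on whatever CPU it happens to be running on.
         */
    }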

But, if that's the case, then what I don't understand is why C or C++ code compiled into machine code is able to run on any computer as long as it is running the correct OS. Why, then, would a C program I compiled on my Windows machine with a Pentium be able to run on my other Windows machine with an AMD processor?

If C code can run on any CPU, then what is the purpose of the Virtual Machine? Is it so that the same code can be run on any OS? I know Java has VM versions on pretty much any OS, but is there a CLR for OSes besides Windows?

Or is there something else I'm missing? Does the OS do some other interpretation of assembly code it runs to adapt it to the particular CPU or something?

I'm quite curious about how all this works, so a clear explanation would be greatly appreciated.

Note: The reason I didn't just post my queries as comments in the JVM vs. CLR question is because I don't have enough points to post comments yet =b.

Edit: Thanks for all the great answers! So it seems what I was missing was that although all processors have differences, there is a common standard, primarily the x86 architecture, which provides a large enough set of common features that C code compiled on one x86 processor will, for the most part, work on another x86 processor. Virtual Machines are still justified for crossing the remaining architecture and OS boundaries, not to mention that I had forgotten about the importance of garbage collection.

asked Jan 31 '09 by Daniel

2 Answers

AMD and Intel processors use the same instruction set and machine architecture (from the standpoint of executing machine code).

C and C++ compilers compile to machine code, with headers appropriate to the OS they are targeted at. Once compiled, they cease to associate in any way, shape, or form with the language they were compiled from and are merely binary executables. (There are artifacts that may show what language they were compiled from, but that isn't the point here.)

So once compiled, they are tied to the machine (x86, the Intel and AMD instruction set and architecture) and to the OS.

This is why they can run on any compatible x86 machine, and any compatible OS (Windows 95 through Windows Vista, for some software).

However, they cannot run on an OS X machine, even if it's running on an Intel processor - the binary isn't compatible unless you run additional emulation software (such as Parallels, or a VM running Windows).

Beyond that, if you want to run them on an ARM processor, or MIPS, or PowerPC, then you have to run a full machine instruction set emulator that interprets the binary machine code from x86 into whatever machine you're running it on.
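
To make the "emulator" idea concrete, here is a deliberately tiny sketch of the fetch-decode-execute loop such an emulator or VM runs. The instruction set below (PUSH/ADD/PRINT/HALT) is made up purely for illustration - it is not x86, JVM bytecode, or .NET IL.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // A toy interpreter for an invented stack machine, illustrating the
    // fetch-decode-execute loop an emulator or virtual machine performs.
    public class ToyVm {
        static final int PUSH  = 1; // PUSH <value>: push a constant
        static final int ADD   = 2; // pop two values, push their sum
        static final int PRINT = 3; // pop a value and print it
        static final int HALT  = 4; // stop execution

        public static void run(int[] program) {
            Deque<Integer> stack = new ArrayDeque<>();
            int pc = 0; // "program counter" into the toy program
            while (true) {
                int opcode = program[pc++];                       // fetch
                switch (opcode) {                                 // decode
                    case PUSH  -> stack.push(program[pc++]);      // execute
                    case ADD   -> stack.push(stack.pop() + stack.pop());
                    case PRINT -> System.out.println(stack.pop());
                    case HALT  -> { return; }
                    default    -> throw new IllegalStateException("bad opcode " + opcode);
                }
            }
        }

        public static void main(String[] args) {
            // Equivalent to: print(2 + 3)
            run(new int[] { PUSH, 2, PUSH, 3, ADD, PRINT, HALT });
        }
    }

A real emulator or JIT does the same thing at a vastly larger scale (and usually compiles hot paths to native code instead of interpreting one instruction at a time).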

Contrast that with .NET.

The .NET virtual machine is fabricated as though there were much better processors out in the world - processors that understand objects, memory allocation and garbage collection, and other high-level constructs. It's a very complex machine that can't be built directly in silicon today with good performance, but an emulator can be written that will allow it to run on any existing processor.
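
One way to see this "better processor" idea is that the VM's instruction set operates directly on objects. Here is a trivial Java snippet and, approximately, the bytecode javap -c would show for it (Java/JVM shown here as a stand-in; the .NET CLR's IL looks very similar, with instructions like newobj and callvirt). Allocation and method dispatch are single instructions, and there is no "free" instruction anywhere - the machine itself collects garbage.

    import java.util.ArrayList;
    import java.util.List;

    public class ObjectCode {
        // The VM's "instruction set" works in terms of objects: allocation,
        // constructor calls, and dynamic dispatch are each one instruction,
        // and nothing ever frees the list explicitly.
        static List<String> makeList() {
            List<String> names = new ArrayList<>();
            names.add("hello");
            return names;
        }

        /* Approximate javap -c output for makeList():
         *   new             #ArrayList          // allocate an object
         *   dup
         *   invokespecial   #ArrayList.<init>   // run the constructor
         *   astore_0
         *   aload_0
         *   ldc             #"hello"
         *   invokeinterface #List.add           // dynamic dispatch is one instruction
         *   pop                                 // discard add()'s boolean result
         *   aload_0
         *   areturn
         * No "free" instruction appears anywhere: the garbage collector
         * reclaims the list once it becomes unreachable.
         */
    }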

Suddenly you can write one machine specific emulator for any processor you want to run .NET on, and then ANY .NET program can run on it. No need to worry about the OS or the underlying CPU architecture - if there's a .NET VM, then the software will run.

But let's go a bit further - once you have this common language, why not make compilers that convert any other written language into it?

So now you can have a C, C#, C++, Java, JavaScript, Basic, Python, Lua, or any other language compiler that converts written code so it'll run on this virtual machine.

You've disassociated the machine from the language by two degrees, and with not too much work you enable anyone to write any code and have it run on any machine, as long as a compiler and a VM exist to map the two degrees of separation.

If you're still wondering why this is a good thing, consider early DOS machines, and what Microsoft's real contribution to the world was:

AutoCAD had to write drivers for each printer it could print to. So did Lotus 1-2-3. In fact, if you wanted your software to print, you had to write your own drivers. If there were 10 printers and 10 programs, then 100 different pieces of essentially the same code had to be written separately and independently.

What Windows 3.1 tried to accomplish (along with GEM and so many other abstraction layers) was to make it so that the printer manufacturer wrote one driver for their printer, and the programmer wrote their printing code once, against the Windows printer class.

Now with 10 programs and 10 printers, only 20 pieces of code have to be written, and since the Microsoft side of the code was the same for everyone, examples from MS meant that you had very little work to do.

Now a program wasn't restricted to just the 10 printers its developers chose to support, but could use any printer whose manufacturer provided a driver for Windows.
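
In code terms, the trick is the same one an interface gives you: the application is written once against an abstract printer, and each manufacturer supplies one implementation. A rough sketch (all names here are invented for illustration):

    // Hypothetical sketch of the "driver model" idea in Java.
    interface Printer {
        void printPage(String text);
    }

    // One driver per printer, written by the manufacturer.
    class LaserJetDriver implements Printer {
        public void printPage(String text) {
            System.out.println("[LaserJet] " + text); // pretend hardware-specific protocol
        }
    }

    class DotMatrixDriver implements Printer {
        public void printPage(String text) {
            System.out.println("[DotMatrix] " + text);
        }
    }

    // One piece of printing code per application, written by the programmer.
    class WordProcessor {
        void printDocument(Printer printer, String document) {
            printer.printPage(document);
        }
    }

    public class DriverDemo {
        public static void main(String[] args) {
            WordProcessor app = new WordProcessor();
            app.printDocument(new LaserJetDriver(), "Hello");
            app.printDocument(new DotMatrixDriver(), "Hello");
        }
    }

With 10 applications and 10 printers, that is 10 implementations of Printer plus 10 applications - the 20 pieces of code from above, instead of 100.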

The same issue is occurring in application development. There are really neat applications I can't use because I don't use a Mac. There is a ton of duplication (how many world-class word processors do we really need?).

Java was meant to fix this, but it had many limitations, some of which aren't really solved.

.NET is closer, but no one is developing world-class VMs for platforms other than Windows (Mono is so close... and yet not quite there).

So... That's why we need VMs. Because I don't want to limit myself to a smaller audience simply because they chose an OS/machine combination different from my own.

-Adam

answered by Adam Davis


Your assumption that C code can run on any processor is incorrect. There are things like registers and endianness which will make a compiled C program not work at all on one platform while it works fine on another.
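
Endianness alone is enough to see why: the same four bytes decode to different integers depending on byte order, and a native binary bakes one of those assumptions in. A small Java sketch using java.nio.ByteBuffer (here the VM and its libraries are what let you abstract over the difference):

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    // The same four bytes interpreted under the two common byte orders.
    public class Endianness {
        public static void main(String[] args) {
            byte[] bytes = { 0x01, 0x02, 0x03, 0x04 };

            int big    = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getInt();
            int little = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();

            System.out.printf("big-endian:    0x%08X%n", big);    // 0x01020304
            System.out.printf("little-endian: 0x%08X%n", little); // 0x04030201
        }
    }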

However, there are certain similarities that processors share, for example, Intel x86 processors and AMD processors share a large enough set of properties that most code compiled against one will run on the other. However, if you want to use processor-specific properties, then you need a compiler or set of libraries which will do that for you.

As for why you would want a virtual machine, beyond the fact that it handles differences between processors for you, there is also the fact that virtual machines offer services to code that are not available to unmanaged programs compiled from C++ today.

The most prominent service offered is garbage collection, offered by the CLR and the JVM. Both of these virtual machines offer you this service for free. They manage the memory for you.

Things like bounds checking are also offered, and access violations, while still possible, are made extremely difficult.
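
A minimal sketch of both services in Java: memory is reclaimed without an explicit free, and an out-of-range index raises an exception instead of silently reading whatever happens to be at that address.

    // Two services the VM gives you "for free": garbage collection
    // and bounds checking.
    public class VmServices {
        public static void main(String[] args) {
            // Garbage collection: allocate freely and never call free/delete.
            // Each chunk becomes unreachable after its iteration, and the VM's
            // GC reclaims it at a time of its own choosing.
            for (int i = 0; i < 1_000_000; i++) {
                byte[] chunk = new byte[1024];
            }

            // Bounds checking: an out-of-range index is caught by the VM and
            // turned into an exception rather than a silent memory error.
            int[] values = new int[4];
            try {
                int oops = values[10];
            } catch (ArrayIndexOutOfBoundsException e) {
                System.out.println("Caught: " + e);
            }
        }
    }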

The CLR also offers a form of code security for you.

None of these are offered as part of the basic runtime environment for a number of other languages which don't operate with a virtual machine.

You might get some of them by using libraries, but then that forces you into a particular pattern of use with the library, whereas in .NET and Java the services offered to you through the CLR and JVM are consistent in how you access them.

answered by casperOne