
What is a VM and why do dynamic languages need one?

So, for example, Python and Java have a VM, C and Haskell do not. (Correct me if I'm wrong)

Thinking about what languages on both sides of the line have, I can't find the reason. Java is static in a lot of ways, while Haskell provides a lot of dynamic features.

Pepijn asked Jan 09 '11 18:01



2 Answers

It's nothing to do with static vs. dynamic.

Rather, it's about becoming independent from the underlying hardware platform ("build once, run everywhere" - in theory...)

Actually, it's nothing to do with the language, either. One could write a C compiler that generates bytecode for the JVM. One could write a Java compiler that generates x86 machine code.

Oliver Charlesworth answered Sep 20 '22 09:09


Let's forget about VMs for a sec (we'll get back to those below, I promise), and start with this important fact:

C doesn't have garbage collection.

For a language to provide garbage collection, there has to be some sort of "runtime"/runtime-environment/thing that will perform it.

That's why Python, Java, and Haskell require a "runtime", while C, which has no garbage collection, can compile straightforwardly to native code.

Note that psyco was a Python optimizer that compiled Python code to machine code. However, a lot of that machine code consisted of calls to the CPython runtime's functions, such as PyImport_AddModule, PyImport_GetModuleDict, etc.

Haskell/GHC is in a similar boat to psyco-compiled Python. Ints are added with simple machine instructions, but more complicated operations, such as those that allocate objects, invoke the runtime.

What else?

C doesn't have "exceptions"

If we were to add exceptions to C, our generated machine code would need to do some stuff for every function and for every function call.

If we then add "closures" as well, there would be more stuff added.

Now, instead of having this boilerplate machine code repeated in every function, we could make it instead call a subprocedure to do the necessary stuff, something like PyErr_Occurred.

So now, basically every original source line maps to some calls to some functions and a smaller unique part.

But as long as we're doing so much stuff per original source code line, why even bother with machine code?

Here's an idea (btw let's call this idea a "Virtual Machine").

Let's represent your Python code, which is for example:

    def has_no_letters(text):
        return text.upper() == text.lower()

As an in-memory data-structure, for example:

    {
        'func_name': 'has_no_letters',
        'num_args': 1,
        'kwargs': [],
        'codez': [
            ('get_attr', 'tmp_a', 'arg_0', 'upper'),  # tmp_a = arg_0.upper
            ('func_call', 'tmp_b', 'tmp_a', []),  # tmp_b = tmp_a(), i.e. arg_0.upper()
            ('get_attr', 'tmp_c', 'arg_0', 'lower'),
            ('func_call', 'tmp_d', 'tmp_c', []),
            ('get_global', 'tmp_e', '=='),
            ('func_call', 'tmp_f', 'tmp_e', ['tmp_b', 'tmp_d']),
            ('return', 'tmp_f'),
        ]
    }

Now, let's write an interpreter that executes this in-memory data structure.
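A minimal sketch of such an interpreter, assuming the instruction tuples mean what their comments suggest. The names run, env and GLOBALS are made up for this illustration, and mapping '==' to operator.eq is an assumption about what the "global" lookup would return:

```python
import operator

# The in-memory program from above, repeated so this sketch runs on its own.
program = {
    'func_name': 'has_no_letters',
    'num_args': 1,
    'kwargs': [],
    'codez': [
        ('get_attr', 'tmp_a', 'arg_0', 'upper'),
        ('func_call', 'tmp_b', 'tmp_a', []),
        ('get_attr', 'tmp_c', 'arg_0', 'lower'),
        ('func_call', 'tmp_d', 'tmp_c', []),
        ('get_global', 'tmp_e', '=='),
        ('func_call', 'tmp_f', 'tmp_e', ['tmp_b', 'tmp_d']),
        ('return', 'tmp_f'),
    ],
}

# Assumption: "globals" is a plain table, and '==' resolves to a callable.
GLOBALS = {'==': operator.eq}


def run(func, *args):
    """Execute one function description with the given positional arguments."""
    if len(args) != func['num_args']:
        raise TypeError('%s expects %d argument(s)'
                        % (func['func_name'], func['num_args']))
    # Arguments and temporaries live in one dictionary of named slots.
    env = {'arg_%d' % i: value for i, value in enumerate(args)}
    for instr in func['codez']:
        op = instr[0]
        if op == 'get_attr':
            _, dest, src, name = instr
            env[dest] = getattr(env[src], name)
        elif op == 'func_call':
            _, dest, callee, arg_names = instr
            env[dest] = env[callee](*[env[n] for n in arg_names])
        elif op == 'get_global':
            _, dest, name = instr
            env[dest] = GLOBALS[name]
        elif op == 'return':
            return env[instr[1]]
        else:
            raise ValueError('unknown instruction: %r' % (op,))
    return None  # fell off the end without a 'return' instruction
```

Calling run(program, '1234') goes through the seven instructions in order and hands back the value stored in tmp_f. Note that the loop never looks at source text: it only dispatches on the first element of each tuple.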

Let's discuss the benefits of this over direct-from-text-interpreters, and then the benefits over compiling to machine code.

The benefits of VMs over direct-from-text-interpreters

  • The VM system gives you all the syntax errors before executing the code.
  • When evaluating a loop, a VM system doesn't re-parse the source code on every iteration.
    • This makes the VM faster than the direct-from-text interpreter.
    • A direct interpreter also runs slower with long variable names and faster with short ones, which encourages people to write crappy mathematician-style code such as wt(f, d(o, e), s) <= th(i, s) + cr(a, p * d + o)
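Both points can be seen in CPython itself, which compiles source text to a code object before running anything. The following is only a rough demonstration using the built-in compile() and exec(); the variable names (executed, bad_source, loop_body, ns) are made up for this example:

```python
executed = []

# Line 1 is valid and has a visible side effect; line 2 has a syntax error.
bad_source = "executed.append('line 1 ran')\nif True print('oops')\n"

try:
    code = compile(bad_source, '<demo>', 'exec')
except SyntaxError as err:
    caught = err  # reported before any line of bad_source has run
else:
    caught = None
    exec(code, {'executed': executed})

# Parse the loop body once, then execute the compiled form repeatedly.
loop_body = compile('total += i', '<loop>', 'exec')
ns = {'total': 0}
for i in range(5):
    ns['i'] = i
    exec(loop_body, ns)  # no re-parsing of the text 'total += i' here
```

After this runs, executed is still empty, because the syntax error on line 2 was found at compile time, before line 1 had a chance to execute.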

The benefits of VMs over compiling to machine code

  • The in-memory data structure describing the program, or the "VM code", will probably be much more compact than boilerplate-full machine code which does the same stuff again and again for every original line of code. This can make the VM system run faster, because fewer "instructions" need to be fetched from memory.
  • Creating a VM is much simpler than creating a compiler to machine code. You can probably do this now without even knowing any assembly/machine-code.
yairchu answered Sep 20 '22 09:09