
What is the design rationale behind HandleScope?

Tags:

v8

V8 requires a HandleScope to be declared in order to clean up any Local handles that were created within its scope. I understand that HandleScope releases these handles so that the objects can be garbage collected, but I'm interested in why each Local doesn't release its own reference, the way most internal ref_ptr-style helpers do.

My thought is that HandleScope can do it more efficiently by dumping a large number of handles all at once, rather than one by one as a ref_ptr-style scoped class would.

aughey asked Mar 01 '12 04:03



2 Answers

Here is how I understand the documentation and the handles-inl.h source code. I, too, might be completely wrong since I'm not a V8 developer and documentation is scarce.

The garbage collector will, at times, move stuff from one memory location to another and, during one such sweep, also check which objects are still reachable and which are not. In contrast to reference-counting types like std::shared_ptr, this is able to detect and collect cyclic data structures. For all of this to work, V8 has to have a good idea about what objects are reachable.

On the other hand, objects are created and deleted quite a lot during the internals of some computation. You don't want too much overhead for each such operation. The way to achieve this is by creating a stack of handles. Each object listed in that stack is available from some handle in some C++ computation. In addition to this, there are persistent handles, which presumably take more work to set up and which can survive beyond C++ computations.
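To make the "stack of handles" idea concrete, here is a minimal toy sketch. It is not V8's implementation, and every name in it (HandleStack, ToyHandleScope, HeightInsideScope) is made up for illustration. It only shows why the approach is cheap: creating a handle is one push, and closing a scope drops everything above a remembered mark in a single step, no matter how many handles were created.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: NOT V8's implementation. All names here are hypothetical.
struct Object { int id; };

// One shared stack of object slots; a collector would treat it as a root list.
class HandleStack {
public:
    // Creating a local handle is a single push.
    std::size_t Push(Object* obj) {
        slots_.push_back(obj);
        return slots_.size() - 1;
    }
    std::size_t Size() const { return slots_.size(); }
    // Dropping every handle above `mark` is one resize, however many there are.
    void TruncateTo(std::size_t mark) { slots_.resize(mark); }
private:
    std::vector<Object*> slots_;
};

// RAII scope: remembers the stack height on entry and restores it on exit.
class ToyHandleScope {
public:
    explicit ToyHandleScope(HandleStack& stack)
        : stack_(stack), mark_(stack.Size()) {}
    ~ToyHandleScope() {
        // Scopes must be closed in strict stack order for this to hold.
        assert(stack_.Size() >= mark_);
        stack_.TruncateTo(mark_);
    }
    ToyHandleScope(const ToyHandleScope&) = delete;
    ToyHandleScope& operator=(const ToyHandleScope&) = delete;
private:
    HandleStack& stack_;
    std::size_t mark_;
};

// Pushes `count` handles inside a scope and reports the stack height while
// the scope is still open. On return the stack is back at its old height.
std::size_t HeightInsideScope(HandleStack& stack, Object* obj, std::size_t count) {
    ToyHandleScope scope(stack);
    for (std::size_t i = 0; i < count; ++i) stack.Push(obj);
    return stack.Size();
}
```

Note that an individual handle has no destructor work to do in this model; only the scope does, which is the batching the question guesses at.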

Having a stack of references requires that you use it in a stack-like way. There is no “invalid” mark in that stack. All the objects from bottom to top of the stack are valid object references. The way to ensure this is the HandleScope. It keeps things hierarchical. With reference-counted pointers you can do something like this:

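#include <memory>
struct Object { explicit Object(int id) : id(id) {} int id; };

std::shared_ptr<Object> f() {
    auto a = std::make_shared<Object>(1);  // object 1: local to f()
    auto b = std::make_shared<Object>(2);  // object 2: handed to the caller
    return b;                              // object 1 is destroyed here
}
void g() {
    std::shared_ptr<Object> c = f();       // object 2 is destroyed when g() returns
}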

Here object 1 is created first, then object 2 is created, then the function returns and object 1 is destroyed, and only later is object 2 destroyed. The key point is that there is a point in time when object 1 is invalid but object 2 is still valid. That's what HandleScope aims to avoid.

Some other GC implementations scan the C stack for anything that looks like a pointer. This has a good chance of false positives, since a value which is in fact plain data could be misinterpreted as a pointer. For reachability this might seem rather harmless, but when pointers are rewritten because objects are being moved, it can be fatal. It has a number of other drawbacks, and relies a lot on how the low-level implementation of the language actually works. V8 avoids that by keeping the handle stack separate from the function call stack, while at the same time ensuring that the two stay sufficiently in step to guarantee the hierarchy requirements mentioned above.

To offer yet another comparison: an object referenced by just one shared_ptr becomes collectible (and actually will be collected) once its C++ block scope ends. An object referenced by a v8::Handle becomes collectible when control leaves the nearest enclosing scope that contains a HandleScope object. So programmers have more control over the granularity of stack operations. In a tight loop where performance is important, it might be useful to maintain just a single HandleScope for the whole computation, so that you don't have to access the handle stack data structure so often. On the other hand, doing so keeps all the objects around for the whole duration of the computation, which would be very bad indeed if this were a loop iterating over many values, since all of them would be kept alive until the end. But the programmer has full control, and can arrange things in the most appropriate way.
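The trade-off in the loop case can be sketched with a toy counter. This is again a made-up model rather than V8 code: HandleCounter, ToyScope and the two Peak... functions are hypothetical names, and a plain counter stands in for the handle stack. It compares the peak number of live handles with one scope around the whole loop against one scope per iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy model only: a counter stands in for V8's handle stack.
// All names are hypothetical.
struct HandleCounter {
    std::size_t live = 0;  // handles currently on the stack
    std::size_t peak = 0;  // highest value `live` ever reached
    void Create() { ++live; peak = std::max(peak, live); }
};

class ToyScope {
public:
    explicit ToyScope(HandleCounter& c) : counter_(c), mark_(c.live) {}
    ~ToyScope() {
        assert(counter_.live >= mark_);
        counter_.live = mark_;  // release everything created since entry
    }
    ToyScope(const ToyScope&) = delete;
    ToyScope& operator=(const ToyScope&) = delete;
private:
    HandleCounter& counter_;
    std::size_t mark_;
};

// One scope around the whole loop: every temporary stays alive until the end.
std::size_t PeakWithSingleScope(std::size_t iterations, std::size_t per_iteration) {
    HandleCounter counter;
    ToyScope outer(counter);
    for (std::size_t i = 0; i < iterations; ++i)
        for (std::size_t j = 0; j < per_iteration; ++j) counter.Create();
    return counter.peak;
}

// One scope per iteration: only the current iteration's temporaries are alive.
std::size_t PeakWithScopePerIteration(std::size_t iterations, std::size_t per_iteration) {
    HandleCounter counter;
    ToyScope outer(counter);
    for (std::size_t i = 0; i < iterations; ++i) {
        ToyScope inner(counter);
        for (std::size_t j = 0; j < per_iteration; ++j) counter.Create();
    }
    return counter.peak;
}
```

With 1000 iterations creating 3 handles each, the single-scope version peaks at 3000 live handles and the per-iteration version at 3. The price of the second version is one extra scope open and close per iteration.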

Personally, I'd make sure to construct a HandleScope:

  • At the beginning of every function which might be called from outside your code. This ensures that your code will clean up after itself.
  • In the body of every loop which might see more than three or so iterations, so that you only keep variables from the current iteration.
  • Around every block of code which is followed by some callback invocation, since this ensures that your stuff can get cleaned if the callback requires more memory.
  • Whenever I feel that something might produce considerable amounts of intermediate data which should get cleaned (or at least become collectible) as soon as possible.

In general I'd not create a HandleScope for every internal function if I can be sure that every other function calling this will already have set up a HandleScope. But that's probably a matter of taste.

MvG answered Dec 17 '22 14:12



Disclaimer: This may not be an official answer, more of a conjecture on my part, but the V8 documentation is hardly useful on this topic, so I may be proven wrong.

From my experience developing various V8-based backend applications, it's a means of handling the differences between the C++ and JavaScript environments.

Imagine the following sequence, in which a self-dereferencing pointer could break the system:

  1. JavaScript calls a C++ function wrapped for V8; let's say helloWorld().
  2. The C++ function creates a v8::Handle holding the value "hello world =x".
  3. The C++ function returns the value to the V8 virtual machine.
  4. The C++ function does its usual cleanup of resources, including releasing its handles.
  5. Another C++ function or process overwrites the freed memory.
  6. V8 reads the handle, and the data is no longer the same: "hell!@(#...".
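For what it's worth, V8's public API has a dedicated way to hand one value out of a scope that is about to close (HandleScope::Close in older releases, EscapableHandleScope::Escape in later ones). The following is a toy sketch of that idea only, not V8's code: ToyEscapableScope and this HelloWorld are hypothetical, and an int stands in for an object pointer. As far as I can tell the principle is that a slot belonging to the parent scope is reserved before the inner scope opens, so the returned value survives when the temporaries are dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only; names are hypothetical and this is not V8's code.
using Slot = int;                       // stands in for a pointer to a heap object
using HandleStack = std::vector<Slot>;

// A scope that can hand exactly one value back to the enclosing scope.
class ToyEscapableScope {
public:
    explicit ToyEscapableScope(HandleStack& stack) : stack_(stack) {
        stack_.push_back(0);            // reserve a slot that belongs to the parent
        escape_index_ = stack_.size() - 1;
        mark_ = stack_.size();          // everything above this dies with the scope
    }
    ~ToyEscapableScope() {
        assert(stack_.size() >= mark_);
        stack_.resize(mark_);           // drop temporaries, keep the reserved slot
    }
    ToyEscapableScope(const ToyEscapableScope&) = delete;
    ToyEscapableScope& operator=(const ToyEscapableScope&) = delete;

    // Copy the value into the reserved parent slot and return that slot's index.
    std::size_t Escape(Slot value) {
        stack_[escape_index_] = value;
        return escape_index_;
    }
private:
    HandleStack& stack_;
    std::size_t escape_index_ = 0;
    std::size_t mark_ = 0;
};

// Hypothetical callback: creates temporaries, returns one result to the caller.
std::size_t HelloWorld(HandleStack& stack) {
    ToyEscapableScope scope(stack);
    stack.push_back(111);               // temporary, released on return
    stack.push_back(222);               // the value we want to return
    return scope.Escape(stack.back());
}
```

After HelloWorld returns, the stack holds just the one escaped slot, and the caller's index still refers to valid data rather than to freed memory as in step 6.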

And that only scratches the surface of the inconsistencies between the two. To tackle the various issues of connecting the JavaScript VM (virtual machine) to the C++ interfacing code, I believe the development team decided to simplify matters as follows:

  • All variable handles are stored in "buckets", a.k.a. HandleScopes, which are built, compiled, run, and destroyed by their respective C++ code when needed.
  • Additionally, all function handles may only refer to C++ static functions (I know this is irritating), which ensures that the function exists regardless of any constructor or destructor.

Think of it from a development point of view: it marks a very strong distinction between the JavaScript VM development team and the C++ integration team (the Chrome dev team?), allowing both sides to work without interfering with one another.

Lastly, it could also be for the sake of simplicity when emulating multiple VMs, as V8 was originally meant for Google Chrome. A simple HandleScope creation and destruction whenever a tab is opened or closed makes for much easier GC management, especially when many VMs are running (one per tab in Chrome).

PicoCreator answered Dec 17 '22 14:12
