
Lifespan of JS closure context objects?

Background

I'm trying to port Elixir's actor-model language primitives to JS. I came up with a solution (in JS) that emulates Elixir's receive keyword using a "receiver" function and a generator.

Here's a simplified implementation and demo to show you the idea.

APIs:

type ActorRef = { send(msg: any): void }
type Receiver = (msg: any) => Receiver
/**
 * `spawn` takes an `initializer` and returns an `actorRef`.
 * `initializer` is a factory function that should return a `receiver` function.
 * `receiver` is called to handle `msg` sent through `actorRef.send(msg)`
 */
function spawn(initializer: () => Receiver): ActorRef

Demo:

function* coroutine(ref) {
  let result
  while (true) {
    const msg = yield result
    result = ref.receive(msg)
  }
}

function spawn(initializer) {
  const ref = {}
  const receiver = initializer()
  ref.receive = receiver
  const gen = coroutine(ref)
  gen.next()

  function send(msg) {
    const ret = gen.next(msg)
    const nextReceiver = ret.value
    ref.receive = nextReceiver
  }

  return { send }
}

function loop(state) {
  console.log('current state', state)
  return function receiver(msg) {
    if (msg.type === 'ADD') {
      return loop(state + msg.value)
    } else {
      console.log('unhandled msg', msg)
      return loop(state)
    }
  }
}

function main() {
  const actor = spawn(() => loop(42))
  actor.send({ type: 'ADD', value: 1 })
  actor.send({ type: 'BLAH', value: 1 })
  actor.send({ type: 'ADD', value: 1 })
  return actor
}

window.actor = main()

Concern

The above model works. However, I'm a bit concerned about the performance impact of this approach: I'm not clear about the memory impact of all the closure contexts it creates.

function loop(state) {
  console.log('current state', state) // <--- `state` in a closure context  <─┐    <─────┐
  return function receiver(msg) {     // ---> `receiver` closure reference  ──┘          │
    if (msg.type === 'ADD') {                                                            │
      return loop(state + msg.value)  // ---> create another context that link to this one???
    } else {
      console.log('unhandled msg', msg)
      return loop(state)
    }
  }
}

loop is the "initializer" that returns a "receiver". In order to maintain an internal state, I keep it (the state variable) inside the closure context of the "receiver" function.

When it receives a message, the current receiver can compute a new internal state, pass it to loop, and thereby recursively create a new receiver that replaces the current one.

Apparently the new receiver also has a new closure context that keeps the new state. It seems to me that this process may create a deep chain of linked context objects that prevents GC?

I know that context objects referenced by closures can be linked under some circumstances. And if they're linked, they are obviously not released before the innermost closure is released. According to this article, V8 optimization is very conservative in this regard, and the picture doesn't look pretty.

Questions

I'd be very grateful if someone can answer these questions:

  1. Does the loop example create deeply linked context objects?
  2. What does the lifespan of a context object look like in this example?
  3. If the current example does not, can this receiver-creates-receiver mechanism end up creating deeply linked context objects in other situations?
  4. If "yes" to question 3, can you please show an example to illustrate such a situation?

Follow-Up 1

A follow-up question to @TJCrowder.

Closures are lexical, so the nesting of them follows the nesting of the source code.

Well said, that's something obvious but I missed 😅

Just wanna confirm my understanding is correct, with an unnecessarily complicated example (pls bear with me).

These two are logically equivalent:

// global context here

function loop_simple(state) {
  return msg => {
    return loop_simple(state + msg.value)
  }
}

// Notations:
// `c` for context, `s` for state, `r` for receiver.
function loop_trouble(s0) { // c0 : { s0 }
  // return r0
  return msg => {   // c1 : { s1, gibberish } -> c0
    const s1 = s0 + msg.value
    const gibberish = "foobar"
    // return r1
    return msg => { // c2 : { s2 } -> c1 -> c0
      const s2 = s1 + msg.value
      // return r2
      return msg => {
        console.log(gibberish)
        // c3 is not created, since there's no closure
        const s3 = s2 + msg.value
        return loop_trouble(s3)
      }
    }
  }
}

However the memory impact is totally different.

  1. step into loop_trouble, c0 is created holding s0; returns r0 -> c0.
  2. step into r0, c1 is created, holding s1 and gibberish, returns r1 -> c1.
  3. step into r1, c2 is created, holding s2, returns r2 -> c2

I believe that in the above case, when r2 (the innermost arrow function) is used as the "current receiver", it's actually not just r2 -> c2 but r2 -> c2 -> c1 -> c0: all three context objects are kept (correct me if I'm already wrong here).

Question: which case is true?

  1. All three context objects are kept simply because of the gibberish variable that I deliberately put in there.
  2. Or they're kept even if I remove gibberish. In other words, the dependency of s1 = s0 + msg.value is enough to link c1 -> c0.

Follow-Up 2

So environment record as a "container" is always retained; as for what "content" is included in the container, that might vary across engines, right?

A very naive, unoptimized approach would be to blindly include in the "content" all local variables, plus arguments and this, since the spec doesn't say anything about optimization.

A smarter approach would be to peek into the nested function, check what exactly is needed, and then decide what to include in the content. This is referred to as "promotion" in the article I linked, but that piece of info dates back to 2013 and I'm afraid it might be outdated.

By any chance, do you have more up-to-date information on this topic to share? I'm particularly interested in how V8 implements such a strategy, because my current work relies heavily on the Electron runtime.

hackape asked Apr 07 '21 08:04


1 Answer

Note: This answer assumes you're using strict mode. Your snippet doesn't. I recommend always using strict mode, by using ECMAScript modules (which are automatically in strict mode) or putting "use strict"; at the top of your code files. (I'd have to think more about arguments.callee.caller and other such monstrosities if you wanted to use loose mode, and I haven't below.)

  1. Does the loop example create deeply linked context objects?

Not deeply, no. The inner calls to loop don't link the contexts those calls create to the context where the call to them was made. What matters is where the function loop was created, not where it was called from. If I do:

const r1 = loop(1);
const r2 = r1({type: "ADD", value: 2});

That creates two functions, each of which closes over the context in which it was created. That context is the call to loop. That call context links to the context where loop is declared — global context in your snippet. The contexts for the two calls to loop don't link to each other.
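As a rough illustration (a sketch, not part of the original snippet), the variant of loop below adds a hypothetical GET message and a peek helper so each receiver's state can be observed from outside. Both names are assumptions made up for this example. Each receiver sees only the state passed to the call of loop that created it, and an older receiver is unaffected by newer ones.

```javascript
// Hypothetical variant of `loop`: the 'GET' message and its `reply`
// callback are invented here so that state can be inspected.
function loop(state) {
  return function receiver(msg) {
    if (msg.type === 'ADD') {
      // New call to loop -> new environment record holding the new state.
      // Its parent is the scope where `loop` is declared, not this call.
      return loop(state + msg.value)
    }
    if (msg.type === 'GET') {
      msg.reply(state)
      return receiver
    }
    return loop(state)
  }
}

// Helper (also hypothetical): read the state a receiver closes over.
function peek(receiver) {
  let seen
  receiver({ type: 'GET', reply: (s) => { seen = s } })
  return seen
}

const r1 = loop(1)
const r2 = r1({ type: 'ADD', value: 2 })
const r3 = r2({ type: 'ADD', value: 3 })

console.log(peek(r1), peek(r2), peek(r3)) // 1 3 6
```

If r1 and r2 are dropped and only r3 is kept, nothing reachable from r3 refers to the environments of the earlier calls, so in specification terms they become eligible for collection.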

  2. What does the lifespan of a context object look like in this example?

Each of them is retained as long as the receiver function referring to it is retained (at least in specification terms). When the receiver function no longer has any references, it and the context are both eligible for GC. In my example above, r1 doesn't retain r2, and r2 doesn't retain r1.

  3. If the current example does not, can this receiver-creates-receiver mechanism end up creating deeply linked context objects in other situations?

It's hard to rule everything out, but I wouldn't think so. Closures are lexical, so the nesting of them follows the nesting of the source code.

  4. If "yes" to question 3, can you please show an example to illustrate such a situation?

N/A
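For contrast, a chain does build up if a receiver explicitly holds on to its predecessor. That is retention through an ordinary reference stored in a binding, not through the nesting of environment records. The sketch below is a hypothetical variant of loop: loopWithUndo, the prev parameter, and the UNDO message are all invented for illustration.

```javascript
// Hypothetical variant: every receiver captures `prev`, the receiver
// it replaced, so an 'UNDO' message can step back to it.
function loopWithUndo(state, prev) {
  return function receiver(msg) {
    if (msg.type === 'ADD') {
      // Passing `receiver` stores a reference to this closure in the
      // environment of the next one, which keeps this one reachable.
      return loopWithUndo(state + msg.value, receiver)
    }
    if (msg.type === 'UNDO') {
      return prev || receiver
    }
    return receiver
  }
}

const a = loopWithUndo(0, null)
const b = a({ type: 'ADD', value: 1 })
const c = b({ type: 'ADD', value: 1 })

console.log(c({ type: 'UNDO' }) === b) // true
console.log(b({ type: 'UNDO' }) === a) // true
```

Here, keeping only c alive also keeps b and a alive, and the chain grows by one link per ADD message. The original loop has no such prev binding, so it does not behave this way.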


Note: In the above I've used "context" the same way you did in the question, but it's probably worth noting that what's retained is the environment record, which is part of the execution context created by a call to a function. The execution context isn't retained by the closure, the environment record is. But the distinction is a very minor one, I mention it only because if you're delving into the spec, you'll see that distinction.


Re your Follow-Up 1:

c3 is not created, since there's no closure

c3 is created, it's just that it isn't retained after the end of the call, because nothing closes over it.

Question: which case is true?

Neither. All three contexts (c0, c1, and c2) are kept (at least in specification terms) regardless of whether there's a gibberish variable or an s0 parameter or s1 variable, etc. A context doesn't have to have parameters or variables or any other bindings in order to exist. Consider:

// ge = global environment record

function f1() {
    // Environment record for each call to f1: e1(n) -> ge
    return function f2() {
        // Environment record for each call to f2: e2(n) -> e1(n) -> ge
        return function f3() {
            // Environment record for each call to f3: e3(n) -> e2(n) -> e1(n) -> ge
        };
    };
}

const f = f1()();

Even though e1(n), e2(n), and e3(n) have no parameters or variables, they still exist (and in the above they'll have at least two bindings, one for arguments and one for this, since those aren't arrow functions). In the code above e1(n) and e2(n) are both retained as long as f continues to refer to the f3 function created by f1()().

At least, that's how the specification defines it. In theory those environment records could be optimized away, but that's a detail of the JavaScript engine implementation. V8 did some closure optimization at one stage but backed off most of it because (as I understand it) it cost more in execution time than it made up for in memory reduction. But even when they were optimizing, I think it was the contents of the environment records they optimized (removing unused bindings, that sort of thing), not whether they continued to exist. See below, I found a blog post from 2018 indicating that they do leave them out entirely sometimes.


Re Follow-Up 2:

So environment record as a "container" is always retained...

In specification terms, yes; that isn't necessarily what engines literally do.

...as for what "content" is included in the container, that might vary across engines, right?

Right, all the spec dictates is behavior, not how you achieve it. From the section on environment records linked above:

Environment Records are purely specification mechanisms and need not correspond to any specific artefact of an ECMAScript implementation.

...but that piece of info dates back to 2013 and I'm afraid it might be outdated.

I think so, yes, not least because V8 has changed engines entirely since then, replacing Full-codegen and Crankshaft with Ignition and TurboFan.

By any chance, do you have more up-to-date information on this topic to share?

Not really, but I did find this V8 blog post from 2018 which says they do "elide" context allocation in some cases. So there is definitely some optimization that goes on.
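Any such elision relies on the engine being able to tell which bindings the inner functions use. A minimal sketch of a case where that cannot be determined ahead of time, assuming a direct eval in the innermost function (the names outer, middle, inner, and secret are made up for illustration): the evaluated string can name any binding in scope, so the enclosing environment records and their bindings have to stay available for as long as f is reachable.

```javascript
'use strict'

function outer() {
  const secret = 42
  return function middle() {
    return function inner(code) {
      // Direct eval: `code` runs in inner's scope, so it can refer to
      // any binding visible here, including `secret` from outer's call.
      return eval(code)
    }
  }
}

const f = outer()()
console.log(f('secret')) // 42
```

Without the eval, nothing in middle or inner mentions secret, so an engine would be free to drop that binding, or the whole record, as long as the observable behavior stays the same.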

T.J. Crowder answered Nov 14 '22 06:11