Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

node async_hooks destroy lifecycle event is not called for all resources

I've been exploring the async_hooks API for the purposes of tracking state across asynchronous events. I'm finding that the destroy callback isn't always called for each corresponding init callbacks.

Here's a simple repro:

const asyncHooks = require("async_hooks");
const fs = require("fs");
const https = require("https");

asyncHooks
  .createHook({
    init(asyncId, type, triggerId) {
      fs.writeSync(1, `init ${asyncId} ${triggerId} ${type}\n`);
    },
    destroy(asyncId) {
      fs.writeSync(1, `destroy ${asyncId}\n`);
    },
    promiseResolve(asyncId) {
      fs.writeSync(1, `promiseResolve ${asyncId}\n`);
    }
  })
  .enable();

https.get("https://www.google.com", res => {
  console.log("status code - " + res.statusCode);
});

The above logs all the init and destroy callbacks when making a simple HTTP request.

Here is the output:

$ node bug.js
* init 5 1 TCPWRAP
* init 6 1 TLSWRAP
init 7 1 TickObject
* init 8 1 DNSCHANNEL
init 9 6 GETADDRINFOREQWRAP
init 10 1 TickObject
* init 11 10 HTTPPARSER
* init 12 10 HTTPPARSER
init 13 10 TickObject
init 14 5 TCPCONNECTWRAP
destroy 7
destroy 10
destroy 13
destroy 9
init 15 6 WRITEWRAP
destroy 14
status code - 200
init 16 12 TickObject
init 17 6 TickObject
init 18 6 TickObject
init 19 6 TickObject
init 20 6 TickObject
destroy 15
destroy 16
destroy 17
destroy 18
destroy 19
destroy 20
init 21 6 TickObject
init 22 6 TickObject
init 23 6 TickObject
init 24 6 TickObject
init 25 6 TickObject
init 26 6 TickObject
destroy 21
destroy 22
destroy 23
destroy 24
destroy 25
destroy 26
init 27 6 TickObject
init 28 6 TickObject
init 29 6 TickObject
destroy 27
destroy 28
destroy 29
init 30 6 TickObject
init 31 6 TickObject
init 32 6 TickObject
init 33 6 TickObject
init 34 6 TickObject
init 35 6 TickObject
init 36 6 TickObject
destroy 30
destroy 31
destroy 32
destroy 33
destroy 34
destroy 35
destroy 36
init 37 6 TickObject
init 38 6 TickObject
init 39 6 TickObject
destroy 37
destroy 38
destroy 39
init 40 6 TickObject
init 41 6 TickObject
destroy 40
destroy 41
init 42 6 TickObject
init 43 6 TickObject
init 44 6 TickObject
init 45 6 TickObject
destroy 42
destroy 43
destroy 44
destroy 45

I've annotated the logs above to add an asterisk (*) for each of the init callbacks that do not have a corresponding destroy callback. As you can see the TCPWRAP, TLSWRAP, DNSCHANNEL, HTTPPARSER callback types seem to be the problematic ones.

I'm concerned that this asymmetry will cause memory leaks in the various node modules that use this approach for 'continuation local storage', e.g. https://github.com/Jeff-Lewis/cls-hooked

like image 714
ColinE Avatar asked Nov 06 '22 21:11

ColinE


1 Answers

I have two pieces of advice that can help with this after integrating asynchronous tracking into Data-Forge Notebook.

The first is that you should wrap up the code that you want to have async hooks enabled for in its own parent async resource. Think of this as an async context that separates the code that you do want to track from the code that you don't want to track.

This way you can isolate your async tracking to only the code that really needs it. If you do this then you will purposefully ignore async operations from other code that you don't care about and it could easily be those async operations that are causing your problem, so doing this would remove them from consideration.

Here's some pseudo-code to explain what I mean:

const async_hooks = require("async_hooks");

function initAsyncTracking(trackedAsyncId) {

    const asyncHook = async_hooks.createHook({ // Initialise the async hooks API.
        init: (asyncId, type, triggerAsyncId, resource) => {
            // ... Add the async operation to your tracking if triggerAsyncId is equal to trackedAsyncId (the ID of our tracked parent async operation).
            // ... You also need to track the operation if triggerAsyncId is a child async operation of trackedAsyncId (you need to store enough information in your record to make this check).
        },
        destroy: asyncId => {
            // ... Remove the async operation if it was tracked ...
        },
        promiseResolve: asyncId => {
            // ... Remove the async operation if it was tracked ...
        },
    });

    asyncHook.enable(); // Enable tracking of async operations.
}

// ... code executed here (eg outside the async resource) isn't tracked.

const asyncTrackingResource = new async_hooks.AsyncResource("MY-ASYNC-RESOURCE"); // Create an async resource to be a parent of any async operations we want to track.
asyncTrackingResource.runInAsyncScope(() => {
    const trackedAsyncId = async_hooks.executionAsyncId(); // Get the id of the async resource we created.
    initAsyncTracking(trackedAsyncId );

    // ... code executed here (eg inside the async resource) will be tracked.

});

// ... code executed here (eg outside the async resource) isn't tracked.

The second bit of advice I have is related to the DNSCHANNEL async resource. I found that this async resource was being created lazily and stored in a global variable by the Node.js runtime library. I was able to trigger creation of it through the Node.js request module. So this is a system async resource that gets created indirectly by your code and is cached globally (maybe for performance?).

It's a bit hacky but I found that if I forced creation of a global DNSCHANNEL resource outside of my async tracked code then it was no longer an issue

Here's the code, to pre-create the DNSCHANNEL async resource:

const { Resolver } = require("dns");

const hackWorkaround = new Resolver();

It's a bit ugly but it forces this to get created before my async tracked code runs, therefore it's not a problem that Node.js doesn't seem to ever clean up this particular resource.

It might also be the case that Node.js has other global async resources like this one that could be causing a problem for you. If you find any more please let me know!

like image 104
Ashley Davis Avatar answered Nov 15 '22 06:11

Ashley Davis