Memory leak regression testing with V8/Node.js, part 2 - finalizer-based testing

In the previous blog post, I talked about how Node.js used memory usage measurement to test against memory leaks. Sometimes that’s good enough to provide valid tests. Sometimes we want the test to be more precise and focus on the status of specific objects. This can be quite tricky with what’s available to interact with V8’s garbage collector.

Weak callback + `gc()`

One common strategy used by Node.js core test suites relies on the native v8::PersistentBase::SetWeak() API to invoke a “finalizer” when the observed object is garbage collected. In the Node-API this is abstracted as napi_add_finalizer(). The testing procedure goes like this:

1. Register a process.on('exit') callback to check that the finalizer for the observed object is invoked as expected.
2. Repeatedly allocate the objects that are susceptible to leaking, and, every time a new object is allocated, register a finalizer for it that tracks whether, and how many times, it is called.
3. At process exit, if the callback set in step 1 sees that the finalizer has not been invoked enough times, the object is considered to be leaking.

An example from the Node.js core test suite, which I’ve simplified for demonstration purposes, looks like below:

// Flags: --expose-gc
const assert = require('assert');
const http = require('http');

const max = 100;
let called = 0;

function ongc() { called++; }
process.on('exit', () => { assert.strictEqual(called, max); });

// Checks that server that doesn't listen can still be GC'ed.
for (let i = 0; i < max; i++) {
  const server = http.createServer((req, res) => {});
  onGC(server, { ongc });
}

setImmediate(() => {
  global.gc();
});

The onGC() helper was introduced before the FinalizationRegistry API became available to JavaScript. It essentially serves the same purpose as FinalizationRegistry and invokes the ongc() callback for the first argument as a finalizer. It is implemented with Node.js’s destroy async hook, which is in turn implemented with the v8::PersistentBase::SetWeak() API mentioned before. A simplified version of the onGC() helper looks like this:

const async_hooks = require('async_hooks');
const gcTrackerMap = new WeakMap();
function onGC(obj, gcListener) {
  const onGcAsyncHook = async_hooks.createHook({
    init(id, type) {
      if (this.trackedId === undefined) {
        this.trackedId = id;
      }
    },
    destroy(id) {
      if (id === this.trackedId) {
        this.gcListener.ongc();
        onGcAsyncHook.disable();
      }
    },
  }).enable();
  onGcAsyncHook.gcListener = gcListener;

  // Link the lifetime of an async resource with obj.
  // When obj is garbage collected, resource can be
  // garbage collected too, and when resource is gone
  // the destroy hook would be triggered to call
  // gcListener.ongc().
  const resource = new async_hooks.AsyncResource('GC');
  gcTrackerMap.set(obj, resource);
  obj = null;  // Don't keep obj alive in the closure.
}

`FinalizationRegistry` + `gc()`

The FinalizationRegistry API (part of the WeakRef proposal) has shipped since V8 8.4. It serves roughly the same purpose as the onGC() helper described above, but the callbacks are invoked via a different mechanism from that of weak callbacks. Compared to weak callbacks, the invocation of finalization registry callbacks usually happens later and is less predictable. This is by design, to give JS engines more leeway in scheduling the callback and to avoid hurting performance. Technically the JS engine does not even have to invoke the callback (the same can also be said about weak callbacks, but they are less complex anyway). To quote the proposal explainer:

Finalizers are tricky business and it is best to avoid them. They can be invoked at unexpected times, or not at all…
The proposed specification allows conforming implementations to skip calling finalization callbacks for any reason or no reason.

For example, if we migrate the test above to use FinalizationRegistry, it would look like this:

// Flags: --expose-gc
const assert = require('assert');
const http = require('http');

const max = 100;
let called = 0;

function ongc() { called++; }
process.on('exit', () => { assert.strictEqual(called, max); });

const f = new FinalizationRegistry(ongc);
for (let i = 0; i < max; i++) {
  const server = http.createServer(() => {});
  f.register(server);
}

// Here we must call gc() and then schedule a setImmediate()
// to keep the event loop running for at least another
// iteration. Otherwise the tasks scheduled for the finalization
// registry callbacks would not have a chance to run before
// process shutdown.
global.gc();
setImmediate(() => {});

In practice though, the callback would only be called 99 times by the time the exit event is emitted - at least when I tested it locally. As I’ve analyzed in another blog post, the false positives of Jest’s --detect-leaks (which is based on FinalizationRegistry) showed that you cannot use gc() to ensure that finalization registry callbacks are called for every registered object when it is garbage collected, even if you go as far as running gc() 10 times asynchronously, because that’s not what they are designed for in the first place. A more flake-proof test case can change this line:

process.on('exit', () => { assert.strictEqual(called, max); });

To this:

process.on('exit', () => { assert(called > 0); });

Ultimately, this depends on the regression that you are testing against. If the leak reproduces reliably with every repeated operation that you are testing, one non-leaking sample may already give you 90% confidence that you’ve fixed it and it’s not regressing again. Of course, you might want 100% confidence and confirm this with every sample, but given that observing finalization with a garbage collector can already give you false positives by design, a less precise test with fewer false positives is better than a more precise test with more false positives.

Abusing heap snapshots for a more aggressive GC

As I’ve talked about in the other blog post, a simple gc() is normally not enough to clean up as many objects and invoke as many callbacks as possible, because it’s simply not designed for that. Running it multiple times or keeping the thread running for a bit (in Node.js, using setImmediate() to keep the event loop alive) can sometimes give V8 enough nudges to run your finalizers for unreachable objects (which was what Jest’s --detect-leaks did), but sometimes those tricks are still not enough. In that case, if you count on the finalizers to tell you whether your object can be collected or not, and consider the absence of finalizer invocations to be an indication of leaks, then you are going to have false positives.
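The "run it several times and keep the event loop alive" trick can be sketched roughly as below. This is only an illustration, not the helper used by Node.js or Jest: the name gcUntil, its options, and the injectable forceGC parameter are assumptions made for this example, and global.gc only exists when Node.js is started with --expose-gc.

```javascript
// Hypothetical helper: force a GC, give the event loop one turn so
// that finalization tasks get a chance to run, then check a condition.
// Resolves with the number of attempts used, or -1 if the condition
// was still false after maxAttempts.
async function gcUntil(condition, options = {}) {
  const { maxAttempts = 10, forceGC = globalThis.gc } = options;
  if (typeof forceGC !== 'function') {
    throw new Error('gc() is not available, run node with --expose-gc');
  }
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    forceGC();
    // Yield to the event loop before checking, because finalization
    // registry callbacks are not run synchronously by gc().
    await new Promise((resolve) => setImmediate(resolve));
    if (condition()) {
      return attempt;
    }
  }
  return -1;
}

// Example usage, with `called` being a counter bumped by a finalizer:
//   const attempts = await gcUntil(() => called > 0);
//   assert.notStrictEqual(attempts, -1);
```

Even with a loop like this, a result of -1 only means that no finalizer was observed in time. As noted above, it does not prove that the object is leaking.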

There is another caveat with gc() - if the graph being checked involves newly compiled functions/scripts, and you are assuming that V8 can collect them when they are not reachable by users (which does happen normally), then the use of gc() can come back to bite you, because a forced GC induced by gc() alone can prevent them from being garbage collected. That’s intentional, because gc() is a V8 internal API that only caters to V8’s own testing needs, which includes this behavior.

That said, sometimes it’s still inevitable for the regression tests to force the garbage collection somehow. Is there a more reliable alternative to gc()? Well, one hack used by some of Node.js’s tests, as well as a later fix to Jest’s --detect-leaks, is to take a heap snapshot to perform some kind of last-resort garbage collection. By design, a heap snapshot is intended to capture what’s alive on the heap as accurately as possible, so taking it urges V8 to start the garbage collection with some additional operations to run as many finalizers as it can. The heap snapshot generation process also clears the compilation cache, which can help clear scripts that would not otherwise be collected if the GC is forced by gc().
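A minimal sketch of that hack, assuming all we care about is the side effect of taking the snapshot and not its contents. v8.getHeapSnapshot() is a real Node.js API that returns a readable stream. The helper name gcViaHeapSnapshot is made up for this illustration.

```javascript
const v8 = require('v8');

// Hypothetical helper: take a heap snapshot purely for the thorough
// GC it triggers, and throw the serialized data away. Resolves with
// the number of bytes that were discarded.
function gcViaHeapSnapshot() {
  return new Promise((resolve, reject) => {
    let discardedBytes = 0;
    const stream = v8.getHeapSnapshot();
    // Consume the stream so that the snapshot data can be released,
    // without keeping any of the chunks around.
    stream.on('data', (chunk) => { discardedBytes += chunk.length; });
    stream.on('error', reject);
    stream.on('end', () => resolve(discardedBytes));
  });
}
```

Note that this can be slow and memory-hungry on a big heap, so it is better suited as a last resort than as a drop-in replacement for every gc() call.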

Take this other helper added to Node.js’s code base for example (simplified below):

async function checkIfCollectable(fn, maxCount, generateSnapshotAt) {
  let anyFinalized = false;
  let count = 0;

  const f = new FinalizationRegistry(() => {
    anyFinalized = true;
  });

  async function createObject() {
    const obj = await fn();
    f.register(obj);
    if (count++ < maxCount && !anyFinalized) {
      setImmediate(createObject, 1);
    }
    // This can force a more thorough GC, but can slow the test down
    // significantly in a big heap. Use it with care.
    if (count % generateSnapshotAt === 0) {
      require('v8').getHeapSnapshot().pause().read();
    }
  }

  createObject();
}

This helper takes an object factory fn() and runs it up to maxCount times. Ideally the heap size limit should also be set to a smaller value to give V8 some sense of urgency to clean up the constructed objects as the allocation happens. If the FinalizationRegistry callback for any object returned from fn() gets called during the process, we know that at least some of these objects are collectable under memory pressure, so we are confident enough that the leak is disproved and stop there. To give V8 extra nudges to invoke the finalizer, we also take a heap snapshot at a specified frequency.
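The simplified checkIfCollectable() shown earlier does not report its outcome to the caller. One way to turn it into a pass/fail signal is sketched below, under the assumption that a promise resolving to a boolean is good enough for the test. The name checkIfCollectableAsync is hypothetical and this is not how the Node.js helper is implemented. The snapshot step is left out to keep it short.

```javascript
// Hypothetical variant: resolves to true as soon as any registered
// object has been finalized, or to false once maxCount objects have
// been created without a single finalizer being observed.
function checkIfCollectableAsync(fn, maxCount) {
  return new Promise((resolve, reject) => {
    let anyFinalized = false;
    let count = 0;
    const registry = new FinalizationRegistry(() => {
      anyFinalized = true;
    });

    async function createObject() {
      try {
        // Do not keep a reference to the object after registering it.
        registry.register(await fn());
      } catch (err) {
        reject(err);
        return;
      }
      if (anyFinalized) {
        resolve(true);
      } else if (++count >= maxCount) {
        resolve(false);
      } else {
        // Go through the event loop so finalization tasks can run.
        setImmediate(createObject);
      }
    }

    createObject();
  });
}
```

A false result here still carries the same caveat as before: it means no finalizer was observed within maxCount allocations, which may be a leak or may just be the GC not getting around to it.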

To use checkIfCollectable(), take this test for example (for a detailed breakdown of this bug, check out the other blog post):

// Flags: --max-old-space-size=16 --trace-gc
const vm = require('vm');

// Tests that vm.Script compiled with a custom
// importModuleDynamically() callback doesn't leak.
async function createContextifyScript() {
  // Try to reach the maximum old space size.
  return new vm.Script(`"${Math.random().toString().repeat(512)}";`, {
    async importModuleDynamically() {},
  });
}
checkIfCollectable(createContextifyScript, 2048, 512);

This approach kept the flakes out of the CI for a while…until Node.js updated V8 to a newer version.

Next: heap iteration-based testing

In the next blog post, I will talk about another, more reliable strategy that was used to fix the flakes coming from the newer version of V8.
