Fixing Node.js vm APIs, part 4 - hitting the compilation cache again
In the last post, I wrote about how I finally managed to fix the memory problems in the vm APIs, and it turned out that there was another issue blocking users from upgrading away from the End-of-Life Node.js v16: vm.Script compilation could be ~100x slower than it was in v16 in some cases. In this post, let's look at what was causing the performance regression, and the fix/workaround developed for it.
The performance regression
The regression could be reproduced with the snippet below. It should be run in the Node.js source directory, where test/fixtures/snapshot/typescript.js contains a build of the TypeScript library that is about 10MB in size - we need a giant JavaScript file to make the regression really show:
```js
const { readFileSync } = require('fs');
const vm = require('vm');

const code = readFileSync('./test/fixtures/snapshot/typescript.js', 'utf8');

console.time('compilation');
for (let i = 0; i < 100; i++) {
  new vm.Script(code);
}
console.timeEnd('compilation');
```
On Node.js versions <= 16.10.0, compiling this big file 100 times took about 56ms on a beefy laptop, while on Node.js versions >= 16.11.0, it took about 5600ms - so it appeared that there was a ~100x performance regression in repeated vm.Script compilation.
Analysis of the regression
As mentioned in the previous posts, compilation via vm.Script basically comes down to compiling a v8::UnboundScript. This usually involves parsing the given source code and generating bytecode for the top-level JavaScript code (bytecode for inner functions is usually only generated when the inner functions are actually invoked). As fast as V8's parser and bytecode generator can be, compiling a 10MB script 100 times for real would definitely take more than 56ms. Then why was it so fast prior to v16.11.0? In the example above, V8 used to only compile the script for real when vm.Script was invoked for the first time (which took about 54ms). The other 99 compilations were not actually done from scratch, and they only took about 2ms in total. So technically the example above looked ~100x (or, to be precise, 99x) slower because for the 99 subsequent compilations, we were essentially comparing "compiling 99 scripts for real" vs. "99 hash table lookups". If we increased the iteration count to n, it would even appear to be n - 1 times slower.
Now let's look at how the compilation cache works. When V8 successfully compiles a script, it stores the corresponding v8::internal::SharedFunctionInfo inside a compilation cache table. v8::internal::SharedFunctionInfo is the internal equivalent of v8::UnboundScript and references the v8::internal::Script we mentioned in previous posts. The next time V8 is told to compile a script, it first looks up this table to see if there is already a compiled script with the same source code and the same metadata. The metadata includes, among other things, the host-defined options. If V8 can find a script with matching source and matching metadata in the cache, it just returns the already compiled script; otherwise it compiles a new script from scratch and puts it into the cache for later lookups.
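The effect of a cache hit is easy to observe from JavaScript. Here is a minimal sketch, reusing the fixture path from the reproduction above (the timings are only illustrative):

```js
const { readFileSync } = require('fs');
const vm = require('vm');

const code = readFileSync('./test/fixtures/snapshot/typescript.js', 'utf8');

console.time('first compilation');   // parses and generates bytecode for real
new vm.Script(code);
console.timeEnd('first compilation');

console.time('second compilation');  // on a cache hit, roughly a hash table lookup
new vm.Script(code);
console.timeEnd('second compilation');
```

On a version where the cache can be hit, the second compilation finishes in a tiny fraction of the time of the first; on the regressed versions, both take about the same time.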
Prior to v16.11.0, Node.js did not configure any host-defined options for vm.Script, so the compilation cache could be hit, and the 99 subsequent compilations reduced to just 99 hash table lookups, which, as we saw, could be incredibly fast. Later, the importModuleDynamically option was introduced in Node.js 16, and the implementation of it configured the host-defined options for every vm.Script even if users did not use importModuleDynamically at all. As a result, compiling the same source code using vm.Script no longer hit the compilation cache, so it became significantly slower, since V8 had to compile the scripts for real every time. In addition, without cache hits, V8 had to regenerate a script every time and put it in the cache, so the cache would fill up with scripts that had the same source code but slightly different host-defined options, until the scripts proved to be useless after 4 non-forced garbage collection cycles. This could lead to a significant waste of memory.
After the memory issues in vm were fixed, I got a lot of requests asking for help with this issue. Apparently this regression turned out to be critical for part of the Node.js ecosystem. Among others, Jest users were hit quite heavily by it, since Jest used vm.Script a lot and relied on the cache hits to stay performant. If we let this go, many users would be stuck with the now End-of-Life v16 for an indefinite amount of time. So it seemed urgent enough to develop a fix/workaround for it.
Working around the performance regression
Recall that I mentioned:

> Later, the importModuleDynamically option was introduced in Node.js 16, and the implementation of it configured the host-defined options for every vm.Script even if users did not use importModuleDynamically at all.
While it could be tricky to fix the cache miss when importModuleDynamically was actually used, that was not necessary to unblock most users in the upgrade, because importModuleDynamically was only part of an experimental API. The majority of the users complaining were not actually using that option. So at least we could first fix the regression for the case where importModuleDynamically was unused.
Now, after the memory fixes, we had switched to symbols as the host-defined options identifying different importModuleDynamically settings. It occurred to me that we could just use a constant symbol to indicate that importModuleDynamically was not configured by users at all. Then all the vm.Script compilations not using importModuleDynamically would have the same host-defined options, and could hit the compilation cache again.
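The gist of the idea looks roughly like this - a simplified sketch, not the actual Node.js internals, with names made up for illustration:

```js
// Simplified sketch of the idea - not the actual Node.js internals.
// Scripts compiled without importModuleDynamically all share one constant
// symbol as their host-defined option, so they can hit the compilation cache.
const kDefaultHostDefinedOption = Symbol('default-host-defined-option');

function getHostDefinedOption(importModuleDynamically) {
  if (importModuleDynamically === undefined) {
    // Same symbol every time -> same metadata -> cache hits become possible again.
    return kDefaultHostDefinedOption;
  }
  // A fresh symbol per script keeps different callbacks apart,
  // at the cost of a guaranteed cache miss.
  return Symbol('per-script-host-defined-option');
}
```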
So I opened another PR that implemented this idea, and it indeed made the regression go away for the simple reproduction above. But for Jest users, this was not enough, because Jest actually did use this callback: even when users did not use import() in their tests, Jest used it to emit an error saying that --experimental-vm-modules was not configured, to enhance the debugging experience. Since we thought it should be impossible to return anything useful from the importModuleDynamically callback when --experimental-vm-modules was not enabled, I added another constant symbol to implement this error internally, and made it a requirement for users to enable --experimental-vm-modules if they intend to use importModuleDynamically - this probably should've been done when importModuleDynamically was introduced as an experimental API anyway, and it was not too late to do it then.
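If I describe the resulting behavior correctly, it can be observed with a sketch like the following, run without --experimental-vm-modules (the exact error code is my understanding of the patch, so treat it as an assumption):

```js
// Run WITHOUT --experimental-vm-modules. The user-supplied callback is
// no longer invoked; dynamic import() in the script rejects instead.
const vm = require('vm');

const script = new vm.Script('import("fs")', {
  importModuleDynamically: () => {},  // only honored when the flag is set
});

script.runInThisContext().catch((err) => {
  // My understanding of the patched behavior (an assumption):
  console.log(err.code);  // ERR_VM_DYNAMIC_IMPORT_CALLBACK_MISSING_FLAG
});
```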
Releasing and verifying the patches
Now, with the memory issues and the performance regression gone, at least users who do not actually need import() support in the vm APIs can upgrade to newer versions of Node.js. The patches were backported to v21 and v20, and we got verification from Jest users that the memory consumption and test run time improved significantly after updating to the versions with the patches (1 2 3). So the really long Jest issue caused by the regressions could finally be closed.
It's not yet certain whether the fixes will be released in v18 - that is up to the discretion of the release team, because v18 is now in maintenance mode - though I've done the backport anyway.
What’s next?
As mentioned above, the patches for the performance regression were mostly workarounds. When --experimental-vm-modules goes out of experimental status, or when the flag and importModuleDynamically are actually used, the regression would come back again. There are a few things that could be done to fix the regression for more use cases:
- Recall that:

  > …we thought it should be impossible to return anything useful in the importModuleDynamically callback when --experimental-vm-modules was not enabled…

  This assumption turned out to be incorrect. Users could still do:

  ```js
  function importModuleDynamically(specifier) { return import(specifier); }
  ```

  to polyfill using the default loader in the main context, instead of using any of the vm APIs enabled by --experimental-vm-modules. In this case it seems unnecessarily strict to require the use of --experimental-vm-modules. One workaround would be to simply add a global option that implements this behavior, which is probably useful on its own, even without the cache implications. We can use another constant symbol for this global option, thus hitting the compilation cache with this constant. I've opened a work-in-progress PR for this.
- A per-script importModuleDynamically option doesn't really seem to be all that necessary for users. When I looked into the bug reports, it seemed that most users of importModuleDynamically tended to reuse one callback for many different scripts. So we could also look into an alternative API that allows users to customize dynamic import() handling for a group of scripts, thus allowing repeated compilation within the group to hit the compilation cache (see the user-land sketch after this list).
- While the regression in vm.Script could be worked around, vm.compileFunction() is still not hitting the cache. After some investigation, I realized that there is no compilation cache support for the underlying ScriptCompiler::CompileFunction() at all - vm.compileFunction() was a later invention in the V8 API, so this was probably just an unintentional omission. I uploaded a V8 CL to implement cache support for it.
- Ultimately, if V8 did not cache the entire v8::internal::SharedFunctionInfo and only cached the bytecode (which is the most expensive part to generate), it would not have to care about host-defined options anymore when hitting the cache. This also happens to be important for the "allowing arbitrary objects as host-defined options" patch that we were waiting for. This requires some internal refactoring in V8, which is being carried out by the V8 team (tracking issue). Once that is completed, how we manage host-defined options would no longer matter for caching, and the cache misses could be fixed automatically.
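To make the second idea above more concrete, here is a user-land sketch of the shape such a group API could take - ScriptGroup is purely hypothetical and does not exist in Node.js; it only illustrates the design:

```js
// A user-land sketch of the idea - ScriptGroup is NOT an actual Node.js API.
// One importModuleDynamically callback is shared by every script in the group;
// if an API like this were built into Node.js, the group could also share one
// set of host-defined options, letting repeated compilations hit the cache.
const vm = require('vm');

class ScriptGroup {
  constructor(importModuleDynamically) {
    this.importModuleDynamically = importModuleDynamically;
  }
  createScript(code, options = {}) {
    return new vm.Script(code, {
      ...options,
      importModuleDynamically: this.importModuleDynamically,
    });
  }
}

// Identical sources compiled in the same group share one callback, and ideally
// the second compilation would be a cache hit.
const group = new ScriptGroup((specifier) => import(specifier));
const a = group.createScript('globalThis.answer = 42;');
const b = group.createScript('globalThis.answer = 42;');
```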
Final remarks
This has been an interesting ride. I started out working on an internal refactoring and ended up fixing bugs that affected a substantial chunk of the ecosystem. Some lessons I've learned along the way:
- It's useful to learn about random things in the code bases you work on. Even if they don't seem to matter for what you are working on at the moment, they can come back and be critical for fixing other things in the future. This was what happened with my incidental knowledge of the WeakMap semantics, the Symbol-as-WeakMap-keys proposal, some garbage collection details, and how internal fields are stored in V8 embedder objects. I guess this is what people refer to as "experience". :)
- Don't get too obsessed with a perfect fix. If a bug is damaging enough that it should be fixed sooner rather than later, workarounds are fine. I sometimes tend to wait for a more theoretically sound solution instead of settling for a workaround in the meantime, which probably also contributed to the delay of the fixes while others looked into the issues, until v16 reached End-of-Life and the issues became more urgent.
Though I did not set out to fix the vm issues, I would not have had enough time to look into them without support from Bloomberg & Igalia, for which I am very grateful. For other work sponsored by Bloomberg that is related to Node.js, check out this post by Daniel Ehrenberg.