Until V8 10.0, class field initializers were broken in the V8 startup snapshot - a feature used by embedders like Node.js and Deno to speed up the bootstrapping of themselves or of user applications. This post covers how the issue was fixed.

The issue

Take this class as an example:

class A {
  #a = 0;
  func() { }
  b = this.#a;
}

When evaluating the class, V8 collects the class field initializers (#a = 0 and b = this.#a) and generates a synthetic instance member function with the initializers as the function body. This function is there to simplify the handling of potential scope mismatches. Take this class as an example:

class A extends class {} {
  #a = 0;
  constructor(run) {
    const callSuper = () => super();
    // ...do something
    run(callSuper);
  }
}

new A((fn) => fn());

Here a super() call is nested in an arrow function and can be invoked from outside the class. According to the specified semantics, the initialization has to be done right after super() returns, and thus in a scope that does not belong to the constructor. V8 simply pretends that there is a function scope surrounding the initializers and uses its existing machinery for handling function invocations to handle the scope changes. Since TurboFan can inline this synthetic function when it’s frequently invoked, the cost of this additional function is insignificant when optimization is enabled.
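The ordering described above can be observed from plain JavaScript. The following is a small sketch rather than anything taken from V8: the Base class, the log array, and the field name are illustrative assumptions. It records when the field initializer runs relative to super() and to the rest of the constructor.

```javascript
// Illustrative sketch: `log`, `Base`, and `field` are made-up names.
const log = [];

class Base {
  constructor() {
    log.push("base constructor");
  }
}

class A extends Base {
  // The initializer runs as soon as super() returns, wherever that happens.
  field = log.push("field initializer");

  constructor(run) {
    const callSuper = () => super();
    run(callSuper);
    log.push("rest of constructor");
  }
}

new A((fn) => {
  log.push("before super()");
  fn();
  log.push("after super()");
});

console.log(log.join(" -> "));
```

The field initializer is logged between "base constructor" and "after super()", that is, while execution is still inside the callback and not yet back in the constructor body.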

In V8, each function in the source code is associated with a SharedFunctionInfo (SFI) object, which contains information that is shared by the different JSFunction objects generated for different evaluations of the same function. The first time this class is evaluated, V8 parses the class, synthesizes the AST of the initializer function, generates bytecodes for it, and creates a SharedFunctionInfo object holding a reference to the bytecodes as well as a JSFunction object holding a reference to this SharedFunctionInfo. V8 also generates a JSFunction object for the constructor and stores a reference to the initializer function’s JSFunction in the constructor’s JSFunction. Unlike the initializer function, however, the constructor is lazily compiled, so at this time it’s only pre-parsed. Instead of generating bytecodes, V8 only generates a small UncompiledData object for the SharedFunctionInfo to hold on to, which can be used later during the actual compilation.

When the constructor is invoked for the first time, V8 parses the constructor and compiles bytecodes for it, then executes those bytecodes. During the execution of the constructor, the initializer function’s JSFunction object is loaded from the constructor and invoked. V8 then locates the bytecodes of the initializer, which were already generated at class evaluation time, and executes them to complete the initialization of the instance.

In most cases, the JSFunction object for the synthetic initializer function and the bytecodes it references through its SharedFunctionInfo stay alive as long as the constructor’s JSFunction is alive, so there is no need to recompile it. But when certain V8 features are used - for example, when the class is included in a startup snapshot generated with FunctionCodeHandling::kClear, which excludes the bytecodes from the snapshot - the bytecodes can be lost and need to be recompiled.
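To make these relationships concrete, here is a toy model in JavaScript. It is only a sketch: the classes reuse the names SharedFunctionInfo and JSFunction for readability, but their fields (start, end, bytecode, compileCount) and the ensureCompiled helper are assumptions made for illustration and do not mirror V8's real data structures.

```javascript
// Toy model only: these classes mimic the relationships described above and
// are not V8's real data structures.
class SharedFunctionInfo {
  constructor(start, end) {
    this.start = start;    // source position where the function starts
    this.end = end;        // source position where the function ends
    this.bytecode = null;  // null means "not compiled yet, or cleared"
    this.compileCount = 0;
  }
}

class JSFunction {
  constructor(shared) {
    this.shared = shared;  // several JSFunctions can share one SFI
  }
}

// Stand-in for "parse and generate bytecode" from the recorded positions.
function ensureCompiled(fn, source) {
  const sfi = fn.shared;
  if (sfi.bytecode === null) {
    sfi.bytecode = `<bytecode for: ${source.slice(sfi.start, sfi.end)}>`;
    sfi.compileCount++;
  }
  return sfi.bytecode;
}

const source = "function f() { return 1; }";
const sfi = new SharedFunctionInfo(0, source.length);
const first = new JSFunction(sfi);
const second = new JSFunction(sfi);

ensureCompiled(first, source);   // compiles once
ensureCompiled(second, source);  // reuses the shared bytecode

sfi.bytecode = null;             // models the bytecode being cleared
ensureCompiled(second, source);  // has to recompile from the source positions

console.log(sfi.compileCount);   // 2
```

The point of the model is that once the bytecode is gone, the only way back is the source range recorded in the SharedFunctionInfo, so that range has to be something the parser can actually handle.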

How the initializer is compiled

And this is where the problem comes in. Previously, the body of the synthetic function spanned from the start of the first class field initializer to the end of the last class field initializer, so with a class like this:

class A {
  #a = 0;
  func() { }
  b = this.#a;
}

The body of the synthetic initializer function was effectively this:

#a = 0;
func() { }
b = this.#a;

while normally, the parser expects a function body to include the parentheses and the braces, like this:

() {
  #a = 0;
  func() { }
  b = this.#a;
}

And that’s why attempting to invoke the constructor of a class with fields after it was deserialized from a snapshot generated using FunctionCodeHandling::kClear threw a SyntaxError. In addition, the initializers do not follow the usual syntax of property stores (e.g. #a = 1 vs. this.#a = 1), so attempting to parse them as normal statements would fail.
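The second point can be reproduced from user-land JavaScript. In the sketch below, new Function is used only as a stand-in for "parse this text as ordinary function code"; it is not how V8 reparses functions internally, and the variable names and the tryParse helper are assumptions for illustration. The text between the first and the last initializer does not parse, while the text of the whole class does.

```javascript
// Illustrative only: `new Function` stands in for "parse this text as code";
// it is not how V8 reparses functions internally.
const source = [
  "class A {",
  "  #a = 0;",
  "  func() { }",
  "  b = this.#a;",
  "}",
].join("\n");

// Old positions: from the start of the first field initializer to the end
// of the last one.
const firstInitializer = "#a = 0;";
const lastInitializer = "b = this.#a;";
const start = source.indexOf(firstInitializer);
const end = source.indexOf(lastInitializer) + lastInitializer.length;
const oldBody = source.slice(start, end);

function tryParse(text) {
  try {
    new Function(text);  // only parses the text; nothing is executed
    return "ok";
  } catch (error) {
    return error.name;
  }
}

const oldResult = tryParse(oldBody);   // the initializers on their own
const classResult = tryParse(source);  // the entire class
console.log(oldResult, classResult);
```

On its own, the first snippet is not valid code, because a private name such as #a only makes sense inside a class body.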

This was a problem for embedders like Node.js and Deno, because they have their own version of embedded snapshots and support snapshots of user-land applications. This issue prevented them from using class fields in their bootstrap code or accepting users’ snapshot entry point scripts with class fields.

The fix

To support recompilation of the synthetic initializer function, we patched the SharedFunctionInfo allocated for it to store the source positions that span the entire class, so the source range of this function is now the same as that of the class:

class A {
  #a = 1;
  func() { }
  #b = 2;
}

When V8 needs to recompile a function, it rewinds the scanner to the start position stored in the SharedFunctionInfo associated with the function. With our fix, when recompiling the initializer function, V8 now rewinds to where the class token begins. The fix then checks a bitfield of the SharedFunctionInfo to see if the function is a synthetic class initializer. If that’s the case, it reparses the function body as a class and collects the initializers to re-synthesize an initializer function for which the bytecode generator knows how to emit bytecodes. To avoid unnecessary costs, the class members that are not field initializers (e.g. class methods like func() in the snippet above) are only pre-parsed, and V8 does not actually generate an AST for them. The reparsed AST of the initializer function only contains what’s necessary to recompile the initialization code.
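The steps above can be sketched as a toy model in JavaScript. Everything here is an assumption made for illustration: the sfi object and its isClassMembersInitializer flag stand in for the SharedFunctionInfo and its bitfield, and the line-based regular expression is a crude substitute for a real parser that only handles the simple class used in this post.

```javascript
// Toy model: the field names on `sfi` and the line-based "parser" are
// hypothetical and only handle the simple class shown in this post.
const source = [
  "class A {",
  "  #a = 0;",
  "  func() { }",
  "  b = this.#a;",
  "}",
].join("\n");

const sfi = {
  start: source.indexOf("class"),   // position of the `class` token
  end: source.length,               // position after the closing brace
  isClassMembersInitializer: true,  // stand-in for the SFI bitfield check
};

function reparse(source, sfi) {
  // Rewind to the recorded start position.
  const text = source.slice(sfi.start, sfi.end);
  if (!sfi.isClassMembersInitializer) {
    return { kind: "function", text };
  }
  // Reparse as a class and collect only the field initializers.
  const body = text.slice(text.indexOf("{") + 1, text.lastIndexOf("}"));
  const initializers = [];
  for (const line of body.split("\n")) {
    const field = /^(#?\w+)\s*=\s*(.+);$/.exec(line.trim());
    if (field) {
      initializers.push({ name: field[1], value: field[2] });
    }
    // Anything else (such as `func() { }`) is skipped in this model; V8
    // pre-parses such members without building an AST for them.
  }
  return { kind: "initializer", initializers };
}

const result = reparse(source, sfi);
console.log(result.initializers.map((f) => `${f.name} = ${f.value}`));
```

Only the two field initializers end up in the result; the method is recognized as "not a field" and dropped, which mirrors the idea of keeping the reparsed AST as small as possible.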

When the class has private or computed fields, the AST of the initializer function contains synthetic variables referencing slots of the class context. During the initial compilation, a scope analysis is performed to determine the exact slots that these variables refer to. For example, for the class from the beginning of this post (with the fields #a = 0 and b = this.#a), the bytecode generated for the initializer function looks like this:

// Load the private name symbol for `#a` into r0
LdaImmutableCurrentContextSlot [2]
Star0

// Use the DefineKeyedOwnProperty bytecode to store 0 as the value of
// the property keyed by the private name symbol `#a` in the instance,
// that is, `#a = 0`.
LdaZero
DefineKeyedOwnProperty <this>, r0, [0]

// Load the private name symbol for `#a` into the accumulator
LdaImmutableCurrentContextSlot [2]

// Load the value of the property keyed by `#a` from the instance into
// the accumulator
GetKeyedProperty <this>, [2]

// Use the DefineNamedOwnProperty bytecode to store the value of the
// property keyed by `#a` as the value of the property keyed by `b`,
// that is, `b = this.#a`
DefineNamedOwnProperty <this>, [0], [4]

When generating the LdaImmutableCurrentContextSlot bytecode to load the private name symbols from the context, the slot index taken by the bytecode as the operand is determined by the scope analysis and therefore known at bytecode-generation time. If the function needs a context, V8 allocates a corresponding ScopeInfo object for it which includes the result of the scope analysis. For the reparsed initializer function, instead of doing a scope analysis again, we can simply restore the result of the previous scope analysis using the ScopeInfo of the outer class scope.

There is a caveat, however. If the class has only named public fields, it does not need a context, and V8 does not allocate a ScopeInfo for the class scope. With the following class:

class A {
  #a = 1;
}

The presence of the private field #a requires a private name symbol stored in the class context, so a context is allocated for the class, and so is the corresponding ScopeInfo. The scope chain is the same before and after serialization of the ScopeInfos, and it looks like this:

[ scope of initializer function ]
-> [ scope of class A ] // We'll use this to restore scope analysis results
-> null (global scope)

With this class, however:

class B {
  b = 1;
}

The initialization of named public fields does not need anything from the context, so there would be no context or ScopeInfo created for the class. In this case, the original scope chain looks like this:

[ scope of initializer function ]
-> [ scope of class B ]
-> null (global scope)

But the serialized scope chain looks like this (notice that the scope of class B is optimized away):

[ scope of initializer function ]
-> null (global scope)

As a result, we cannot use the outer scope in the deserialized scope chain to restore the results of the previous scope analysis.

To detect the second case, V8 compares the source positions of the initializer function’s SharedFunctionInfo with the source positions of the outer ScopeInfo. If they don’t match, or if there isn’t an outer ScopeInfo at all, V8 knows that the previous scope analysis determined that nothing in the outer scope is referenced by the initializer scope or its inner scopes, so it leaves the scope as-is without attempting to restore anything from the ScopeInfo, and proceeds to bytecode generation.
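The check can be sketched as a toy model in JavaScript. This is not V8 code: the object shapes, the restorableScopeInfo helper, and the concrete source positions are assumptions chosen for illustration. Only the slot index 2 for #a is borrowed from the bytecode listing earlier in this post.

```javascript
// Toy model: the object shapes, positions, and helper name are hypothetical.
function restorableScopeInfo(sfi, outerScopeInfo) {
  if (outerScopeInfo === null) {
    return null;  // there is no outer ScopeInfo at all
  }
  const samePositions =
    outerScopeInfo.start === sfi.start && outerScopeInfo.end === sfi.end;
  // Only a ScopeInfo covering exactly the class can be the class scope.
  return samePositions ? outerScopeInfo : null;
}

// class A { #a = 1; }: the class scope needs a context, so its ScopeInfo
// survives and covers the same source range as the initializer's SFI.
const sfiA = { start: 0, end: 19 };
const classScopeA = { start: 0, end: 19, contextSlots: { "#a": 2 } };
const restoredA = restorableScopeInfo(sfiA, classScopeA);

// class B { b = 1; } at the top level: no ScopeInfo for the class scope.
const sfiB = { start: 0, end: 18 };
const restoredB = restorableScopeInfo(sfiB, null);

// class B nested in a function that has its own ScopeInfo: the outer
// ScopeInfo exists, but its positions belong to the function, not the class.
const sfiNestedB = { start: 20, end: 38 };
const functionScope = { start: 0, end: 60, contextSlots: {} };
const restoredNestedB = restorableScopeInfo(sfiNestedB, functionScope);

console.log(restoredA.contextSlots["#a"], restoredB, restoredNestedB);
```

In the first case the slot index for #a can be read back from the restored ScopeInfo instead of rerunning the scope analysis; in the other two cases there is nothing to restore, and compilation proceeds directly to bytecode generation.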