This document is intended as a reference for new members of the Dart VM team, potential external contributors or just anybody interested in VM internals. It starts with a high-level overview of the Dart VM and then proceeds to describe various components of the VM in more detail.
Source: https://mrale.ph/dartvm/
Dart VM is a collection of components for executing Dart code natively. Notably it includes the following:
- Runtime System
- Object Model
- Garbage Collection
- Snapshots
- Core libraries native methods
- Development Experience components accessible via service protocol
  - Debugging
  - Profiling
  - Hot-reload
- Just-in-Time (JIT) and Ahead-of-Time (AOT) compilation pipelines
- Interpreter
- ARM simulators
The name "Dart VM" is historical. Dart VM is a virtual machine in the sense that it provides an execution environment for a high-level programming language. However, this does not imply that Dart is always interpreted or JIT-compiled when executing on Dart VM. For example, Dart code can be compiled into machine code using the Dart VM AOT pipeline and then executed within a stripped version of the Dart VM, called precompiled runtime, which does not contain any compiler components and is incapable of loading Dart source code dynamically.
How does Dart VM run your code?
Dart VM has multiple ways to execute the code, for example:
- from source or Kernel binary using JIT;
- from snapshots:
  - from AOT snapshot;
  - from AppJIT snapshot.
However the main difference between these lies in when and how VM converts Dart source code to executable code. The runtime environment that facilitates the execution remains the same.
[Figure: the VM isolate, a pseudo isolate for shared immutable objects like null, true, false, has its own heap. Below it are two isolate groups. Each IsolateGroup has its own GC managed heap and contains isolates (each with its own globals and a mutator thread) as well as helper threads. Isolate group heaps can reference the vm-isolate heap, but there are no cross group references.]
Any Dart code within the VM is running within some isolate, which can be best described as an isolated Dart universe with its own global state and usually with its own thread of control (mutator thread). Isolates are grouped together into isolate groups. Isolates within a group share the same garbage collector managed heap, used as a storage for objects allocated by an isolate. Heap sharing between isolates in the same group is an implementation detail which is not observable from the Dart code. Even isolates within the same group cannot share any mutable state directly and can only communicate by message passing through ports (not to be confused with network ports!).
Isolates within a group share the same Dart program. Isolate.spawn spawns an isolate within the same group, while Isolate.spawnUri starts a new group.
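As a minimal sketch of the message passing described above, the following program spawns a worker with Isolate.spawn and receives its answer through a port. The names sumWorker and sumInWorker are invented for this illustration; only the dart:isolate APIs are real.

```dart
import 'dart:isolate';

// Entry point of the worker isolate. The message carries a SendPort to
// reply on and the numbers to add. No mutable state is shared with the
// spawning isolate: the result travels back as a message.
void sumWorker(List<Object> request) {
  final replyTo = request[0] as SendPort;
  final numbers = request[1] as List<int>;
  replyTo.send(numbers.fold<int>(0, (a, b) => a + b));
}

// Spawns a worker in the same isolate group and waits for its reply.
Future<int> sumInWorker(List<int> numbers) async {
  final port = ReceivePort();
  await Isolate.spawn<List<Object>>(
      sumWorker, <Object>[port.sendPort, numbers]);
  // Taking the first message also closes the receive port.
  return await port.first as int;
}

Future<void> main() async {
  print(await sumInWorker([1, 2, 3, 4])); // 10
}
```

The worker receives a copy of the list, so mutating it there would not be visible to the spawning isolate.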
The relationship between OS threads and isolates is a bit blurry and highly dependent on how VM is embedded into an application. Only the following is guaranteed:
- an OS thread can enter only one isolate at a time. It has to leave current isolate if it wants to enter another isolate;
- there can only be a single mutator thread associated with an isolate at a time. Mutator thread is a thread that executes Dart code and uses VM's public C API.
However the same OS thread can first enter one isolate, execute Dart code, then leave this isolate and enter another isolate. Alternatively many different OS threads can enter an isolate and execute Dart code inside it, just not simultaneously.
In addition to a single mutator thread an isolate can also be associated with multiple helper threads, for example:
- a background JIT compiler thread;
- GC sweeper threads;
- concurrent GC marker threads.
Internally the VM uses a thread pool (ThreadPool) to manage OS threads, and the code is structured around the ThreadPool::Task concept rather than around the concept of an OS thread. For example, instead of spawning a dedicated thread to perform background sweeping after a GC, the VM posts a ConcurrentSweeperTask to the global VM thread pool, and the thread pool implementation either selects an idling thread or spawns a new thread if no threads are available. Similarly, the default implementation of an event loop for isolate message processing does not actually spawn a dedicated event loop thread. Instead it posts a MessageHandlerTask to the thread pool whenever a new message arrives.
Class Isolate represents an isolate, class IsolateGroup an isolate group, and class Heap an isolate group's heap. Class Thread describes the state associated with a thread attached to an isolate. Note that the name Thread is somewhat confusing because all OS threads attached to the same isolate as a mutator will reuse the same Thread instance. See Dart_RunLoop and MessageHandler for the default implementation of an isolate's message handling.
Running from source via JIT
This section tries to cover what happens when you try to execute Dart from the command line:
// hello.dart
main() => print('Hello, World!');
$ dart hello.dart
Hello, World!
Since Dart 2, the VM no longer has the ability to directly execute Dart from raw source. Instead the VM expects to be given Kernel binaries (also called dill files), which contain serialized Kernel ASTs. The task of translating Dart source into Kernel AST is handled by the common front-end (CFE), which is written in Dart and shared between different Dart tools (e.g. VM, dart2js, Dart Dev Compiler).
[Figure: Dart Source -> CFE -> Kernel AST (binary) -> VM]
To preserve the convenience of executing Dart directly from source, the standalone dart executable hosts a helper isolate called kernel service, which handles compilation of Dart source into Kernel. VM then runs the resulting Kernel binary.
[Figure: inside the dart (cli) process, Dart Source -> kernel service isolate (which contains CFE) -> Kernel AST (binary) -> main isolate. Both isolates run on the same VM.]
However this setup is not the only way to arrange CFE and VM to run Dart code. For example, Flutter completely separates compilation to Kernel and execution from Kernel by putting them onto different devices: compilation happens on the developer machine (host) and execution is handled on the target mobile device, which receives Kernel binaries sent to it by the flutter tool.
[Figure: on the HOST, Dart Source -> frontend_server (CFE), driven by flutter run --debug, -> Kernel AST (binary). The Kernel binary is then sent to the DEVICE, where the VM inside the Flutter Engine runs it.]
Note that the flutter tool does not handle parsing of Dart itself. Instead it spawns another persistent process, frontend_server, which is essentially a thin wrapper around CFE and some Flutter specific Kernel-to-Kernel transformations. frontend_server compiles Dart source into Kernel files, which the flutter tool then sends to the device. Persistence of the frontend_server process comes into play when the developer requests hot reload: in this case frontend_server can reuse CFE state from the previous compilation and recompile just the libraries which actually changed.
Once Kernel binary is loaded into the VM it is parsed to create objects representing various program entities. However this is done lazily: at first only basic information about libraries and classes is loaded. Each entity originating from a Kernel binary keeps a pointer back to the binary, so that later more information can be loaded as needed.
[Figure: KERNEL AST BINARY on the left, ISOLATE GROUP HEAP on the right. Each AST node representing a class, of the shape (Class (Field) (Procedure (FunctionNode)) ...), is paired with a Class heap object representing that class. Heap objects representing program entities keep pointers back into the kernel binary blob and are deserialized lazily.]
Class Xyz in the header runtime/vm/object.h defines C++ methods, while class UntaggedXyz in the header runtime/vm/raw_object.h defines memory layout. For example, Class and UntaggedClass specify a VM object describing a Dart class, Field and UntaggedField specify a VM object describing a Dart field within a Dart class, and so on. We will return to this in a section covering runtime system and object model. We omit the Untagged... prefix from illustrations to make them more compact.
Information about the class is fully deserialized only when runtime later needs it (e.g. to look up a class member, to allocate an instance, etc). At this stage class members are read from the Kernel binary. However full function bodies are not deserialized at this stage, only their signatures.
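The lazy loading described above can be sketched in a few lines of Dart. This is only a model under stated assumptions: KernelBlob and LazyClass are invented names, and the "binary" is a plain list of strings rather than the real Kernel format.

```dart
// A stand-in for a region of a Kernel binary. The layout (a class name
// followed by its member names) is invented for this sketch.
class KernelBlob {
  final List<String> records;
  int reads = 0; // Counts how often member data is actually decoded.
  KernelBlob(this.records);

  List<String> decodeMembers(int offset, int count) {
    reads++;
    return records.sublist(offset, offset + count);
  }
}

// Heap object for a class: created with basic information only, plus a
// pointer (blob and offset) back into the binary.
class LazyClass {
  final String name;
  final KernelBlob blob;
  final int membersOffset;
  final int memberCount;

  // A `late final` field with an initializer is computed on first access
  // and cached afterwards.
  late final List<String> members =
      blob.decodeMembers(membersOffset, memberCount);

  LazyClass(this.name, this.blob, this.membersOffset, this.memberCount);
}

void main() {
  final blob = KernelBlob(['Dog', 'face', 'bark', 'Cat', 'face']);
  final dog = LazyClass('Dog', blob, 1, 2);
  print(blob.reads); // 0: members are not decoded yet
  print(dog.members); // [face, bark]
  print(dog.members); // [face, bark], served from the cached field
  print(blob.reads); // 1
}
```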
[Figure: KERNEL AST BINARY on the left, ISOLATE GROUP HEAP on the right. A Class heap object now references one Field and two Function heap objects, each of which points back to its Field or Procedure node in the Kernel binary.]
At this point enough information is loaded from Kernel binary for runtime to successfully resolve and invoke methods. For example, it could resolve and invoke the main function from a library.
package:kernel/ast.dart defines classes describing the Kernel AST. package:front_end handles parsing Dart source and building Kernel AST from it. kernel::KernelLoader::LoadEntireProgram is an entry point for deserialization of Kernel AST into corresponding VM objects. pkg/vm/bin/kernel_service.dart implements the Kernel Service isolate, runtime/vm/kernel_isolate.cc glues Dart implementation to the rest of the VM. package:vm hosts most of the Kernel based VM specific functionality, e.g. various Kernel-to-Kernel transformations.
If you are interested in Kernel format and its VM specific usage, then you can use pkg/vm/bin/gen_kernel.dart to produce a Kernel binary file from Dart source. Resulting binary can then be dumped using pkg/vm/bin/dump_kernel.dart.
- Compile hello.dart to hello.dill Kernel binary using CFE
$ dart pkg/vm/bin/gen_kernel.dart \
--platform out/ReleaseX64/vm_platform_strong.dill \
-o hello.dill \
hello.dart
- Dump textual representation of Kernel AST.
$ dart pkg/vm/bin/dump_kernel.dart hello.dill hello.kernel.txt
When you try using gen_kernel.dart you will notice that it requires something called platform, a Kernel binary containing AST for all core libraries (dart:core, dart:async, etc). If you have Dart SDK build configured then you can just use platform file from the out directory, e.g. out/ReleaseX64/vm_platform_strong.dill. Alternatively you can use pkg/front_end/tool/_fasta/compile_platform.dart to generate the platform:
$ dart pkg/front_end/tool/_fasta/compile_platform.dart \
dart:core \
sdk/lib/libraries.json \
vm_outline.dill vm_platform.dill vm_outline.dill
Initially all functions have a placeholder instead of actual executable code for their bodies: they point to LazyCompileStub, which simply asks the runtime system to generate executable code for the current function and then tail-calls this newly generated code.
[Figure: the code_ field of a Function object points to LazyCompileStub, which behaves like this:]
code = CompileFunction(...)
return code(...);
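The stub mechanism can be modeled in Dart as a function object whose code field initially points at a stub that replaces itself on first call. This is a sketch only: VmFunction, invoke and the "compiled" closure are assumptions made for illustration and do not mirror the VM's C++ classes.

```dart
typedef Code = int Function(int a, int b);

class VmFunction {
  final String name;
  late Code code; // What a call to this function actually jumps to.
  int compilations = 0;

  VmFunction(this.name) {
    // Initially the function has no real code, only the stub.
    code = _lazyCompileStub;
  }

  // Asks the "runtime" to generate code, installs it, then continues
  // execution in the newly generated code.
  int _lazyCompileStub(int a, int b) {
    code = _compile();
    return code(a, b);
  }

  // Stand-in for the compiler: always produces an addition.
  Code _compile() {
    compilations++;
    return (a, b) => a + b;
  }

  int invoke(int a, int b) => code(a, b);
}

void main() {
  final add = VmFunction('add');
  print(add.invoke(1, 2)); // 3, compiled on this first call
  print(add.invoke(3, 4)); // 7, goes straight to the compiled code
  print(add.compilations); // 1
}
```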
When the function is compiled for the first time this is done by the unoptimizing compiler.
[Figure: unoptimized compilation of (a, b) => a + b in three stages]
Kernel AST: FunctionNode (a, b) => a + b;
Unoptimized IL: LoadLocal('a'), LoadLocal('b'), InstanceCall('+'), Return
Machine Code: push [rbp + ...], push [rbp + ...], call InlineCacheStub, retq
Unoptimizing compiler produces machine code in two passes:
- Serialized AST for the function's body is walked to generate a control flow graph (CFG) for the function body. CFG consists of basic blocks filled with intermediate language (IL) instructions. IL instructions used at this stage resemble instructions of a stack based virtual machine: they take operands from the stack, perform operations and then push results to the same stack.
- The resulting CFG is directly compiled to machine code using one-to-many lowering of IL instructions: each IL instruction expands to multiple machine language instructions.
There are no optimizations performed at this stage. The main goal of unoptimizing compiler is to produce executable code quickly.
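To make the stack discipline of unoptimized IL concrete, here is a small Dart 3 sketch that interprets a basic block of stack-style instructions. Note the deliberate simplification: the VM lowers such IL to machine code and does not interpret it, and the Il classes, dispatch and execute are invented for this illustration.

```dart
// IL instructions for this sketch, named after the unoptimized IL
// instructions that appear in this document.
sealed class Il {}

class LoadLocal extends Il {
  final String name;
  LoadLocal(this.name);
}

class InstanceCall extends Il {
  final String selector;
  InstanceCall(this.selector);
}

class Return extends Il {}

// Performs a binary call fully dynamically, as unoptimized code would.
Object? dispatch(String selector, dynamic receiver, dynamic argument) {
  switch (selector) {
    case '+':
      return receiver + argument;
    default:
      throw UnsupportedError('selector $selector');
  }
}

// Operands are taken from the stack and results are pushed back onto it.
Object? execute(List<Il> block, Map<String, Object?> locals) {
  final stack = <Object?>[];
  for (final instruction in block) {
    switch (instruction) {
      case LoadLocal(:final name):
        stack.add(locals[name]);
      case InstanceCall(:final selector):
        final argument = stack.removeLast();
        final receiver = stack.removeLast();
        stack.add(dispatch(selector, receiver, argument));
      case Return():
        return stack.removeLast();
    }
  }
  throw StateError('basic block ended without Return');
}

void main() {
  final body = [LoadLocal('a'), LoadLocal('b'), InstanceCall('+'), Return()];
  print(execute(body, {'a': 1, 'b': 2})); // 3
  print(execute(body, {'a': 'x', 'b': 'y'})); // xy
}
```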
This also means that unoptimizing compiler does not attempt to statically resolve any calls that were not resolved in Kernel binary, so calls (MethodInvocation or PropertyGet AST nodes) are compiled as if they were completely dynamic. VM currently does not use any form of virtual table or interface table based dispatch and instead implements dynamic calls using inline caching.
The core idea behind inline caching is to cache results of method resolution in a call site specific cache. Inline caching mechanism used by the VM consists of:
- a call site specific cache (UntaggedICData object) that maps receiver's class to a method that should be invoked if receiver is of a matching class. The cache also stores some auxiliary information, e.g. invocation frequency counters, which track how often the given class was seen at this call site;
- a shared lookup stub, which implements method invocation fast path. This stub searches through the given cache to see if it contains an entry that matches receiver's class. If the entry is found then the stub increments the frequency counter and tail-calls the cached method. Otherwise the stub invokes a runtime system helper which implements method resolution logic. If method resolution succeeds then the cache is updated and subsequent invocations will not need to enter runtime system.
The picture below illustrates the structure and the state of an inline cache associated with the animal.face call site, which was executed twice with an instance of Dog and once with an instance of Cat.
class Dog {
  get face => '🐶';
}
class Cat {
  get face => '🐱';
}
sameFace(animal, face) =>
  animal.face == face;
sameFace(Dog(), ...);
sameFace(Dog(), ...);
sameFace(Cat(), ...);
[Figure: the animal.face call site in sameFace points to its ICData and to the shared InlineCacheStub.]
ICData:
// class, method, frequency
[Dog, Dog.get:face, 2,
 Cat, Cat.get:face, 1]
InlineCacheStub:
idx = cache.indexOf(classOf(this));
if (idx != -1) {
  cache[idx + 2]++;  // frequency++
  return cache[idx + 1](...);
}
return InlineCacheMiss(...);
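The cache and stub can be modeled in plain Dart as follows. This is a sketch, not the VM's data layout: IcEntry, InlineCache and resolve are invented names, and counting the miss itself as the first hit is an assumption chosen so that the numbers match the picture.

```dart
class Dog {
  String get face => '🐶';
}

class Cat {
  String get face => '🐱';
}

typedef Method = Object? Function(Object receiver);

// One cache entry: receiver class, resolved method, frequency counter.
class IcEntry {
  final Type cls;
  final Method target;
  int count = 0;
  IcEntry(this.cls, this.target);
}

// Hypothetical stand-in for the runtime's method resolution logic.
Method resolve(Type cls, String selector) {
  if (selector == 'get:face') {
    if (cls == Dog) return (r) => (r as Dog).face;
    if (cls == Cat) return (r) => (r as Cat).face;
  }
  throw StateError('$cls has no method $selector');
}

class InlineCache {
  final String selector;
  final List<IcEntry> entries = [];
  int misses = 0;
  InlineCache(this.selector);

  // Fast path: find an entry for the receiver's class and call its target.
  Object? invoke(Object receiver) {
    for (final entry in entries) {
      if (entry.cls == receiver.runtimeType) {
        entry.count++;
        return entry.target(receiver);
      }
    }
    return _miss(receiver);
  }

  // Slow path: resolve the method, remember it, then call it.
  Object? _miss(Object receiver) {
    misses++;
    final target = resolve(receiver.runtimeType, selector);
    entries.add(IcEntry(receiver.runtimeType, target)..count = 1);
    return target(receiver);
  }
}

void main() {
  final cache = InlineCache('get:face');
  for (final animal in [Dog(), Dog(), Cat()]) {
    cache.invoke(animal);
  }
  for (final entry in cache.entries) {
    print('${entry.cls}: ${entry.count}'); // Dog: 2, then Cat: 1
  }
  print(cache.misses); // 2
}
```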
Unoptimizing compiler by itself is enough to execute any possible Dart code. However the code it produces is rather slow, which is why VM also implements an adaptive optimizing compilation pipeline. The idea behind adaptive optimization is to use the execution profile of a running program to drive optimization decisions.
As unoptimized code is running it collects the following information:
- As described above, inline caches collect information about receiver types observed at callsites;
- Execution counters associated with functions and basic blocks within functions track hot regions of the code.
When an execution counter associated with a function reaches a certain threshold, this function is submitted to a background optimizing compiler for optimization.
Optimizing compilation starts in the same way as unoptimizing compilation does: by walking serialized Kernel AST to build unoptimized IL for the function that is being optimized. However instead of directly lowering that IL into machine code, optimizing compiler proceeds to translate unoptimized IL into static single assignment (SSA) form based optimized IL. SSA based IL is then subjected to speculative specialization based on the collected type feedback and passed through a sequence of classical and Dart specific optimizations: e.g. inlining, range analysis, type propagation, representation selection, store-to-load and load-to-load forwarding, global value numbering, allocation sinking, etc. At the end optimized IL is lowered into machine code using linear scan register allocator and a simple one-to-many lowering of IL instructions.
Once compilation is complete background compiler requests mutator thread to enter a safepoint and attaches optimized code to the function.
The next time this function is called it will use the optimized code. Some functions contain very long running loops, and for those it makes sense to switch execution from unoptimized to optimized code while the function is still running. This process is called on-stack replacement (OSR), owing its name to the fact that a stack frame for one version of the function is transparently replaced with a stack frame for another version of the same function.
[Figure: optimizing compilation of (a, b) => a + b. In hot code ICs have collected type feedback.]
Kernel AST: FunctionNode (a, b) => a + b;
Unoptimized IL (the :1 suffix is the deopt id, and the call references its ICData):
LoadLocal('a')
LoadLocal('b')
InstanceCall:1('+')
Return
ICData: [(Smi, Smi.+, 10000)]
SSA IL:
v1<-Parameter('a')
v2<-Parameter('b')
v3<-InstanceCall:1('+', v1, v2)
Return(v3)
Optimized SSA IL:
v1<-Parameter('a')
v2<-Parameter('b')
CheckSmi:1(v1)
CheckSmi:1(v2)
v3<-BinarySmiOp:1(+, v1, v2)
Return(v3)
Machine Code:
movq rax, [rbp+...]
testq rax, 1
jnz ->deopt@1
movq rbx, [rbp+...]
testq rbx, 1
jnz ->deopt@1
addq rax, rbx
jo ->deopt@1
retq
Compiler sources are in the runtime/vm/compiler directory. Compilation pipeline entry point is CompileParsedFunctionHelper::Compile. IL is defined in runtime/vm/compiler/backend/il.h. Kernel-to-IL translation starts in kernel::StreamingFlowGraphBuilder::BuildGraph, and this function also handles construction of IL for various artificial functions. compiler::StubCodeCompiler::GenerateNArgsCheckInlineCacheStub generates machine code for inline-cache stub, while InlineCacheMissHandler handles IC misses. runtime/vm/compiler/compiler_pass.cc defines optimizing compiler passes and their order. JitCallSpecializer does most of the type-feedback based specializations.
VM also has flags which can be used to control JIT and to make it dump IL and generated machine code for the functions that are being compiled by the JIT.
Flag | Description |
---|---|
--print-flow-graph[-optimized] | Print IL for all (or only optimized) compilations |
--disassemble[-optimized] | Disassemble all (or only optimized) compiled functions |
--print-flow-graph-filter=xyz,abc,... | Restrict output triggered by previous flags only to the functions which contain one of the comma separated substrings in their names |
--compiler-passes=... | Fine control over compiler passes: force IL to be printed before/after a certain pass. Disable passes by name. Pass help for more information |
--no-background-compilation | Disable background compilation, and compile all hot functions on the main thread. Useful for experimentation, otherwise short running programs might finish before background compiler compiles hot function |
--deterministic | Disable various sources of non-determinism in the VM (concurrent GC and compiler). |
For example the following command will run test.dart and dump optimized IL and machine code for functions that contain myFunction in their names:
$ dart --print-flow-graph-optimized \
--disassemble-optimized \
--print-flow-graph-filter=myFunction \
--no-background-compilation \
test.dart
It is important to highlight that the code generated by optimizing compiler is specialized under speculative assumptions based on the execution profile of the application. For example, a dynamic call site that only observed instances of a single class C as a receiver will be converted into a direct call preceded by a check verifying that receiver has an expected class C. However these assumptions might be violated later during execution of the program:
void printAnimal(obj) {
print('Animal {');
print(' ${obj.toString()}');
print('}');
}
// Call printAnimal(...) a lot of times with an instance of Cat.
// As a result printAnimal(...) will be optimized under the
// assumption that obj is always a Cat.
for (var i = 0; i < 50000; i++)
printAnimal(Cat());
// Now call printAnimal(...) with a Dog - optimized version
// can not handle such an object, because it was
// compiled under assumption that obj is always a Cat.
// This leads to deoptimization.
printAnimal(Dog());
Whenever optimized code is making some optimistic assumptions, which might be violated during the execution, it needs to guard against such violations and be able to recover if they occur.
This process of recovery is known as deoptimization: whenever optimized version hits a case which it can't handle, it simply transfers execution into the matching point of unoptimized function and continues execution there. Unoptimized version of a function does not make any assumptions and can handle all possible inputs.
VM usually discards optimized version of the function after deoptimization and then reoptimizes it again later - using updated type feedback.
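The interplay of execution counters, speculation and eager deoptimization can be sketched as a toy model. Everything here is an assumption made for illustration: TieredFunction, the tiny threshold of 3, and the rule that only monomorphic feedback triggers specialization. The real VM compiles in the background and transfers execution between frames rather than calling another Dart method.

```dart
class Cat {
  @override
  String toString() => 'Cat';
}

class Dog {
  @override
  String toString() => 'Dog';
}

class TieredFunction {
  static const optimizationThreshold = 3; // Tiny, for demonstration only.
  int usageCounter = 0;
  int deoptimizations = 0;
  final Map<Type, int> feedback = {}; // Receiver classes seen so far.
  Type? _speculatedClass; // Non-null while "optimized code" is installed.

  bool get isOptimized => _speculatedClass != null;

  String invoke(Object obj) {
    final speculated = _speculatedClass;
    if (speculated != null) {
      // Inline check guarding the speculative assumption.
      if (obj.runtimeType == speculated) return _optimized(obj);
      // Guard failed: discard optimized code, continue unoptimized.
      deoptimizations++;
      _speculatedClass = null;
      usageCounter = 0;
    }
    return _unoptimized(obj);
  }

  // Handles every input and collects type feedback while doing so.
  String _unoptimized(Object obj) {
    feedback.update(obj.runtimeType, (n) => n + 1, ifAbsent: () => 1);
    usageCounter++;
    if (usageCounter >= optimizationThreshold && feedback.length == 1) {
      _speculatedClass = feedback.keys.single;
    }
    return 'Animal { $obj }';
  }

  // Stands in for code specialized for the speculated class.
  String _optimized(Object obj) => 'Animal { $obj }';
}

void main() {
  final printAnimal = TieredFunction();
  for (var i = 0; i < 5; i++) {
    printAnimal.invoke(Cat());
  }
  print(printAnimal.isOptimized); // true
  print(printAnimal.invoke(Dog())); // Animal { Dog }
  print(printAnimal.deoptimizations); // 1
  print(printAnimal.isOptimized); // false
}
```

In this model the function stays unoptimized after the Dog call because the feedback is no longer monomorphic. A fuller model would reoptimize using the updated feedback.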
[...] print). The VM matches instructions that deoptimize to positions in the unoptimized code using deopt ids.
There are two ways VM guards speculative assumptions made by the compiler:
- Inline checks (e.g. CheckSmi, CheckClass IL instructions) that verify if assumption holds at use site where compiler made this assumption. For example, when turning dynamic calls into direct calls compiler adds these checks right before a direct call. Deoptimization that happens on such checks is called eager deoptimization, because it occurs eagerly as the check is reached.
- Global guards which instruct runtime to discard optimized code when it changes something that optimized code relies on. For example, optimizing compiler might observe that some class C is never extended and use this information during type propagation pass. However subsequent dynamic code loading or class finalization can introduce a subclass of C, which invalidates the assumption. At this point runtime needs to find and discard all optimized code that was compiled under the assumption that C has no subclasses. It is possible that runtime finds some of the now invalid optimized code on the execution stack, in which case affected frames are marked for deoptimization and will deoptimize when execution returns to them. This sort of deoptimization is called lazy deoptimization, because it is delayed until control returns back to the optimized code.
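A global guard of the "class C has no subclasses" kind can be sketched as a dependency list that the runtime walks when the class hierarchy changes. ClassHierarchy, OptimizedCode and their members are hypothetical names for this model. Marking frames on the stack for lazy deoptimization is not modeled and is only indicated by a comment.

```dart
class OptimizedCode {
  final String function;
  bool valid = true;
  OptimizedCode(this.function);
}

class ClassHierarchy {
  final Map<String, Set<String>> _subclasses = {};
  // Code compiled under the assumption that the keyed class is a leaf.
  final Map<String, List<OptimizedCode>> _leafDependents = {};

  bool hasSubclasses(String cls) => _subclasses[cls]?.isNotEmpty ?? false;

  // Called by the "compiler" when it relies on `cls` having no subclasses.
  void registerLeafAssumption(String cls, OptimizedCode code) {
    if (hasSubclasses(cls)) {
      throw StateError('$cls already has subclasses');
    }
    _leafDependents.putIfAbsent(cls, () => []).add(code);
  }

  // Called by the "runtime" when a new class is loaded or finalized.
  void addSubclass(String superclass, String subclass) {
    _subclasses.putIfAbsent(superclass, () => {}).add(subclass);
    final dependents = _leafDependents.remove(superclass) ?? [];
    for (final code in dependents) {
      // Discard the code. Frames of this code that are currently on the
      // stack would be marked and deoptimize lazily when control returns.
      code.valid = false;
    }
  }
}

void main() {
  final hierarchy = ClassHierarchy();
  final code = OptimizedCode('printAnimal');
  hierarchy.registerLeafAssumption('C', code);
  print(code.valid); // true
  hierarchy.addSubclass('C', 'D');
  print(code.valid); // false
}
```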
Deoptimizer machinery resides in runtime/vm/deopt_instructions.cc. It is essentially a mini-interpreter for deoptimization instructions which describe how to reconstruct needed state of the unoptimized code from the state of optimized code. Deoptimization instructions are generated by CompilerDeoptInfo::CreateDeoptInfo for every potential deoptimization location in optimized code during compilation.
Flag --trace-deoptimization makes VM print information about the cause and location of every deoptimization that occurs. --trace-deoptimization-verbose makes VM print a line for every deoptimization instruction it executes during deoptimization.
Running from Snapshots
VM has the ability to serialize isolate's heap or more precisely object graph residing in the heap into a binary snapshot. Snapshot then can be used to recreate the same state when starting VM isolates.
[Figure: an object graph in the HEAP is serialized into a binary SNAPSHOT, which is then deserialized to recreate the same object graph in the heaps of newly started isolates.]
Snapshot's format is low level and optimized for fast startup - it is essentially a list of objects to create and instructions on how to connect them together. That was the original idea behind snapshots: instead of parsing Dart source and gradually creating internal VM data structures, VM can just spin an isolate up with all necessary data structures quickly unpacked from the snapshot.
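The idea of "a list of objects to create and instructions on how to connect them" can be illustrated with a toy object graph. This Dart 3 sketch uses an invented Node class and an in-memory Snapshot made of labels and edges. The real snapshot format is binary and far more elaborate.

```dart
class Node {
  final String label;
  final List<Node> refs = [];
  Node(this.label);
}

// A toy snapshot: the objects to create (by label), followed by
// instructions on how to connect them (from index, to index).
class Snapshot {
  final List<String> labels;
  final List<(int, int)> edges;
  Snapshot(this.labels, this.edges);
}

Snapshot serialize(Node root) {
  final ids = <Node, int>{};
  final order = <Node>[];
  // Assign an id to every reachable node exactly once, even in cycles.
  void visit(Node node) {
    if (ids.containsKey(node)) return;
    ids[node] = order.length;
    order.add(node);
    node.refs.forEach(visit);
  }

  visit(root);
  return Snapshot(
    [for (final node in order) node.label],
    [
      for (final node in order)
        for (final ref in node.refs) (ids[node]!, ids[ref]!),
    ],
  );
}

Node deserialize(Snapshot snapshot) {
  // First create all objects, then connect them.
  final nodes = [for (final label in snapshot.labels) Node(label)];
  for (final (from, to) in snapshot.edges) {
    nodes[from].refs.add(nodes[to]);
  }
  return nodes[0];
}

void main() {
  final a = Node('a'), b = Node('b'), c = Node('c');
  a.refs.addAll([b, c]);
  b.refs.add(c);
  c.refs.add(a); // A cycle back to the root.

  final snapshot = serialize(a);
  print(snapshot.labels); // [a, b, c]
  print(snapshot.edges); // [(0, 1), (0, 2), (1, 2), (2, 0)]

  final copy = deserialize(snapshot);
  print(identical(copy.refs[1].refs[0], copy)); // true
}
```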
Initially snapshots did not include machine code, however this capability was later added when AOT compiler was developed. Motivation for developing AOT compiler and snapshots-with-code was to allow VM to be used on the platforms where JITing is impossible due to platform level restrictions.
Snapshots-with-code work almost in the same way as normal snapshots with a minor difference: they include a code section which unlike the rest of the snapshot does not require deserialization. This code section is laid out in a way that allows it to directly become part of the heap after it was mapped into memory.
[Figure: a heap containing data and code is serialized into a SNAPSHOT that consists of a data part and a code part. The data part is deserialized into the heap of a new isolate, while the code part is used directly, without deserialization.]
runtime/vm/app_snapshot.cc handles serialization and deserialization of snapshots. A family of API functions Dart_CreateXyzSnapshot[AsAssembly] is responsible for writing out snapshots of the heap (e.g. Dart_CreateAppJITSnapshotAsBlobs and Dart_CreateAppAOTSnapshotAsAssembly). On the other hand Dart_CreateIsolateGroup optionally takes snapshot data to start an isolate from.
Running from AppJIT snapshots
AppJIT snapshots were introduced to reduce JIT warm up time for large Dart applications like dartanalyzer or dart2js. When these tools are used on small projects they spend as much time doing actual work as VM spends JIT compiling these apps.
AppJIT snapshots address this problem: an application can be run on the VM using some mock training data and then all generated code and VM internal data structures are serialized into an AppJIT snapshot. This snapshot can then be distributed instead of distributing application in the source (or Kernel binary) form. VM starting from this snapshot can still JIT if it turns out that execution profile on the real data does not match execution profile observed during training.
[Figure: a heap containing data and code is serialized into a SNAPSHOT with a data part and a code part, which is then deserialized into the heap of a new isolate. An isolate started from an AppJIT snapshot can JIT more code.]
The dart binary will generate an AppJIT snapshot after running the application if you pass --snapshot-kind=app-jit --snapshot=path-to-snapshot to it. Here is an example of generating and using an AppJIT snapshot for dart2js.
- Run from source in JIT mode
$ dart pkg/compiler/lib/src/dart2js.dart -o hello.js hello.dart
Dart file (hello.dart) compiled to JavaScript: hello.js
- Create an app-jit snapshot trained by compiling dart2js with itself, then run from this snapshot.
$ dart --snapshot-kind=app-jit --snapshot=dart2js.snapshot \
pkg/compiler/lib/src/dart2js.dart -o hello.js hello.dart
Dart file (hello.dart) compiled to JavaScript: hello.js
$ dart dart2js.snapshot -o hello.js hello.dart
Dart file (hello.dart) compiled to JavaScript: hello.js
Running from AppAOT snapshots
AOT snapshots were originally introduced for platforms which make JIT compilation impossible, but they can also be used in situations where fast startup and consistent performance are worth a potential peak performance penalty.
Inability to JIT implies that:
- AOT snapshot must contain executable code for each and every function that could be invoked during application execution;
- the executable code must not rely on any speculative assumptions that could be violated during execution.
To satisfy these requirements the process of AOT compilation does global static analysis (type flow analysis or TFA) to determine which parts of the application are reachable from known set of entry points, instances of which classes are allocated and how types flow through the program. All of these analyses are conservative: meaning that they err on the side of correctness - which is in stark contrast with JIT which can err on the side of performance, because it can always deoptimize into unoptimized code to implement correct behavior.
All potentially reachable functions are then compiled to native code without any speculative optimizations. However type flow information is still used to specialize the code (e.g. devirtualize calls).
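The reachability part of this analysis can be sketched as a worklist traversal over a call graph. This is a heavily simplified model: reachableFunctions and the string-keyed call graph are assumptions for illustration, while the real type flow analysis also tracks allocated classes and how types flow through the program.

```dart
// Returns every function reachable from the entry points. A function that
// may be called is always included, which keeps the result conservative.
Set<String> reachableFunctions(
  Map<String, List<String>> callGraph,
  Iterable<String> entryPoints,
) {
  final reachable = <String>{};
  final worklist = [...entryPoints];
  while (worklist.isNotEmpty) {
    final function = worklist.removeLast();
    // `add` returns false if the function was already visited.
    if (!reachable.add(function)) continue;
    worklist.addAll(callGraph[function] ?? const <String>[]);
  }
  return reachable;
}

void main() {
  // Hypothetical program: `unusedHelper` is never called from `main`.
  final callGraph = {
    'main': ['print', 'sameFace'],
    'sameFace': ['Dog.get:face', 'Cat.get:face'],
    'unusedHelper': ['print'],
  };
  final reachable = reachableFunctions(callGraph, ['main']);
  print(reachable.contains('Cat.get:face')); // true
  print(reachable.contains('unusedHelper')); // false
}
```

Functions outside the reachable set do not need to be compiled or included in the snapshot.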
Once all functions are compiled a snapshot of the heap can be taken.
Resulting snapshot can then be run using precompiled runtime, a special variant of the Dart VM which excludes components like JIT and dynamic code loading facilities.
[Figure: AOT compilation pipeline. Dart Source -> CFE -> Kernel AST (whole program) -> TFA -> Kernel AST (inferred, treeshaken) -> VM -> AOT Snapshot. Type-flow analysis propagates types globally through the whole program. VM contains an AOT compilation pipeline which reuses parts of the JIT pipeline.]
package:vm/transformations/type_flow/transformer.dart is an entry point to the type flow analysis and transformation based on TFA results. Precompiler::DoCompileAll is an entry point to the AOT compilation loop in the VM.
The AOT compilation pipeline is currently packaged into the Dart SDK as the dart compile exe command.
$ dart compile exe -o hello hello.dart
$ ./hello
Hello, World!
It is possible to pass options like --print-flow-graph-optimized and --disassemble-optimized to dart compile exe via the --extra-gen-snapshot-options flag. You also need to pass --verbose to see the output, otherwise it is silently swallowed by the tool.
$ dart compile exe --verbose \
--extra-gen-snapshot-options=--print-flow-graph-optimized \
--extra-gen-snapshot-options=--print-flow-graph-filter=main \
--extra-gen-snapshot-options=--disassemble \
hello.dart
Runtime System
Object Model
Representation of Types
GC
See GC.
Compiler
Method Calls
There is currently a large difference between how AOT and JIT optimize method invocation sequences.
JIT keeps to its Dart 1 roots and largely ignores the statically typed nature of Dart 2. In unoptimized code, method calls by default go through an inline cache which collects type feedback. The optimizing compiler then speculatively specializes indirect method calls into direct calls guarded by class checks. This process is called speculative devirtualization. Call sites which can't be devirtualized are divided into two categories. Those which have not been executed yet are compiled to use inline caching and collect type feedback for subsequent reoptimizations. Those which are highly polymorphic (megamorphic) are compiled to use megamorphic dispatch.
On the other hand, AOT leans heavily on the statically typed nature of Dart 2. The compiler uses the results of global type flow analysis (TFA) to devirtualize as many call sites as it can. This devirtualization is not speculative: the compiler only devirtualizes a call site if it can prove that it always invokes a specific method. If the compiler cannot devirtualize a call site, it chooses a dispatch mechanism based on whether the receiver's static type is dynamic or not. Calls on a dynamic receiver use switchable calls. All other calls go through a global dispatch table.
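As a small illustration of the two kinds of receivers, consider the sketch below. The classes and function names are hypothetical, and which mechanism the AOT compiler actually picks for a given call site depends on what TFA can prove (a call that is proven to have a single target would be devirtualized instead).

```dart
class Animal {
  String speak() => 'generic noise';
}

class Dog extends Animal {
  @override
  String speak() => 'woof';
}

/// Receiver has static type [Animal]. With both Animal and Dog allocated,
/// the target cannot be proven unique, so per the text above this call
/// would go through the global dispatch table.
String speakTyped(Animal a) => a.speak();

/// Receiver has static type `dynamic`. Per the text above, AOT would
/// compile this call as a switchable call.
String speakDynamic(dynamic a) => a.speak();

void main() {
  for (final a in <Animal>[Animal(), Dog()]) {
    print('${speakTyped(a)} / ${speakDynamic(a)}');
  }
}
```

Both functions produce the same results at run time. Only the call sequence chosen by the compiler differs.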
Global Dispatch Table (GDT)
Imagine for a moment that each class defined in the program added its methods to a global dictionary. For example, given the following class hierarchy:
class A {
  void foo() { }
  void bar() { }
}
class B extends A {
  void foo() { }
  void baz() { }
}
This dictionary will contain the following:
globalDispatchTable = {
  // Calling [foo] on an instance of [A] hits [A.foo].
  (A, #foo): A.foo,
  // Calling [bar] on an instance of [A] hits [A.bar].
  (A, #bar): A.bar,
  // Calling [foo] on an instance of [B] hits [B.foo].
  (B, #foo): B.foo,
  // Calling [bar] on an instance of [B] hits [A.bar].
  (B, #bar): A.bar,
  // Calling [baz] on an instance of [B] hits [B.baz].
  (B, #baz): B.baz
};
The compiler could then use such a dictionary to dispatch invocations: a method call o.m(...) will be compiled into globalDispatchTable[(classOf(o), #m)](o, ...).
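The dictionary idea can be modelled directly in Dart. The following is a minimal sketch, assuming Dart 3 records as keys. Class and selector names are plain strings, targets are closures returning the name of the method that was hit, and invoke is a hypothetical helper rather than anything the VM defines.

```dart
typedef Method = String Function();

/// Toy dictionary keyed by (class name, selector name). The strings stand
/// in for real classes, selectors and compiled code.
final Map<(String, String), Method> globalDispatchTable = {
  ('A', 'foo'): () => 'A.foo',
  ('A', 'bar'): () => 'A.bar',
  ('B', 'foo'): () => 'B.foo',
  ('B', 'bar'): () => 'A.bar', // Inherited from A.
  ('B', 'baz'): () => 'B.baz',
};

/// Models `o.m()` compiled as `globalDispatchTable[(classOf(o), #m)](o)`.
/// Returns 'NSM' when the dictionary has no entry for the pair.
String invoke(String classOfReceiver, String selector) {
  final target = globalDispatchTable[(classOfReceiver, selector)];
  return target == null ? 'NSM' : target();
}

void main() {
  print(invoke('B', 'bar')); // A.bar
  print(invoke('A', 'baz')); // NSM
}
```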
A naive approach to representing globalDispatchTable (or gdt for short) is to number all classes and all method selectors in the program sequentially and then use a two-dimensional array: gdt[(classOf(o), #m)] becomes gdt[o.cid][#m.id]. At this point we can choose to flatten this two-dimensional array either using selector-major order (gdt[numClasses * #m.id + o.cid]) or class-major order (gdt[numSelectors * o.cid + #m.id]).
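The two flattening schemes differ only in which id is multiplied by the row length. A short sketch, with hypothetical helper names and made-up table dimensions:

```dart
/// Index into a flattened table stored in selector-major order.
int selectorMajorIndex(int cid, int sid, {required int numClasses}) =>
    numClasses * sid + cid;

/// Index into a flattened table stored in class-major order.
int classMajorIndex(int cid, int sid, {required int numSelectors}) =>
    numSelectors * cid + sid;

void main() {
  const numClasses = 4, numSelectors = 2;
  // Class with id 2, selector with id 1.
  print(selectorMajorIndex(2, 1, numClasses: numClasses)); // 6
  print(classMajorIndex(2, 1, numSelectors: numSelectors)); // 5
}
```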
Let us take a look at selector-major order. In this representation we say that numClasses * #m.id gives us the selector offset: an offset into the GDT at which a row of entries (one per class) corresponding to this selector is stored. Consider the following class hierarchy:
class A {
  void foo() { }
}
class B extends A {
  void foo() { }
}
class C {
  void bar() { }
}
class D extends C {
  void bar() { }
}
Classes A, B, C and D will be numbered 0, 1, 2 and 3 respectively, while selectors foo and bar will be numbered 0 and 1. This will lead to the following array:
offset    0                       4
          │A     B     C     D    │
          ┌─────┬─────┬─────┬─────┐
foo row   │A.foo│B.foo│ NSM │ NSM │
          └─────┴─────┴─────┴─────┘
                                  │A     B     C     D
                                  ┌─────┬─────┬─────┬─────┐
bar row                           │ NSM │ NSM │C.bar│D.bar│
                                  └─────┴─────┴─────┴─────┘
          ┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
GDT       │A.foo│B.foo│ NSM │ NSM │ NSM │ NSM │C.bar│D.bar│
          └─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘
It's evident that such a representation is rather memory inefficient: the dispatch table ends up with a lot of NSM (noSuchMethod) entries.
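To make the waste concrete, the naive selector-major table for this hierarchy can be built in a few lines. This is a sketch in which class ids, selector ids and targets are plain ints and strings, and buildNaiveTable is a hypothetical helper.

```dart
/// Methods per selector: selector id -> {class id -> target}.
/// Classes A, B, C, D have ids 0..3; foo and bar have ids 0 and 1.
const rows = <int, Map<int, String>>{
  0: {0: 'A.foo', 1: 'B.foo'},
  1: {2: 'C.bar', 3: 'D.bar'},
};

/// Builds the flattened selector-major table, filling holes with 'NSM'.
List<String> buildNaiveTable(int numClasses) {
  final table = List<String>.filled(numClasses * rows.length, 'NSM');
  rows.forEach((sid, row) {
    row.forEach((cid, target) {
      table[numClasses * sid + cid] = target;
    });
  });
  return table;
}

void main() {
  final table = buildNaiveTable(4);
  print(table); // [A.foo, B.foo, NSM, NSM, NSM, NSM, C.bar, D.bar]
  print(table.where((e) => e == 'NSM').length); // 4
}
```

Half of the eight entries in this small example are NSM placeholders.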
Fortunately, the Dart 2 static type system provides us with a way to compress this table. In Dart 2, the static type of the receiver constrains the list of selectors allowed by the compiler. This guarantees that any non-dynamic invocation calls an actual method rather than Object.noSuchMethod. Consequently, if we only use the dispatch table for non-dynamic call sites, then we don't need to fill holes in the table with NSM entries.
This leads to the following idea: instead of numbering selectors sequentially and using numClasses * sid as a selector offset, we could choose selector offsets that cause selector rows to interleave and reuse available holes.
Let us look back at the previous example with 4 classes. Instead of numbering foo with 0 and bar with 1 and using 0 and 4 as selector offsets respectively, we could simply assign both selectors an offset of 0, leading to the following compact table:
offset    0
          │A     B     C     D
          ┌─────┬─────┬─────┬─────┐
foo row   │A.foo│B.foo│░░░░░░░░░░░│
          └─────┴─────┴─────┴─────┘
           A     B     C  ▲  D    hole
          ┌─────┬─────┬─────┬─────┐
bar row   │░░░░░░░░░░░│C.bar│D.bar│
          └─────┴─────┴─────┴─────┘
          ┌─────┬─────┬─────┬─────┐
GDT       │A.foo│B.foo│C.bar│D.bar│
          └─────┴─────┴─────┴─────┘
This works because it is impossible to invoke bar on A or B, and it is impossible to invoke foo on C or D. This means, for example, that the A.foo entry will never be hit with an instance of C as receiver.
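One simple way to find such offsets is a greedy first-fit search: slide each selector row to the right until none of its entries collides with an occupied slot. The sketch below applies this to the four-class example. It is only an illustration of the idea, not the algorithm actually used by the VM, and exampleRows and packRows are hypothetical names (the code uses Dart 3 records).

```dart
/// One row per selector: class id -> target. Classes A..D have ids 0..3.
const exampleRows = <String, Map<int, String>>{
  'foo': {0: 'A.foo', 1: 'B.foo'},
  'bar': {2: 'C.bar', 3: 'D.bar'},
};

/// Greedy first-fit packing. Returns the selector offsets and the table.
(Map<String, int>, List<String?>) packRows(
    Map<String, Map<int, String>> rows) {
  final offsets = <String, int>{};
  final table = <String?>[];
  for (final entry in rows.entries) {
    final row = entry.value;
    var offset = 0;
    // Slide the row right until none of its entries hits an occupied slot.
    while (row.keys.any((cid) =>
        offset + cid < table.length && table[offset + cid] != null)) {
      offset++;
    }
    offsets[entry.key] = offset;
    row.forEach((cid, target) {
      while (table.length <= offset + cid) {
        table.add(null); // Grow the table; null marks an unused slot.
      }
      table[offset + cid] = target;
    });
  }
  return (offsets, table);
}

void main() {
  final (offsets, table) = packRows(exampleRows);
  print(offsets); // {foo: 0, bar: 0}
  print(table); // [A.foo, B.foo, C.bar, D.bar]
}
```

A call then looks up table[offsets[selector] + cid], which mirrors the selector offset plus class id addressing described above.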
Calls through GDT compile to the following machine code in Dart VM (X64 example):
movzx cid, word ptr [obj + 15] ; load receiver's class id
call [GDT + cid * 8 + (selectorOffset - 16) * 8]
Here GDT is a reserved register containing a biased pointer to the GDT (&GDT[16] on X64) and selectorOffset is the offset of the selector we are invoking. The call looks similar across architectures, though the concrete value of the bias (specified by DispatchTable::kOriginElement) depends on the target architecture. We bias the GDT pointer to get a more compact call sequence encoding for smaller selector offsets, e.g. on X64 an indirect call has an encoding which allows for a 1-byte signed immediate offset. This means that immediate offsets in the range -128 to 127 are represented as a single byte. With an unbiased GDT pointer we would only be able to utilize half of this range because selectorOffset is an unsigned value. With a biased GDT pointer we can use the full range: selectorOffset 15 still requires just a one-byte encoding.
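The effect of the bias can be checked with a little arithmetic. The sketch below assumes only what the text states: 8-byte table entries, a displacement of (selectorOffset - bias) * 8 and a 1-byte signed immediate covering -128 to 127. The helper names are hypothetical.

```dart
/// Size of one table entry in bytes, as in the X64 call sequence above.
const entrySize = 8;

/// Whether the displacement for [selectorOffset] fits in a signed byte.
bool fitsInOneByte(int selectorOffset, {required int bias}) {
  final displacement = (selectorOffset - bias) * entrySize;
  return -128 <= displacement && displacement <= 127;
}

/// Largest selector offset, counting up from 0, that still fits in one byte.
int maxOneByteOffset({required int bias}) {
  var offset = 0;
  while (fitsInOneByte(offset + 1, bias: bias)) {
    offset++;
  }
  return offset;
}

void main() {
  print(maxOneByteOffset(bias: 0)); // 15
  print(maxOneByteOffset(bias: 16)); // 31
}
```

Under these assumptions a bias of 16 doubles the number of selector offsets that get the short encoding, from 16 (offsets 0 to 15) to 32 (offsets 0 to 31).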
Computation of the global dispatch table is spread through different parts of the toolchain.
- TableSelectorAssigner is responsible for assigning selector ids to methods in the program.
- DispatchTableCallInstr is an IL instruction representing a call through GDT.
- AotCallSpecializer::ReplaceInstanceCallsWithDispatchTableCalls is a compiler pass which replaces non-devirtualized method calls with GDT calls.
- FlowGraphCompiler::EmitDispatchTableCall emits architecture specific call sequence for calls through GDT.
- compiler::DispatchTableGenerator is responsible for assigning selector offsets and computing final layout of the table.
Switchable Calls
A switchable call is an extension of inline caching originally developed for Dart 1 AOT, where it was used to compile all method calls. Current AOT only uses switchable calls when compiling calls with a dynamic receiver. They are also used in JIT to speed up calls from unoptimized code.
The JIT section already described that each inline cache associated with a call site consists of two pieces: a cache object (represented by an instance of UntaggedICData) and a chunk of native code to invoke (e.g. an inline cache stub). The original implementation in JIT only updated the cache itself. This was later extended to allow the runtime system to update both the cache and the stub target depending on the types observed by the call site.
                     UnlinkedCall
          cache    ┌────────────────────────────────────┐
        ┌─────────▶│targetName: "method"                │
        │          └────────────────────────────────────┘
        ┴
object.method()
        ┬            SwitchableCallMissStub
        │          ┌────────────────────────────────────┐
        └─────────▶│ return DRT_SwitchableCallMiss(...);│
          target   └────────────────────────────────────┘
Initially all dynamic calls in AOT start in the unlinked state. When such a call site is reached for the first time, SwitchableCallMissStub is invoked, which simply calls into the runtime helper DRT_SwitchableCallMiss to link this call site.
If possible, DRT_SwitchableCallMiss tries to transition the call site into a monomorphic state. In this state the call site turns into a direct call, which enters the method through a special entry point which verifies that the receiver has the expected class.
          cache    ┌────────────────────────────────────┐
        ┌─────────▶│id of class C                       │
        │          └────────────────────────────────────┘
        ┴
object.method()
        ┬            C.method
        │          ┌────────────────────────────────────┐
        └─────────▶│ monomorphic_entry:                 │
          target   │   // Check if receiver's cid       │
                   │   // matches the cached one.       │
                   │   if (this.cid != cache)           │
                   │     return SwitchableCallMissStub()│
   normal calls    │   // fall through to normal entry  │
   enter here ────▶│ normal_entry:                      │
                   │   // Body of C.method              │
                   └────────────────────────────────────┘
In the example above we assume that obj was an instance of C and that obj.method resolved to C.method when obj.method() was executed for the first time.
Next time we execute the same call site it will invoke C.method directly, bypassing the method lookup process. However, it will enter C.method through a special entry point, which will verify that obj is still an instance of C. If that is not the case, DRT_SwitchableCallMiss will be invoked and will update the call site state to reflect the miss.
C.method might still be a valid target for an invocation, e.g. obj is an instance of the class D which extends C but does not override C.method. In this case we check if the call site could transition into a single target state, implemented by SingleTargetCallStub (see also UntaggedSingleTargetCache).
                     SingleTargetCache
          cache    ┌────────────────────────────────────┐
        ┌─────────▶│fromCid, toCid, target              │
        │          └────────────────────────────────────┘
        ┴
object.method()      SingleTargetCallStub
        ┬          ┌────────────────────────────────────┐
        └─────────▶│ if (cache.fromCid <= this.cid &&   │
          target   │     this.cid <= cache.toCid)       │
                   │   return cache.target(...);        │
                   │ // Not found                       │
                   │ return SwitchableCallMissStub(...);│
                   └────────────────────────────────────┘
This stub benefits from depth-first class id assignment done by the AOT compiler and during AppJIT snapshot training. In this mode most classes are assigned integer ids using a depth-first traversal of the inheritance hierarchy. If C is a base class with subclasses D0, ..., Dn and none of those override C.method, then C.:cid <= classId(obj) <= max(D0.:cid, ..., Dn.:cid) implies that obj.method resolves to C.method. In such cases, instead of comparing for equality (which checks for a specific class), we can check whether the class id falls into a specific range, which covers all subclasses of C. That's exactly what SingleTargetCallStub does.
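The range property follows directly from pre-order numbering, as the toy sketch below shows. The hierarchy, the class names and the helpers assignCids and inCidRange are all hypothetical, and the example assumes that neither D0 nor D1 overrides C.method.

```dart
/// Toy hierarchy: class name -> direct subclasses.
const subclasses = <String, List<String>>{
  'Object': ['C', 'E'],
  'C': ['D0', 'D1'],
  'D0': [],
  'D1': [],
  'E': [],
};

/// Assigns class ids in depth-first pre-order and records, for each class,
/// the largest id found in its subtree.
void assignCids(String cls, Map<String, int> cid, Map<String, int> maxCid) {
  cid[cls] = cid.length;
  for (final sub in subclasses[cls]!) {
    assignCids(sub, cid, maxCid);
  }
  maxCid[cls] = cid.length - 1;
}

/// The check performed in the single target state.
bool inCidRange(int fromCid, int toCid, int receiverCid) =>
    fromCid <= receiverCid && receiverCid <= toCid;

void main() {
  final cid = <String, int>{};
  final maxCid = <String, int>{};
  assignCids('Object', cid, maxCid);
  print(cid); // {Object: 0, C: 1, D0: 2, D1: 3, E: 4}

  // Range covering C and all of its subclasses.
  final from = cid['C']!, to = maxCid['C']!;
  print(inCidRange(from, to, cid['D1']!)); // true
  print(inCidRange(from, to, cid['E']!)); // false
}
```

Because a subtree always occupies a contiguous block of ids, one pair of comparisons replaces a separate check per subclass.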
If the single target case is not applicable, the call site is switched to use a linear search inline cache, which coincidentally is also the initial state for call sites in JIT mode (see ICCallThroughCode stub, UntaggedICData and DRT_SwitchableCallMiss).
                     ICData
          cache    ┌────────────────────────────────────┐
        ┌─────────▶│{cid0: target0, cid1: target1, ... }│
        │          └────────────────────────────────────┘
        ┴
object.method()      ICCallThroughCodeStub
        ┬          ┌────────────────────────────────────┐
        └─────────▶│ for (i = 0; i < cache.length; i++) │
          target   │   if (cache[i] == this.cid)        │
                   │     return cache[i + 1](...);      │
                   │ // Not found                       │
                   │ return SwitchableCallMissStub(...);│
                   └────────────────────────────────────┘
Finally, if the number of checks in the linear array grows past a threshold, the call site is switched to use a dictionary-like structure (see MegamorphicCallStub, UntaggedMegamorphicCache and DRT_SwitchableCallMiss).
                     MegamorphicCache
          cache    ┌────────────────────────────────────┐
        ┌─────────▶│{cid0: target0, cid1: target1, ... }│
        │          └────────────────────────────────────┘
        ┴
object.method()      MegamorphicCallStub
        ┬          ┌────────────────────────────────────┐
        └─────────▶│ var target = cache[this.cid];      │
          target   │ if (target != null)                │
                   │   return target(...);              │
                   │ // Not found                       │
                   │ return SwitchableCallMissStub(...);│
                   └────────────────────────────────────┘
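The whole progression of states can be summarised with a toy simulation of a single call site. This is a sketch under loud assumptions: class ids are ints, targets are strings, resolve stands in for the runtime's method lookup, and the value of maxLinearChecks is made up rather than taken from the VM. The real transitions are performed by DRT_SwitchableCallMiss and are more involved.

```dart
enum CallSiteState { unlinked, monomorphic, singleTarget, linearIC, megamorphic }

class SwitchableCallSite {
  SwitchableCallSite(this.resolve);

  /// Stand-in for the runtime's method lookup: class id -> target name.
  final String Function(int cid) resolve;

  /// Made-up limit for the linear cache; not the VM's actual threshold.
  static const maxLinearChecks = 3;

  CallSiteState state = CallSiteState.unlinked;
  final Map<int, String> cache = {}; // Observed cid -> target.
  int fromCid = 0, toCid = -1; // Range used in the single target state.

  String invoke(int cid) {
    final inRange = fromCid <= cid && cid <= toCid;
    final hit = state == CallSiteState.singleTarget
        ? (inRange ? cache[fromCid] : null)
        : cache[cid];
    return hit ?? _miss(cid);
  }

  /// Models the miss handler: resolves the target and updates the state.
  String _miss(int cid) {
    final target = resolve(cid);
    switch (state) {
      case CallSiteState.unlinked:
        state = CallSiteState.monomorphic;
      case CallSiteState.monomorphic:
        final cached = cache.keys.single;
        final lo = cid < cached ? cid : cached;
        final hi = cid < cached ? cached : cid;
        // A range is usable only if every cid in it hits the same target.
        final sameTarget = [for (var c = lo; c <= hi; c++) resolve(c)]
            .every((t) => t == target);
        if (sameTarget) {
          state = CallSiteState.singleTarget;
          fromCid = lo;
          toCid = hi;
          cache[lo] = target;
        } else {
          state = CallSiteState.linearIC;
        }
      case CallSiteState.singleTarget:
        state = CallSiteState.linearIC;
      case CallSiteState.linearIC:
        if (cache.length >= maxLinearChecks) state = CallSiteState.megamorphic;
      case CallSiteState.megamorphic:
        break;
    }
    cache[cid] = target;
    return target;
  }
}

void main() {
  // Hypothetical hierarchy: cids 1..3 are C, D0, D1 and share C.method.
  final site = SwitchableCallSite(
      (cid) => (1 <= cid && cid <= 3) ? 'C.method' : 'X$cid.method');
  for (final cid in [1, 3, 2, 5, 6]) {
    final target = site.invoke(cid);
    print('cid $cid -> $target, state: ${site.state.name}');
  }
}
```

With this input the site moves through monomorphic, singleTarget (the call with cid 2 is a hit and leaves the state unchanged), linearIC and finally megamorphic.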
try/catch in IL
See Exceptions Implementation.
async, async* and sync* methods
as checks
See Type Testing Stubs.
Miscellaneous
Pragmas
See VM-Specific Pragma Annotations.
DWARF (non-symbolic) stack traces
Glossary
See Glossary