Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design and implement the shared linear IRs (HIR, MIR, LIR) #60

Open
Tracked by #58 ...
alexrp opened this issue Mar 26, 2023 · 0 comments
Open
Tracked by #58 ...

Design and implement the shared linear IRs (HIR, MIR, LIR) #60

alexrp opened this issue Mar 26, 2023 · 0 comments
Assignees
Labels
area: runtime Issues related to the runtime system. state: approved Enhancements and tasks that have been approved.
Milestone

Comments

@alexrp
Copy link
Member

alexrp commented Mar 26, 2023

There will be 3 IRs in the runtime core. They will all be shared between the interpreter and the JIT compiler. Initially, we will implement HIR and MIR only, with the interpreter prototype (#58) consuming MIR. Later, as we reduce dependence on .NET types, we will implement and consume LIR in the interpreter. Finally, the JIT compiler will be implemented, which will transform LIR to AIR (#81), and then compile AIR to machine code.

The runtime will be based on lazy basic block versioning:

This is important to know to understand the IR design and behaviors described below.

HIR

High-Level IR (HIR) is the first intermediate representation. It mainly focuses on linearizing the code, turning it into SSA form, and desugaring some high-level language concepts. HIR is constructed from the semantic tree upfront when a module is loaded, and never changes after that.

HIR features basic blocks (with parameters), upvalues, constants, and operations as building blocks. Operation value operands can be upvalues, basic block parameters, constants, and (non-void) operations; there are no explicit variables or temporaries. Code is in SSA form, with basic block parameters serving as Φ nodes. There is no propagation of explicit or inferred type information at this stage, but all type tests are made explicit.

Lowering to HIR gets rid of some high-level language concepts like pattern matching, agent send/receive syntax, defer statements, for expressions, try expressions, etc.

The HIR data structures will be very minimalistic and will not be amenable to analysis. For example, there will be no use/definition chains. HIR is only really meant to be walked during lowering to MIR. In other words, HIR serves as a template for specialization in MIR.

MIR

Mid-Level IR (MIR) is where type specialization and most optimizations happen. It is similar to HIR in the building blocks it has, but unlike HIR, everything now carries type information. Types are gathered from the running program through basic block versioning, entry point versioning, value shapes, etc.

Lowering from HIR to MIR happens on demand as the program executes code, and is done with basic block granularity. Due to type specialization, there can be many different versions of MIR code for a given HIR (extended) basic block. Lowering proceeds until a type test is encountered that cannot be resolved with the available type information, or until the end of the function is encountered.

MIR will maintain use/definition chains and various other data structures that simplify transformation of the code. This will facilitate a classic set of optimizations (#61) that can be performed now that type information is available.

LIR

Low-Level IR (LIR) decomposes managed values into their constituent raw value and shape words. LIR mostly has the same building blocks as HIR and MIR, but at this stage, the only types that exist are 64-bit integers, 64-bit floats, and untyped pointers. All high-level operations will have been decomposed to primitive CPU-like operations. LIR is essentially a simple register transfer language.

LIR allows certain optimizations that would be harder to express at the MIR level. For example, in a series of small integer operations, it's obvious that copying the shape word for every intermediate operation is unnecessary. Yet, because MIR only operates on managed values, this notion cannot be expressed. At the LIR level, it is trivial to detect and remove such copies.

Note that, while LIR is very close to the machine, it is not architecture-specific.

@alexrp alexrp added state: approved Enhancements and tasks that have been approved. type: feature area: runtime Issues related to the runtime system. labels Mar 26, 2023
@alexrp alexrp added this to the v1.0 milestone Mar 26, 2023
@alexrp alexrp self-assigned this Mar 26, 2023
@alexrp alexrp changed the title Design and implement the linear IR (MIR) Design and implement the linear IRs (HIR, MIR, LIR) May 5, 2023
@alexrp alexrp changed the title Design and implement the linear IRs (HIR, MIR, LIR) Design and implement the shared linear IRs (HIR, MIR, LIR) May 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: runtime Issues related to the runtime system. state: approved Enhancements and tasks that have been approved.
Development

No branches or pull requests

1 participant