Code generation
The codegen module translates the high-level expression grammar into MiniRust basic blocks. Its entry point is codegen_function, which takes a function body (a Block) and produces a lang::Function containing a map of basic blocks, local variables, and an entry point.
CodeBlock: the control-flow builder
The core abstraction is CodeBlock — a single-entry, multi-exit control-flow code block that is incrementally constructed. It is an enum with three variants representing the builder’s state:
/// A single-entry, multi-exit control flow code.
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
pub enum CodeBlock {
/// Anonymous open block being constructed. Has no name yet — a name is
/// allocated lazily when the block is terminated or needs to be referenced.
///
/// Suppose you have `let x = 22;` this would create a set of statements like
///
/// ```text
/// stmts: [
/// let tmp1;
/// tmp1 = 22;
/// let x;
/// x = tmp;
/// ]
/// ```
///
/// and then you have `let y = 33;`
///
/// ```text
/// stmts: [
/// let tmp2;
/// tmp2 = 33;
/// let y;
/// y = tmp2;
/// ]
/// ```
///
/// and then you append those two so you get
///
/// ```text
/// stmts = [... /* as above, concatenated */]
/// ```
Block { stmts: Vec<lang::Statement> },
/// Finalized blocks plus one open fallthrough block.
///
/// For something like this `if foo { 22 } else { 33 }; ...``
///
/// ```text
/// blocks:
/// a {
/// if foo goto b else c
/// }
/// b {
/// tmp = 22
/// goto d
/// }
/// c {
/// tmp = 33
/// goto d
/// }
/// entry: a
/// fallthrough: d
/// fallthrough_stmts: [...] // from the `...` above
/// }}
/// ```
Open {
/// Blocks that have a terminator and thus are "done".
/// They may branch to each other or to the (incomplete) fallthrough block, as shown above.
blocks: Map<lang::BbName, lang::BasicBlock>,
/// The entry block in the list above or the fallthrough.
entry: lang::BbName,
/// The name of the not-yet-compelted fallthrough block, which may be referenced above.
fallthrough: lang::BbName,
/// The statements in the fallthrough block so far.
fallthrough_stmts: Vec<lang::Statement>,
},
/// Finalized blocks (i.e., blocks with terminators) only. No successors.
Closed {
/// Blocks that have a terminator and thus are "done".
blocks: Map<lang::BbName, lang::BasicBlock>,
/// The entry block in the list above or the fallthrough.
entry: lang::BbName,
},
}
Lifecycle
A code block starts as an anonymous Block (just accumulating statements) and transitions to named states (Open / Closed) once it needs basic-block structure:
flowchart LR
start(( )) -->|"new()"| Block
start -->|"named(bb)"| Open
Block -->|"terminate(_, None)"| Closed
Block -->|"terminate(_, Some(bb))"| Open
Closed -->|"terminate(_, Some(bb))"| Open
Open -->|"terminate(_, None)"| Closed
Open -->|"terminate(_, Some(bb))"| Open
Block["
Block
Anonymous — just a list of statements with no name yet. A name is allocated lazily when the block is terminated.
"]
Closed["
Closed
All blocks are finalized. No fallthrough.
"]
Open["
Open
Finalized blocks plus one open fallthrough block (named, accepting statements).
"]
Block is the starting state for simple expressions — statements accumulate without allocating any basic-block names. Most expressions stay in Block and get inlined into their parent via append. Once a terminator is needed (a call, branch, or explicit control flow), terminate names the block and transitions to Closed or Open.
Open is the state for regions that already have named blocks. Statements accumulate into the named fallthrough block. terminate finalizes the fallthrough and either closes the code block or opens a new fallthrough.
Once a code block is Closed, mutations (with_stmt, terminate) are no-ops. This naturally handles dead code after return or break. However, terminate with Some(next_bb) on a Closed code block will re-open it — this is used for blocks reachable via break that aren’t continuations of the previous terminator.
Core operations
| Operation | Effect |
|---|---|
with_stmt(s) | Appends a statement to the current fallthrough (Block or Open) |
terminate(t, None) | Finalizes the fallthrough block → Closed |
terminate(t, Some(bb)) | Finalizes the fallthrough block, opens bb as the new fallthrough → Open |
append(other) | Sequential composition (see below) |
into_blocks() | Extracts the final Map<BbName, BasicBlock> |
The append operation
append is how sequential composition works. Its behavior depends on whether other is anonymous or named:
| self | other | Behavior |
|---|---|---|
| Block | Block | Concatenate statements (both anonymous — no names needed) |
| Block | Open/Closed | Name self, emit Goto to other’s entry, merge other’s blocks |
| Open | Block | Extend fallthrough with other’s statements |
| Open | Open/Closed | Emit Goto from fallthrough to other’s entry, merge other’s blocks |
The key invariant: once a block is named, it is never destroyed. When appending a named code block (Open or Closed), its entry block is preserved as a jump target — append emits a Goto to it rather than inlining it. This ensures that blocks referenced by jumps (loop headers, break targets) always survive in the final map.
Compound helpers
These methods on CodegenFn combine terminate with block allocation:
| Helper | What it does |
|---|---|
call(code block, fn, args, ret) | Allocates next_bb, terminates with Call { next_block: next_bb } and opens next_bb |
branch_on_bool_from(code block, cond, then, else) | Allocates a join_bb, terminates with Switch, absorbs both branches, opens join_bb |
build_loop(loop_start, exit, body) | Creates a named code block at loop_start, appends body, back-edge to loop_start, opens exit |
Judgments
codegen_block
The top-level block codegen creates a fresh code block, then appends each statement’s code block in sequence:
codegen_block(global: CodegenGlobal, cfn: CodegenFn, scope: CodegenScope, block: Block,) => (CodeBlock, CodegenGlobal, CodegenFn)
codegen_stmt
Each statement produces a CodeBlock that gets appended to the enclosing block’s code block:
codegen_stmt(global: CodegenGlobal, cfn: CodegenFn, scope: CodegenScope, stmt: Stmt,) => (CodeBlock, CodegenScope, CodegenGlobal, CodegenFn)
codegen_expr_into
Expressions are compiled into a target local. Each rule returns a CodeBlock:
codegen_expr_into(global: CodegenGlobal, cfn: CodegenFn, scope: CodegenScope, target: MiniRustLocal, expr: Expr,) => (CodeBlock, CodegenGlobal, CodegenFn)
Walkthrough: literal expression
The simplest case — a literal like 42_u32:
(let code = cfn.fresh_code_block())
---- ("literal")
(codegen_expr_into(global, cfn, scope, target, Expr::Literal { value, ty }) => (
code.assign(target, constant(value, ty)),
global,
cfn,
))
fresh_code_block()creates aCodeBlock::Block { stmts: [] }(anonymous).assign(target, constant(42, u32))pushes oneAssignstatement → stillBlock { stmts: [assign] }
The caller appends this single-block code block into its own code block. Since it’s just a Block, append concatenates the statements directly — no basic-block names are allocated.
graph LR
subgraph "returned CodeBlock (Block)"
anon["(anonymous): target = const 42_u32"]
end
Walkthrough: function call
A function call like foo(a, b) is more interesting:
(type_expr(cfn, scope, callee) => callee_ty)
(resolve_rigid(cfn, scope, callee_ty) => RigidTy { name: RigidName::FnDef(fn_id), parameters })
(let (fn_name, global) = global.ensure_fn(MonoKey::new(fn_id, parameters)))
(let (callee_temp, cfn) = cfn.alloc_temp(&callee_ty)?)
(codegen_expr_into(global, cfn, scope, callee_temp, callee) => (code, global, cfn))
(let (temps, cfn) = alloc_temps_for_args(cfn, scope, args)?)
(for_all(i in 0..args.len()) with(code, global, cfn)
(codegen_expr_into(global, cfn, scope, &temps[i], &args[i]) => (arg_code, global, cfn))
(let (code, cfn) = cfn.append_from(code, arg_code)))
(let (code, cfn) = cfn.call(code, fn_name, temps, target)?)
---- ("call")
(codegen_expr_into(global, cfn, scope, target, Expr::Call { callee, args }) => (code, global, cfn))
- Fresh code block starts as
Block { stmts: [] } - For each argument,
codegen_expr_intoreturns a Block, andappendconcatenates its statements .call(...)allocatesnext_bb, terminates withCall { next_block: next_bb }→ Open withnext_bbas fallthrough
graph LR
subgraph "returned CodeBlock (Open)"
bb0["bb0: arg stmts... → Call foo"]
bb1["bb1: (open, empty)"]
bb0 --> bb1
end
The caller can now append more work into bb1.
Walkthrough: if/else
An if b { ... } else { ... } statement:
(let (ct, cfn) = cfn.alloc_local(bool_ty()))
(codegen_expr_into(global, cfn, scope, ct, condition) => (cond_code, global, cfn))
(codegen_block(global, cfn, scope, then_block) => (then_code, global, cfn))
(codegen_block(global, cfn, scope, else_block) => (else_code, global, cfn))
(let (code, cfn) = cfn.branch_on_bool_from(cond_code, ct, then_code, else_code))
---- ("if")
(codegen_stmt(global, cfn, scope, Stmt::If { condition, then_block, else_block }) => (code, scope, global, cfn))
- Codegen the condition into a temp (
cond_region— typically a Block) - Codegen both branches independently (each returns its own CodeBlock)
branch_on_bool_fromterminates cond_region with a Switch, absorbs both branches, and opens a join block
graph TD
subgraph "returned CodeBlock"
cond["bb0: eval condition → Switch"]
then_bb["bb1: then branch"]
else_bb["bb2: else branch"]
join["bb3: (open, join point)"]
cond -->|"true"| then_bb
cond -->|"false"| else_bb
then_bb -->|Goto| join
else_bb -->|Goto| join
end
If both branches diverge (e.g., both return), there is no join block — the result is Closed.
Walkthrough: loop
A 'label: loop { body }:
(let label = require_label(label)?)
(let (loop_start, cfn) = cfn.fresh_bb())
(let (exit_block, cfn) = cfn.fresh_bb())
(let scope = scope.with_label(&label.id, Some(loop_start), exit_block))
(codegen_block(global, cfn, scope, body) => (body_code, global, cfn))
(let (code, cfn) = build_loop(cfn, loop_start, exit_block, body_code)?)
---- ("loop")
(codegen_stmt(global, cfn, scope, Stmt::Loop { label, body }) => (code, scope, global, cfn))
The build_loop helper constructs:
CodeBlock::named(loop_start)— starts in Open state at loop_start- Append the body code block (emits Goto from parent fallthrough into loop_start)
- If body has fallthrough, terminate with
Goto(loop_start)back-edge and openexit - If body diverges, open
exitanyway (reachable viabreak)
graph TD
subgraph "returned CodeBlock (Open)"
loop_start["loop_start: body stmts... → Goto loop_start"]
exit["exit: (open, fallthrough)"]
loop_start -->|"back-edge"| loop_start
loop_start -.->|"break"| exit
end
A break 'label compiles to terminate(Goto(exit_block)), directing control to the exit.
When this loop code block is appended into the parent, the parent’s fallthrough block gets a Goto to loop_start — since loop_start is a named block, append preserves it as a jump target rather than inlining it.
Putting it all together
The final step is build_function, which:
- If the body code block still has fallthrough, appends a unit assignment + Return terminator
- Extracts the entry block name
- Calls
into_blocks()to get the finalMap<BbName, BasicBlock> - Prepends
StorageLivestatements to the entry block for all non-argument locals - Returns a
lang::Functionready for MiniRust interpretation