All 12 sections are now fully defined. Removed incremental-progress markers (Status: defined), updated intro and outline preamble, and fixed two missing cases in the assigns() function (struct literals and parenthesised expressions). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
63 KiB
Flux Language Semantics
This document formalises the static semantics — primarily the type system — of Flux.
Notation used throughout:
| Symbol | Meaning |
|---|---|
Γ |
Type environment (maps names to their type + mutability) |
Σ |
Struct table (maps struct names to their field lists) |
Φ |
Function table (maps function names to their signatures) |
Γ ⊢ e : T |
Under environment Γ, expression e has type T |
Γ ⊢ s ✓ |
Under environment Γ, statement s is well-typed |
⊢ T wf |
Type T is well-formed |
T₁ ~ T₂ |
Types T₁ and T₂ are compatible (assignable) |
T ⊑ U |
T can be implicitly promoted to U |
common(T, U) |
Least upper bound of T and U under ⊑; error if none exists |
rank(T) |
Numeric rank of type T (width class; defined in §5) |
category(T) |
Numeric category of T: Unsigned, Signed, Float, or Char |
Outline
The following sections form the complete type system, in dependency order — each section relies only on those above it.
- Types ✓
- Well-Formedness of Types ✓
- Type Environments ✓
- Mutability ✓
- Numeric Type Hierarchy and Coercions ✓
- Operator Typing ✓
- Expression Typing ✓
- Statement Typing ✓
- Function Definitions ✓
- Struct Definitions ✓
- Program Typing ✓
- Error Catalogue ✓
1. Types
The type language — the set of types that Flux values can have.
1.1 Primitive types
T ::= u8 | u16 | u32 | u64 -- unsigned integers
| i8 | i16 | i32 | i64 -- signed integers
| f32 | f64 -- floating-point
| bool -- boolean
| char -- Unicode scalar value; representation-compatible with u32
1.2 Numeric type categories
Each numeric primitive belongs to exactly one category. Categories partition the numeric types; implicit promotion never crosses a category boundary (see §5).
| Category | Members | Notes |
|---|---|---|
Unsigned |
u8, u16, u32, u64 |
|
Signed |
i8, i16, i32, i64 |
|
Float |
f32, f64 |
|
Char |
char |
Promotes into Unsigned only (see §5.3) |
For the purposes of operator rules, the union categories are:
- Integer =
Unsigned ∪ Signed - Numeric =
Unsigned ∪ Signed ∪ Float
bool and char are not members of Numeric; they are handled separately.
1.3 Bit widths
| Type | Width | Type | Width |
|---|---|---|---|
u8 |
8 | i8 |
8 |
u16 |
16 | i16 |
16 |
u32 |
32 | i32 |
32 |
u64 |
64 | i64 |
64 |
f32 |
32 | f64 |
64 |
bool |
8 | char |
32 |
1.4 Compound types
T ::= ...
| *T -- immutable pointer to T (read-only; C's const T*)
| *mut T -- mutable pointer to T (read-write; C's T*)
| *opaque -- immutable untyped pointer (C's const void*)
| *mut opaque -- mutable untyped pointer (C's void*)
| [T; N] -- array of N elements of type T (N: usize, N ≥ 0)
| struct S -- named struct type (resolved through Σ)
| () -- unit type (no-value / void)
The mut qualifier on a pointer type describes the pointer's access permission
to the pointed-to data. It is orthogonal to binding mutability (§4).
A shorthand used throughout:
- typed pointer:
*Tor*mut T(pointee type is known) - opaque pointer:
*opaqueor*mut opaque(pointee type is erased) - mutable pointer:
*mut Tor*mut opaque - immutable pointer:
*Tor*opaque
1.5 Function types (internal use)
Function types do not appear in user-facing type annotations but are used internally by the type checker when resolving calls.
FnSig ::= (T₁, …, Tₙ) -> T_ret
Open questions — resolved
()as first-class type:()may appear explicitly in-> ()return-type annotations and also arises implicitly when a function omits its return type. It is a valid type in⊢ T wf(Wf-Unit, §2).- Pointer mutability: settled in §4. Pointer mutability (
*Tvs*mut T) is orthogonal to binding mutability and carried in the type itself.
2. Well-Formedness of Types
A type is well-formed (written ⊢ T wf) if every struct name it references is
declared and all array sizes are non-negative. Well-formedness is checked against the
struct table Σ; pointer and array types impose structural constraints. Recursive struct
cycles through direct fields (without pointer indirection) are rejected here — the
rule is deferred to §10 where the full Σ is available.
2.1 Inference rules
(Wf-Prim) ⊢ T wf for any T ∈ { u8, u16, u32, u64,
i8, i16, i32, i64,
f32, f64, bool, char }
(Wf-Unit) ⊢ () wf
(Wf-PtrImm) ⊢ T wf
───────────────
⊢ *T wf
(Wf-PtrMut) ⊢ T wf
───────────────
⊢ *mut T wf
(Wf-OpaqueImm) ⊢ *opaque wf
(Wf-OpaqueMut) ⊢ *mut opaque wf
(Wf-Array) ⊢ T wf N ≥ 0
──────────────────────
⊢ [T; N] wf
(Wf-Struct) S ∈ dom(Σ)
──────────────────────
⊢ struct S wf
Note on Wf-Struct: this rule only checks that the struct name is declared. The
requirement that all field types are themselves well-formed, and that the struct has no
infinite-size cycles, is established when the struct definition is checked (§10).
2.2 Size finiteness (informal)
A type is sized iff its in-memory representation has a fixed, finite byte count
determinable at compile time. All types in §2.1 are sized, except a struct S whose
definition contains a direct (non-pointer) field of a type that transitively includes S
again. Such cycles are detected in §10 and reported as E0900.
Pointer types (*T, *mut T, *opaque, *mut opaque) always have a fixed pointer-width
size regardless of the pointee. Wrapping a recursive struct in a pointer breaks the cycle:
struct Node { next: *Node } -- OK: *Node has finite size
struct Bad { inner: Bad } -- error E0900: infinite size
3. Type Environments
The type environment Γ tracks every variable binding that is in scope at a given
program point. It is an ordered association list — the most recently introduced
binding for a given name takes precedence.
3.1 Grammar
m ::= imm | mut -- binding mutability
Γ ::= ε -- empty environment
| Γ, x : (T, m) -- extend with binding x of type T and mutability m
Bindings are appended on the right. Lookup always returns the rightmost (most recent) entry for a given name, implementing latest-binding-wins semantics.
3.2 Lookup
(Lookup-Hit) ────────────────────────
(Γ, x:(T,m))(x) = (T, m)
(Lookup-Skip) x ≠ y Γ(x) = (T, m)
─────────────────────────────────────
(Γ, y:(T',m'))(x) = (T, m)
Attempting to look up a name not present in Γ is undefined; the type checker must
report error E0100.
3.3 Scope and block exit
Scopes are not represented as a separate data structure — they arise naturally from how
the statement rules in §8 thread Γ through blocks.
A block { s₁; s₂; …; sₙ } is checked by extending Γ sequentially with each
statement's bindings, yielding a final Γ_out. On block exit, Γ_out is
discarded and the environment reverts to Γ (the environment at block entry).
This is written:
Γ ⊢ block ✓ ⇒ Γ_out but the caller uses only Γ, not Γ_out
The formal rule (T-Block) is given in §8.
3.4 Shadowing and overriding
Flux adopts Rust-style re-binding semantics: a let declaration always creates a
new binding regardless of whether the name already exists. The behaviour depends on
where the prior binding lives:
Case A — override (same scope)
Two let x declarations within the same block. Both bindings exist in the environment;
lookup returns the second (newer) one. The first is permanently inaccessible for the
remainder of that scope and any nested scopes.
// scope S
let x: i32 = 1; // Γ₁ = Γ, x:(i32, imm)
let x: f64 = 2.0; // Γ₂ = Γ₁, x:(f64, imm)
// = Γ, x:(i32,imm), x:(f64,imm)
// Γ₂(x) = (f64, imm) ← i32 binding is gone
On exit from scope S, both entries are discarded.
The type of a re-bound name is allowed to change (e.g. i32 → f64 above). This is
intentional: re-binding is a distinct operation from mutation.
Case B — shadow (nested scope)
An inner let x in a nested block, where x is already bound in an enclosing scope.
The inner binding shadows the outer one inside the nested block. When the nested block
exits, the outer binding is restored.
let x: i32 = 1; // outer: Γ₁ = Γ, x:(i32, imm)
{
let x: f64 = 2.0; // inner: Γ₂ = Γ₁, x:(f64, imm)
// Γ₂(x) = (f64, imm) ← outer i32 is shadowed
}
// back to Γ₁: Γ₁(x) = (i32, imm) ← restored
Summary
| Situation | Old binding accessible? | After scope exit |
|---|---|---|
| Override (same scope) | No — immediately hidden | Both removed |
| Shadow (inner scope) | No — hidden in inner | Outer restored |
Both cases are modelled identically in the formal system — the environment is extended with a new entry and lookup finds the newest. The difference is entirely about which entries are removed when a scope exits:
- Block scope: on exit, all bindings introduced inside that block are removed.
- Because scope exit is wholesale, an overriding binding in scope S is removed at the same time as the overridden one — neither survives past S.
3.5 Interaction with Σ and Φ
Struct names (Σ) and function names (Φ) occupy separate namespaces from variable
bindings (Γ). A variable foo and a function foo may coexist; the expression typing
rules in §7 use Γ to resolve identifiers used as values and Φ to resolve calls. A
name used as a call target (f(…)) is looked up in Φ; a name used as a plain value is
looked up in Γ. There is no shadowing between these namespaces.
4. Mutability
Flux has two orthogonal axes of mutability:
- Binding mutability — whether a name can be rebound (assigned a new value).
Declared with
letvs.let mut. - Pointer mutability — whether the data pointed to can be written through a pointer.
Declared as
*T(immutable) vs.*mut T(mutable).
These axes are fully independent; any combination is valid (see §1.4 table).
4.1 Binding mutability
A binding is introduced by let and associates a name with a storage location.
Γ, x : (T, imm) introduced by let x [: T] [= e] ;
Γ, x : (T, mut) introduced by let mut x [: T] [= e] ;
Function parameters follow the same rule: [mut] name : T in a parameter list.
Mutability on a parameter is local — the callee may freely rebind it; the caller
never observes the difference.
4.2 Place expressions
A place (lvalue) is an expression that denotes a memory location that can be read from or written to. Not every expression is a place.
place ::= x -- variable
| * e -- pointer dereference (e must have a pointer type)
| e . f -- struct field access (e must be a place or struct value)
| e [ i ] -- array subscript (e must be a place or array value)
All other expression forms (literals, arithmetic, calls, etc.) are not places.
4.3 Mutable places
A place is mutable if it can be the target of an assignment. The predicate
Γ ⊢ e mutable-place is defined by the following rules:
(MP-Var) Γ(x) = (T, mut)
─────────────────────────────
Γ ⊢ x mutable-place
(MP-Deref) Γ ⊢ e : *mut T
─────────────────────────────
Γ ⊢ *e mutable-place
(MP-Field) Γ ⊢ e mutable-place Σ(S).f = T
─────────────────────────────────────
Γ ⊢ e.f mutable-place
(MP-Index) Γ ⊢ e mutable-place
─────────────────────────────
Γ ⊢ e[i] mutable-place
Note the asymmetry in MP-Deref: dereferencing an immutable pointer *T yields a
place (it can be read) but not a mutable place (it cannot be written). There is no
MP-Deref rule for *T.
4.4 Readable places
Dereferencing any pointer (mutable or not) produces a place that can be read.
The predicate Γ ⊢ e place is a relaxed version of mutable-place used in expression
typing (§7) for the * unary operator:
(P-Var) x ∈ dom(Γ)
─────────────
Γ ⊢ x place
(P-Deref-Imm) Γ ⊢ e : *T
─────────────────
Γ ⊢ *e place (readable, but not mutable-place)
(P-Deref-Mut) Γ ⊢ e : *mut T
─────────────────
Γ ⊢ *e place (readable AND mutable-place via MP-Deref)
(P-Field) Γ ⊢ e place Σ(S).f = T
────────────────────────────
Γ ⊢ e.f place
(P-Index) Γ ⊢ e place
─────────────────
Γ ⊢ e[i] place
4.5 Address-of and pointer mutability
The unary & operator produces a pointer. The mutability of the resulting pointer
is determined by the mutability of the place it is applied to:
(Addrof-Imm) Γ ⊢ e place ¬(Γ ⊢ e mutable-place)
────────────────────────────────────────────
Γ ⊢ &e : *T
(Addrof-Mut) Γ ⊢ e mutable-place
────────────────────────────────────────────
Γ ⊢ &e : *mut T
Taking the address of a mutable place yields a *mut T; taking the address of a
read-only place yields a *T. This ensures that a *mut T cannot be obtained from a
non-mutable binding via &.
4.6 Interaction summary
| Source form | Binding mut? | Ptr mut? | Can rebind p? |
Can write *p? |
|---|---|---|---|---|
let p: *T = &x |
no | no | no | no |
let mut p: *T = &x |
yes | no | yes | no |
let p: *mut T = &mut x |
no | yes | no | yes |
let mut p: *mut T = &mut x |
yes | yes | yes | yes |
4.7 Pointer subtyping
A mutable pointer may be used where an immutable pointer is expected (covariant in the immutable direction). The reverse is not permitted.
(Ptr-Coerce) Γ ⊢ e : *mut T
──────────────────
Γ ⊢ e : *T -- *mut T is a subtype of *T
This is consistent with the promotion relation (§5): *mut T ⊑ *T. There is no
implicit coercion in the other direction; an explicit cast is required to obtain
*mut T from *T.
5. Numeric Type Hierarchy and Coercions
Flux supports implicit widening promotion for numeric types. Promotions are one-directional (widening only), never cross a category boundary, and never lose information.
5.1 Rank
Each numeric type has an integer rank that reflects its bit width within its category.
| Type | rank |
Type | rank |
|---|---|---|---|
u8 |
1 | i8 |
1 |
u16 |
2 | i16 |
2 |
u32 |
3 | i32 |
3 |
u64 |
4 | i64 |
4 |
f32 |
1 | f64 |
2 |
bool and char have no rank; they are not members of Numeric.
5.2 Promotion predicate T ⊑ U
T ⊑ U holds — meaning a value of type T may be implicitly coerced to type U —
exactly when one of the following rules applies:
(Refl) T ⊑ T for any T
(Unsigned) T ⊑ U if category(T) = category(U) = Unsigned
and rank(T) < rank(U)
(Signed) T ⊑ U if category(T) = category(U) = Signed
and rank(T) < rank(U)
(Float) f32 ⊑ f64
(Char) char ⊑ U if category(U) = Unsigned
and rank(U) ≥ 3 -- u32 or u64
All other combinations are forbidden. In particular:
- Unsigned → Signed is forbidden (even if the target is wider).
- Signed → Unsigned is forbidden.
- Any integer → Float is forbidden.
- Float → integer is forbidden.
- Narrowing (rank decreases) is forbidden in both directions.
char→ Signed is forbidden.charmay only promote tou32oru64(rank ≥ 3 unsigned).
5.3 char and u32
char is represented as a 32-bit unsigned scalar. It is not a synonym for u32 —
they are distinct types — but a char may be implicitly promoted to u32 or u64
when the context requires it (e.g. as an operand to integer arithmetic).
The reverse (u32 → char) is never implicit; it requires an explicit cast, since not
every u32 value is a valid Unicode scalar.
5.4 Common type common(T, U)
For binary operators that require matching operand types, the common type of T and
U is defined as their least upper bound under ⊑:
common(T, U) = T if U ⊑ T
common(T, U) = U if T ⊑ U
common(T, U) = ⊥ otherwise -- type error: no common type
When T ⊑ U and U ⊑ T simultaneously (which can only occur when T = U by Refl),
common(T, U) = T = U.
Examples:
T |
U |
common(T, U) |
Notes |
|---|---|---|---|
u8 |
u32 |
u32 |
u8 promoted to u32 |
u32 |
u8 |
u32 |
symmetric |
i16 |
i64 |
i64 |
i16 promoted to i64 |
f32 |
f64 |
f64 |
f32 promoted to f64 |
char |
u32 |
u32 |
char promoted as u32 equivalent |
char |
u64 |
u64 |
char promoted to u64 |
u8 |
i32 |
⊥ |
error: cross-signedness promotion |
i32 |
u64 |
⊥ |
error: signed → unsigned forbidden |
u32 |
f32 |
⊥ |
error: integer → float forbidden |
char |
u16 |
⊥ |
error: rank(u16)=2 < 3, below u32 |
char |
i32 |
⊥ |
error: char → signed forbidden |
5.5 Literal type inference
Integer and float literals have no intrinsic type. Their type is determined by:
-
Contextual typing (checked mode): if the surrounding context expects a specific numeric type
T, the literal is given typeTdirectly, without promotion. The literal value must be representable inT(range check). -
Default type (synthesised mode): when no context is available, literals are given a default type:
INT_LIT→i32FLOAT_LIT→f64
These defaults apply only when type inference cannot determine a more specific type.
5.6 Explicit casts
Conversions not covered by ⊑ require an explicit cast (narrowing, cross-signedness,
integer↔float, u32→char, etc.). The cast syntax and its rules are not yet in the grammar
and will be specified separately.
6. Operator Typing
Each operator has one or more typing schemes describing the types it accepts and the type it produces.
6.1 Binary operators
For operators that accept two numeric operands, the operands are first coerced to their
common(T₁, T₂) type (§5.4) before the operation is performed. If common is ⊥ the
expression is a type error.
| Operators | Required operand types | Result type | Notes |
|---|---|---|---|
+ - * / % |
T₁, T₂ where common(T₁, T₂) ∈ Numeric |
common(T₁, T₂) |
Both operands promoted to common type |
<< >> |
LHS T any integer; RHS U any unsigned integer |
T |
Only LHS type determines the result; no promotion of the LHS; RHS must be unsigned (see §6.1.1) |
& | ^ |
T₁, T₂ where common(T₁, T₂) ∈ Integer |
common(T₁, T₂) |
Both operands promoted to common integer type |
and or |
bool, bool |
bool |
Short-circuit; no promotion applies |
== != |
numeric/bool/char: common(T₁,T₂) ∈ Numeric ∪ {bool, char}; or pointer equality (see §6.1.2) |
bool |
Structs and arrays are not comparable |
< > <= >= |
T₁, T₂ where common(T₁, T₂) ∈ Numeric |
bool |
Both operands promoted to common type |
= |
LHS: mutable place of type T; RHS: T₂ where T₂ ⊑ T |
T |
Result is the assigned value; enables chaining (see §6.1.3) |
op= |
Same as = after desugaring lhs op= rhs → lhs = lhs op rhs |
T |
LHS must be a mutable place |
6.1.1 Shift operator detail
a << b and a >> b use asymmetric typing:
amay be any integer typeT;Tis also the result type.bmust be any unsigned integer type (u8,u16,u32,u64). Signed or other types are rejected (error E0401).- The two operands are never subject to
common— LHS and RHS types are checked independently. The LHS is never promoted to match the RHS.
(T-Shl / T-Shr) Γ ⊢ a : T T ∈ Integer
Γ ⊢ b : U U ∈ Unsigned
────────────────────────────────
Γ ⊢ a << b : T (resp. a >> b : T)
Rationale: shifting a u8 value by a u32 amount is idiomatic and well-defined.
Requiring both operands to be the same type would be unnecessarily restrictive. Requiring
the amount to be unsigned makes negative shift amounts a compile-time error rather than
undefined behaviour.
6.1.2 Pointer equality
Two pointer expressions may be compared with == and != to test whether they point to
the same address. Pointer comparison is type-safe: the two operands must have a common
pointer type after applying the Ptr-Coerce rule (§4.7).
(T-PtrEq) Γ ⊢ e₁ : P₁ Γ ⊢ e₂ : P₂ ptr-common(P₁, P₂) = P
───────────────────────────────────────────────────────────
Γ ⊢ e₁ == e₂ : bool (resp. e₁ != e₂ : bool)
Where ptr-common is defined as:
ptr-common(*mut T, *mut T) = *mut T
ptr-common(*mut T, *T) = *T -- Ptr-Coerce applied to LHS
ptr-common(*T, *mut T) = *T -- Ptr-Coerce applied to RHS
ptr-common(*T, *T) = *T
ptr-common(*mut opaque, *mut opaque) = *mut opaque
ptr-common(*mut opaque, *opaque) = *opaque
ptr-common(*opaque, *mut opaque) = *opaque
ptr-common(*opaque, *opaque) = *opaque
ptr-common(_, _) = ⊥ -- mixed typed/opaque or different pointees
Comparing a typed pointer *T with an opaque pointer *opaque is not permitted
implicitly; an explicit cast is required. Comparing *T with *U where T ≠ U is also
not permitted.
Ordering operators (<, >, <=, >=) are not defined for pointer types.
6.1.3 Assignment as an expression
The = operator returns the assigned value with the LHS type T. This makes
assignments first-class expressions and enables chained assignment:
a = b = 0 -- parses as a = (b = 0) (right-associative)
Evaluation proceeds right-to-left:
b = 0— assigns0tob(typeT_b), result is0 : T_b.a = (0 : T_b)— requiresT_b ⊑ T_a; assigns toa, result is the valueT_a.
The formal rule:
(T-Assign) Γ ⊢ e₁ mutable-place Γ ⊢ e₁ : T
Γ ⊢ e₂ : T₂ T₂ ⊑ T
─────────────────────────────────────
Γ ⊢ e₁ = e₂ : T
The compound assignment e₁ op= e₂ desugars before typing:
(T-CompoundAssign) e₁ op= e₂ ≡ e₁ = e₁ op e₂
After desugaring, T-Assign applies. The LHS expression e₁ is evaluated only once at
runtime (the desugaring is semantic, not textual); both uses of e₁ refer to the same
place.
6.2 Unary operators
| Operator | Accepted type | Result type | Notes |
|---|---|---|---|
- |
any numeric T |
T |
Arithmetic negation |
! |
bool |
bool |
Logical NOT |
~ |
any integer T |
T |
Bitwise complement |
* |
*T or *mut T |
T |
Dereference; yields a readable place (§4.4) |
& |
readable place of type T |
*T |
Address-of immutable place (Addrof-Imm, §4.5) |
& |
mutable place of type T |
*mut T |
Address-of mutable place (Addrof-Mut, §4.5) |
The two & rows are not overloading — the single & operator produces *mut T or *T
depending on whether its operand is a mutable place (§4.5). The unary * operator
applied to *T yields a readable-but-not-mutable place; applied to *mut T it yields a
readable-and-mutable place (rules P-Deref-Imm / P-Deref-Mut, §4.4).
7. Expression Typing
Judgement form: Γ, Σ, Φ ⊢ e : T (written Γ ⊢ e : T when the tables are clear from
context).
7.1 Typing strategy
Expression typing uses synthesis (bottom-up inference): the type of an expression is
produced from its sub-expressions; no expected type is pushed down from the enclosing
context. The single exception is at a let binding with an explicit type annotation,
where the annotation is used to give numeric literals a more specific type than their
default (§5.5). This is handled in T-Let (§8); the rules in this section always
synthesise.
7.2 Built-in: string_view
String literals produce a value of the built-in struct type string_view, which is
pre-loaded into Σ before any user code is checked:
string_view ::= struct { data: *char, size: u64 }
The name string_view is reserved — a user-defined struct with this name is a type
error. A STRING_LIT token is compiled to a string_view value whose data field
points to the UTF-8 bytes of the string and size holds the byte length.
Future: when Flux gains slice types,
string_viewwill be replaced or superseded by a first-class slice[char].
7.3 Inference rules
Literals and booleans
(T-IntLit) ─────────────────────────
Γ ⊢ INT_LIT : i32
-- default type; T-Let may supply a contextual integer type (§5.5, §8)
(T-FloatLit) ─────────────────────────
Γ ⊢ FLOAT_LIT : f64
-- default type; T-Let may supply a contextual float type (§5.5, §8)
(T-StrLit) ─────────────────────────────────────────
Γ ⊢ STRING_LIT : struct string_view
(T-CharLit) ─────────────────────────
Γ ⊢ CHAR_LIT : char
(T-True) ─────────────────────────
Γ ⊢ true : bool
(T-False) ─────────────────────────
Γ ⊢ false : bool
Variable
(T-Var) Γ(x) = (T, m)
─────────────────
Γ ⊢ x : T
x may be any binding mutability m; the type is T regardless. Mutability is only
consulted when x is the LHS of an assignment (MP-Var, §4.3).
Grouping
(T-Paren) Γ ⊢ e : T
─────────────
Γ ⊢ (e) : T
Unary operators
(T-Neg) Γ ⊢ e : T T ∈ Numeric
─────────────────────────────
Γ ⊢ -e : T
(T-Not) Γ ⊢ e : bool
──────────────
Γ ⊢ !e : bool
(T-BitNot) Γ ⊢ e : T T ∈ Integer
─────────────────────────────
Γ ⊢ ~e : T
(T-Deref) Γ ⊢ e : *T (also valid for *mut T, by Ptr-Coerce §4.7)
─────────────────────────────
Γ ⊢ *e : T
(T-AddrOf) -- see §4.5: yields *T or *mut T depending on place mutability
Γ ⊢ e place (readable place)
─────────────────────────────────────────
Γ ⊢ &e : *T if ¬(Γ ⊢ e mutable-place) [Addrof-Imm]
Γ ⊢ &e : *mut T if Γ ⊢ e mutable-place [Addrof-Mut]
Binary operators
All binary operator rules follow the typing schemes in §6.1. Concisely:
(T-Arith) Γ ⊢ e₁ : T₁ Γ ⊢ e₂ : T₂
C = common(T₁, T₂) C ∈ Numeric
────────────────────────────────────
Γ ⊢ e₁ {+,-,*,/,%} e₂ : C
(T-Shift) Γ ⊢ e₁ : T T ∈ Integer
Γ ⊢ e₂ : U U ∈ Unsigned
─────────────────────────────
Γ ⊢ e₁ {<<,>>} e₂ : T
(T-BitBin) Γ ⊢ e₁ : T₁ Γ ⊢ e₂ : T₂
C = common(T₁, T₂) C ∈ Integer
────────────────────────────────────
Γ ⊢ e₁ {&,|,^} e₂ : C
(T-Logic) Γ ⊢ e₁ : bool Γ ⊢ e₂ : bool
───────────────────────────────────
Γ ⊢ e₁ {and,or} e₂ : bool
(T-CmpNum) Γ ⊢ e₁ : T₁ Γ ⊢ e₂ : T₂
common(T₁, T₂) ∈ Numeric ∪ {bool, char}
───────────────────────────────────────────
Γ ⊢ e₁ {==,!=} e₂ : bool
(T-CmpPtr) Γ ⊢ e₁ : P₁ Γ ⊢ e₂ : P₂ ptr-common(P₁, P₂) ≠ ⊥ (§6.1.2)
───────────────────────────────────────────────────────────
Γ ⊢ e₁ {==,!=} e₂ : bool
(T-Ord) Γ ⊢ e₁ : T₁ Γ ⊢ e₂ : T₂
common(T₁, T₂) ∈ Numeric
──────────────────────────────
Γ ⊢ e₁ {<,>,<=,>=} e₂ : bool
(T-Assign) Γ ⊢ e₁ mutable-place Γ ⊢ e₁ : T
Γ ⊢ e₂ : T₂ T₂ ⊑ T
─────────────────────────────────────
Γ ⊢ e₁ = e₂ : T (§6.1.3)
(T-CompAssign) e₁ op= e₂ desugars to e₁ = e₁ op e₂; apply T-Assign (§6.1.3)
Postfix: field access
(T-Field) Γ ⊢ e : struct S Σ(S).f = T
─────────────────────────────────
Γ ⊢ e.f : T
If f is not a field of S, error E0503. If e does not have a struct type, error E0502.
Postfix: array index
(T-Index) Γ ⊢ e : [T; N] Γ ⊢ i : U U ∈ Unsigned
──────────────────────────────────────────────
Γ ⊢ e[i] : T
If e is not an array type, error E0600. If i is not an unsigned integer, error E0601.
Postfix: function call
Functions are looked up in the function table Φ, separate from variable bindings.
The callee of a call expression must be a bare identifier (function pointers are not yet
supported).
(T-FuncName) Φ(f) = (T₁, …, Tₙ) -> T_ret
──────────────────────────────────────────────────────
Γ ⊢ f : FnSig(T₁, …, Tₙ) -> T_ret (internal type; §1.5)
(T-Call) Γ ⊢ e_f : FnSig(T₁, …, Tₙ) -> T_ret
Γ ⊢ a₁ : U₁ U₁ ⊑ T₁
⋮
Γ ⊢ aₙ : Uₙ Uₙ ⊑ Tₙ
──────────────────────────────────────────────────────
Γ ⊢ e_f(a₁, …, aₙ) : T_ret
If the argument count does not match the parameter count, error E0205. If argument type
Uᵢ is not promotable to Tᵢ, error E0204. If the name is not in Φ, error E0102.
Struct literal
(T-StructLit) S ∈ dom(Σ)
Σ(S) = { f₁: T₁, …, fₙ: Tₙ } (n fields, in declaration order)
the literal supplies fields {g₁, …, gₖ} (in any order)
{g₁, …, gₖ} = {f₁, …, fₙ} (same set; field order is irrelevant)
∀ i ∈ 1..n. Γ ⊢ e_{gᵢ} : Uᵢ ∧ Uᵢ ⊑ Tᵢ
──────────────────────────────────────────────────────────────────
Γ ⊢ S { g₁: e₁, …, gₙ: eₙ } : struct S
The set equality {g₁, …, gₖ} = {f₁, …, fₙ} is strict:
- A field present in
Σ(S)but absent from the literal → error E0500 (missing field). - A field present in the literal but absent from
Σ(S)→ error E0501 (extra field). - A field listed twice in the literal → error E0501 (duplicate counts as extra).
8. Statement Typing
8.1 Judgment form
Γ, A, R, L ⊢ s ✓ ⇒ Γ', A', cf
| Component | Meaning |
|---|---|
Γ |
Type environment (maps names to (T, m)); same as §3 |
A |
Definitely-assigned set — A ⊆ dom(Γ); names provably initialised on all paths so far |
R |
Expected return type of the enclosing function |
L |
Loop context: L ∈ { noloop, inloop } |
Γ' |
Output type environment (may have new bindings from let) |
A' |
Output definitely-assigned set |
cf |
Control-flow outcome: cf ∈ { normal, diverges } |
cf = diverges means the statement never falls through to the next statement (because of return, break, or continue). cf = normal means it may fall through.
T-Var from §7.3 is refined to require definite assignment:
(T-Var) Γ(x) = (T, m) x ∈ A
──────────────────────────
Γ, A ⊢ x : T
If x ∉ A (declared but not yet definitely assigned), it is a type error (use-before-init, error E0100 extended).
8.2 Definite-assignment join
When two control-flow paths merge (in if-else), the output A is the intersection
of the two branch sets, intersected with the outer A:
join_A(A_then, A_else, A) = A ∪ (A_then ∩ A_else)
A variable becomes definitely assigned after the merge only if it was assigned in both branches. Variables assigned in only one branch remain unassigned after the merge.
For conditional statements without an else (or while/loop with uncertain iterations),
the output A is conservatively A (the input set, unchanged):
join_A(A_branch, A) = A -- for single-branch paths
8.3 Unreachable code
Any statement that immediately follows a statement with cf = diverges is unreachable.
Unreachable statements are still type-checked (to catch errors in dead code) but the
compiler emits warning W001 for the first unreachable statement in each run of dead
code.
8.4 Loop context
L is a single flag that threads through every statement rule unchanged, except that
while and loop bodies are checked with L = inloop regardless of the outer L.
L_body = inloop (when checking body of while/loop)
L_inner = L (for if/else/block — L passes through unchanged)
This makes break and continue valid anywhere inside a loop body, regardless of how
many intermediate if, else, or inner blocks separate them from the while/loop
keyword. Crossing a function boundary resets L = noloop at the function level (§9).
8.5 Statement rules
T-Let
(T-Let-Infer) Γ, A ⊢ e : T (T synthesised; no annotation)
─────────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ let [mut] x = e ; ✓ ⇒ Γ, x:(T,m) A∪{x} normal
(T-Let-Ann-Init) ⊢ T wf
Γ, A ⊢ e : U U ⊑ T
-- if e is INT_LIT/FLOAT_LIT and T is numeric: U = T directly (§5.5)
─────────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ let [mut] x : T = e ; ✓ ⇒ Γ, x:(T,m) A∪{x} normal
(T-Let-Ann-NoInit) ⊢ T wf
─────────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ let [mut] x : T ; ✓ ⇒ Γ, x:(T,m) A normal
-- x is NOT added to A; use before assignment → E0100
No annotation and no initialiser → error E1000.
In all T-Let rules, m = imm if no mut keyword, m = mut if mut is present.
T-Return
(T-Return-Val) Γ, A ⊢ e : U U ⊑ R
─────────────────────────────────────────────────────
Γ, A, R, L ⊢ return e ; ✓ ⇒ Γ A diverges
(T-Return-Unit) R = ()
─────────────────────────────────────────────────────
Γ, A, R, L ⊢ return ; ✓ ⇒ Γ A diverges
T-ExprStmt
(T-ExprStmt) Γ, A ⊢ e : T
A' = A ∪ { x | x ∈ assigns(e) ∩ dom(Γ) }
─────────────────────────────────────────────────────
Γ, A, R, L ⊢ e ; ✓ ⇒ Γ A' normal
Where assigns(e) collects all variables directly assigned anywhere in e:
assigns(x = e') = {x} ∪ assigns(e')
assigns(x op= e') = assigns(e') -- x already def (op uses it)
assigns(f(e₁,…)) = ∪ assigns(eᵢ)
assigns(S { f₁: e₁, …, fₙ: eₙ }) = ∪ assigns(eᵢ)
assigns(e.f) = assigns(e)
assigns(e[i]) = assigns(e) ∪ assigns(i)
assigns((e)) = assigns(e)
assigns(op e) = assigns(e)
assigns(e₁ binop e₂) = assigns(e₁) ∪ assigns(e₂)
assigns(literal/ident) = ∅
T-If
(T-If-Then) Γ, A ⊢ cond : bool
Γ, A, R, L ⊢ { then_block } ✓ ⇒ _, A_then, _
─────────────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ if cond { then_block } ✓ ⇒ Γ A normal
-- no else: A unchanged (branch may not execute)
(T-If-Else) Γ, A ⊢ cond : bool
Γ, A, R, L ⊢ { then_block } ✓ ⇒ _, A_then, cf_t
Γ, A, R, L ⊢ else_branch ✓ ⇒ _, A_else, cf_e
─────────────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ if cond { then_block } else else_branch ✓
⇒ Γ join_A(A_then, A_else, A) join_cf(cf_t, cf_e)
else_branch is either an if statement (else-if chain) or a block { … }.
join_cf(diverges, diverges) = diverges
join_cf(normal, _) = normal
join_cf(_, normal) = normal
The Γ output of T-If discards any bindings introduced inside the branches (scope exit,
§3.3); only the outer Γ is returned.
T-While
(T-While) Γ, A ⊢ cond : bool
Γ, A, R, inloop ⊢ { body } ✓ ⇒ _, _, _
─────────────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ while cond { body } ✓ ⇒ Γ A normal
The body is checked with L = inloop. The output A is conservatively A (the loop
may execute zero times). cf = normal (the loop may terminate when the condition
becomes false).
T-Loop
(T-Loop) Γ, A, R, inloop ⊢ { body } ✓ ⇒ _, _, _
─────────────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ loop { body } ✓ ⇒ Γ A normal
cf = normal conservatively (a break inside exits the loop). Output A is A
(the body may be bypassed by an immediate break).
T-Break / T-Continue
(T-Break) L = inloop
─────────────────────────────────────────
Γ, A, R, L ⊢ break ; ✓ ⇒ Γ A diverges
(T-Continue) L = inloop
─────────────────────────────────────────
Γ, A, R, L ⊢ continue ; ✓ ⇒ Γ A diverges
If L = noloop, break → E0800, continue → E0801.
T-Block
A block { s₁; s₂; …; sₙ } is checked by chaining statements sequentially. Bindings
introduced inside are removed on exit (scope restriction, §3.3).
(T-Block-Empty) ───────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ { } ✓ ⇒ Γ A normal
(T-Block-Step) Γ, A, R, L ⊢ s₁ ✓ ⇒ Γ₁, A₁, normal
Γ₁, A₁, R, L ⊢ { s₂; …; sₙ } ✓ ⇒ Γₙ, Aₙ, cf
───────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ { s₁; s₂; …; sₙ } ✓ ⇒ restrict(Γₙ, Γ) restrict_A(Aₙ, Γ) cf
(T-Block-Dead) Γ, A, R, L ⊢ s₁ ✓ ⇒ Γ₁, A₁, diverges
-- s₂, …, sₙ are unreachable; emit W001; still type-check them
Γ₁, A₁, R, L ⊢ { s₂; …; sₙ } ✓ ⇒ _, _, _ (results discarded)
───────────────────────────────────────────────────────────────
Γ, A, R, L ⊢ { s₁; s₂; …; sₙ } ✓ ⇒ restrict(Γ₁, Γ) restrict_A(A₁, Γ) diverges
Where:
restrict(Γ', Γ)removes all bindings inΓ'that are not indom(Γ)(inner let bindings).restrict_A(A', Γ)removes all names fromA'that are not indom(Γ).
9. Function Definitions
Judgement form: Σ, Φ ⊢ func_def ✓
The function table Φ is pre-populated with the signatures of all functions in the
program before any body is checked (first pass, §11). This gives every function access
to its own name for recursion and to any other function regardless of declaration order.
9.1 Building the initial environment
Given a function fn f(p₁, …, pₙ) [-> R] { body } where each parameter is
[mut] xᵢ : Tᵢ:
Γ_init = x₁:(T₁, m₁), …, xₙ:(Tₙ, mₙ) -- ordered; rightmost wins for lookup
A_init = { x₁, …, xₙ } -- all params are definitely assigned on entry
where mᵢ = mut if mut is written, mᵢ = imm otherwise.
All parameter types must be well-formed: ⊢ Tᵢ wf for each i.
Duplicate parameter names are an error (E0902): all names x₁, …, xₙ must be
pairwise distinct. The check is performed before the environment is built.
9.2 Return type
The declared return type R is taken from the -> T annotation. If the annotation is
absent, R = () (unit; the function returns no value).
9.3 All-paths-return
The function body is checked as a block statement with L = noloop. The resulting
control-flow outcome cf determines whether the return requirement is satisfied:
| Return type | Required cf |
Otherwise |
|---|---|---|
R = () |
normal or diverges |
— always OK |
R ≠ () |
diverges |
error E1001 |
When R = () and the body falls through (cf = normal), the function implicitly
returns () at end of body — no explicit return is needed. An explicit return;
(bare, no value) is also allowed anywhere in a () function (T-Return-Unit, §8).
When R ≠ (), every control-flow path through the body must end with an explicit
return e; where e : U and U ⊑ R. If the body's cf = normal (some path falls
through without a return), error E1001 is reported.
Note on loop conservatism: while and loop statements always output cf = normal
(§8.4). A function body that ends with a loop { return e; } and no following return
will fail the all-paths check (cf = normal from the loop). Add an explicit return
after the loop to satisfy the checker, or use if-else chains that both return.
9.4 Formal rule
(T-FuncDef) f ∈ dom(Φ) Φ(f) = (T₁, …, Tₙ) -> R -- f pre-loaded by §11 first pass
∀ i. ⊢ Tᵢ wf -- param types well-formed
all parameter names x₁, …, xₙ distinct -- else E0902
Γ_init = x₁:(T₁,m₁), …, xₙ:(Tₙ,mₙ)
A_init = { x₁, …, xₙ }
Γ_init, A_init, R, noloop ⊢ { body } ✓ ⇒ _, _, cf
R = () or cf = diverges -- else E1001
──────────────────────────────────────────────────────────────────────────
Σ, Φ ⊢ fn f(params) [-> R] { body } ✓
9.5 Recursion
Direct recursion (f calling f) and mutual recursion (f calling g, g calling
f) are both permitted. Because Φ is fully populated before any body is checked
(§11), T-FuncName (§7.3) can resolve any call to f inside f's own body without
any special treatment in T-FuncDef.
10. Struct Definitions
Judgement form: Σ ⊢ struct_def ✓
Σ is pre-populated with all struct names in the program before any struct body is
checked (first pass, §11). This allows forward references and mutually recursive pointer
types.
10.1 Size-dependency and cycles
Define struct-names(T) — the set of struct names that appear directly (not behind
a pointer) in type T:
struct-names(struct S) = { S }
struct-names([T; N]) = struct-names(T) -- array has same direct dependency as elem
struct-names(*T) = ∅ -- pointer indirection breaks size dependency
struct-names(*mut T) = ∅
struct-names(*opaque) = ∅
struct-names(*mut opaque) = ∅
struct-names(primitive) = ∅
struct-names(()) = ∅
Define the direct-uses relation on struct names:
direct-uses(S) = ⋃ { struct-names(Tᵢ) | fᵢ : Tᵢ ∈ Σ(S) }
Let TC(direct-uses) be its transitive closure. A struct S has an infinite-size
cycle if S ∈ TC(direct-uses)(S). Such cycles are forbidden (error E0900).
Examples:
struct Node { next: *Node } -- direct-uses(Node) = ∅ → no cycle ✓
struct Bad { inner: Bad } -- direct-uses(Bad) = {Bad} → cycle E0900
struct A { b: B } -- direct-uses(A) = {B}
struct B { a: *A } -- direct-uses(B) = ∅ → A→B→ (stop); no cycle ✓
struct C { d: D } -- direct-uses(C) = {D}
struct D { c: C } -- direct-uses(D) = {C} → C→D→C: cycle E0900
10.2 Formal rule
(T-StructDef) S ∈ dom(Σ)
Σ(S) = { f₁: T₁, …, fₙ: Tₙ }
all field names f₁, …, fₙ pairwise distinct -- else E0901
∀ i ∈ 1..n. ⊢ Tᵢ wf -- else E0101 / type error
S ∉ TC(direct-uses)(S) -- else E0900
──────────────────────────────────────────────────────────────────────
Σ ⊢ struct S { f₁: T₁, …, fₙ: Tₙ } ✓
An empty struct (n = 0) is valid — it has no fields to check and no cycles.
11. Program Typing
Judgement form: ⊢ program ✓
A program is a sequence of top-level definitions (structs and functions) in any order. Checking proceeds in four passes to support mutual recursion and forward references.
11.1 Passes
Pass 1 — Collect declarations
Scan every top-level definition and populate:
Σ— map from struct name to the sequence of field names/types (bodies not checked yet)Φ— map from function name to its signature(T₁,…,Tₙ) -> R(bodies not checked yet)
Errors detected in pass 1:
- E0103: two
structdefinitions with the same name. - E0104: two
fndefinitions with the same name. - E0101 (forward): a type annotation in a signature or field references a struct name not in
Σ— reported immediately since signatures are parsed in full during pass 1.
Pass 2 — Check struct bodies
For each struct definition, apply T-StructDef (§10.2) under the completed Σ. Errors
checked: E0901 (duplicate field), ⊢ Tᵢ wf (E0101), E0900 (size cycle). The cycle check
is performed globally across all structs using TC(direct-uses).
Pass 3 — Check function bodies
For each function definition, apply T-FuncDef (§9.4) under the completed Σ and Φ.
Errors checked: E0902 (duplicate param), E1001 (missing return), and all expression/
statement errors from §6–§9.
Pass 4 — Entry point (optional)
If the compiler is invoked with --entry <name>:
name ∈ dom(Φ)— else error E0102 (undefined function for entry).- The signature
Φ(name)must be compatible with the entry convention (currently no required signature; any function may serve as entry).
11.2 Formal rule
(T-Program) Σ, Φ built from pass 1 (no duplicates — E0103/E0104)
∀ struct_def in program. Σ ⊢ struct_def ✓ -- pass 2
∀ func_def in program. Σ, Φ ⊢ func_def ✓ -- pass 3
entry ∈ dom(Φ) (if --entry specified) -- pass 4
──────────────────────────────────────────────────────────────────────
⊢ program ✓
12. Error Catalogue
Each error has a code, a category, a message template (using {…} for placeholders),
and the spans to highlight in the diagnostic. Warnings use the W prefix.
12.1 Name errors (E01xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E0100 | Undefined variable | cannot find value '{name}' in this scope |
use site |
| E0101 | Undefined struct type | cannot find type '{name}' in this scope |
type annotation |
| E0102 | Undefined function | cannot find function '{name}' in this scope |
call site |
| E0103 | Duplicate struct name | struct '{name}' is defined more than once |
second definition |
| E0104 | Duplicate function name | function '{name}' is defined more than once |
second definition |
Note: use of a declared-but-uninitialized variable (§8.1) is reported as E0100 with
the message use of possibly-uninitialized variable '{name}'.
12.2 Type mismatch errors (E02xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E0200 | Operator type mismatch | operator '{op}' cannot be applied to types '{T}' and '{U}' |
operator |
| E0201 | Assignment type mismatch | cannot assign value of type '{U}' to binding of type '{T}' |
initialiser expr |
| E0202 | Condition not bool | condition must be of type 'bool', found '{T}' |
condition expr |
| E0203 | Return type mismatch | cannot return value of type '{U}' from function returning '{R}' |
return expr |
| E0204 | Argument type mismatch | argument {i} has type '{U}', expected '{T}' |
argument expr |
| E0205 | Wrong argument count | function '{f}' expects {n} argument(s) but {k} were supplied |
call expression |
12.3 Mutability errors (E03xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E0300 | Assign to immutable | cannot assign to '{name}' because it is not declared as 'mut' |
LHS, binding site |
| E0301 | Assign to non-place | left-hand side of assignment is not a valid place expression |
LHS expr |
12.4 Operator errors (E04xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E0400 | Operator type mismatch | operator '{op}' requires compatible numeric types, found '{T}' and '{U}' |
operator |
| E0401 | Shift amount not unsigned | shift amount must be an unsigned integer type, found '{U}' |
RHS of shift |
12.5 Struct literal errors (E05xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E0500 | Missing struct field | missing field '{f}' in initialiser for struct '{S}' |
struct literal |
| E0501 | Extra struct field | struct '{S}' has no field named '{f}' |
field name in literal |
| E0502 | Field on non-struct | type '{T}' has no fields |
field access expr |
| E0503 | No such field | struct '{S}' has no field named '{f}' |
field name in access |
12.6 Index errors (E06xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E0600 | Index on non-array | type '{T}' cannot be indexed |
index expr |
| E0601 | Index type not unsigned int | array index must be an unsigned integer type, found '{U}' |
index expr |
12.7 Pointer errors (E07xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E0700 | Deref of non-ptr | type '{T}' cannot be dereferenced |
deref expr |
| E0701 | Addrof non-place | cannot take the address of a temporary value |
addrof expr |
12.8 Control-flow errors (E08xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E0800 | break outside loop |
'break' used outside of a loop |
break keyword |
| E0801 | continue outside loop |
'continue' used outside of a loop |
continue keyword |
12.9 Struct definition errors (E09xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E0900 | Recursive struct (no indirection) | struct '{S}' has infinite size due to recursive field '{f}: {T}' |
field type |
| E0901 | Duplicate struct field | field '{f}' is defined more than once in struct '{S}' |
second field |
| E0902 | Duplicate function parameter | parameter '{x}' is defined more than once in function '{f}' |
second parameter |
12.10 Binding errors (E10xx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| E1000 | No type for binding | cannot infer type for '{x}': no annotation and no initialiser |
let keyword |
| E1001 | Missing return | function '{f}' must return '{R}' but not all paths return a value |
function name |
12.11 Warnings (Wxxx)
| Code | Category | Message template | Span(s) |
|---|---|---|---|
| W001 | Unreachable code | unreachable statement |
first unreachable statement |