flux/SYNTAX.md
Jooris Hadeler 73e36fac71 Initial Flux language specification
Add the LL(1) context-free grammar (GRAMMAR.ebnf), token and syntax
reference (SYNTAX.md), LL(1) verification tool (ll1_check.py), and a
fibonacci example demonstrating the language.
2026-03-10 14:41:54 +01:00


Flux Language Syntax Reference

Lexical Tokens

All tokens listed here are produced by the lexer (lexical analysis phase) and appear as UPPERCASE terminals in GRAMMAR.ebnf.

Literals

Token Description Examples
INT_LIT Integer literal (decimal, hex 0x, octal 0o, binary 0b) 42, 0xFF, 0o77, 0b1010
FLOAT_LIT Floating-point literal 3.14, 1.0e-9, 0.5
STRING_LIT Double-quoted UTF-8 string, supports \n \t \\ \" escape sequences "hello\nworld"
CHAR_LIT Single-quoted Unicode scalar value 'a', '\n', '\u{1F600}'
TRUE Boolean true literal true
FALSE Boolean false literal false
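
As a rough illustration of the four INT_LIT spellings listed above, the Python sketch below converts a lexeme to its integer value. The function name `int_lit_value` is hypothetical and not part of the Flux tooling. It relies on Python's own prefix handling through `int(text, 0)`, which accepts the same `0x`, `0o`, and `0b` prefixes. Python additionally accepts uppercase prefixes and digit-separating underscores, which this reference does not mention, so a real lexer would need its own validation.

```python
def int_lit_value(lexeme: str) -> int:
    """Convert an INT_LIT lexeme (decimal, 0x, 0o, 0b) to its integer value.

    Assumption: base 0 lets Python choose the radix from the prefix.
    """
    return int(lexeme, 0)


for lexeme in ["42", "0xFF", "0o77", "0b1010"]:
    print(lexeme, "->", int_lit_value(lexeme))
```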

Identifier

Token Description
IDENT Identifier: starts with a letter or _, followed by letters, digits, or _. Unicode letters are permitted.

Operator Tokens

Token Lexeme Description
PLUS + Addition (unary plus is not part of the grammar)
MINUS - Subtraction / unary negation
STAR * Multiplication / pointer dereference
SLASH / Division
PERCENT % Modulo (remainder)
AMP & Bitwise AND / address-of
PIPE | Bitwise OR
CARET ^ Bitwise XOR
BANG ! Logical NOT
TILDE ~ Bitwise NOT
DOT . Member access

Keyword Tokens

Operator Keywords

Lexeme Description
and Logical AND
or Logical OR

Boolean Literals

Lexeme Description
true Boolean true value
false Boolean false value

Primitive Type Keywords

Lexeme Description
u8 Unsigned 8-bit integer
u16 Unsigned 16-bit integer
u32 Unsigned 32-bit integer
u64 Unsigned 64-bit integer
i8 Signed 8-bit integer
i16 Signed 16-bit integer
i32 Signed 32-bit integer
i64 Signed 64-bit integer
f32 32-bit IEEE 754 floating-point
f64 64-bit IEEE 754 floating-point
bool Boolean (true or false)
char Unicode scalar value (32-bit)

Pointer Keyword

Lexeme Description
opaque Used in *opaque to denote a pointer with no type info

Statement Keywords

Lexeme Description
let Introduces a variable binding
mut Marks a let binding or function parameter as mutable
return Exits the enclosing function
if Conditional statement
else Alternative branch of an if
while Condition-controlled loop
loop Infinite loop
break Exit the immediately enclosing loop
continue Skip to the next iteration of a loop

Definition Keywords

Lexeme Description
fn Introduces a function definition
struct Introduces a struct definition

Lexer note: All keywords above are reserved and must be recognised before the general IDENT rule. An identifier may not shadow any keyword.
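
The lexer note above can be sketched in Python as a check that runs the keyword lookup before the general identifier rule. The keyword set is collected from the tables in this section. The function name `classify_word`, the uppercase kind names for keywords other than TRUE and FALSE, and the use of Python's `\w` as a stand-in for "Unicode letter" are all assumptions made for illustration.

```python
import re

# Reserved words collected from the keyword tables above.
KEYWORDS = {
    "and", "or", "true", "false",
    "u8", "u16", "u32", "u64", "i8", "i16", "i32", "i64",
    "f32", "f64", "bool", "char", "opaque",
    "let", "mut", "return", "if", "else", "while", "loop",
    "break", "continue", "fn", "struct",
}

# Approximation of the IDENT rule: a letter or "_" first, then letters,
# digits, or "_". [^\W\d] means "a word character that is not a digit".
IDENT_RE = re.compile(r"[^\W\d]\w*")


def classify_word(word: str) -> str:
    """Return a token kind for a whole word, checking keywords first."""
    if word in KEYWORDS:
        return word.upper()          # hypothetical kind name, e.g. "WHILE"
    if IDENT_RE.fullmatch(word):
        return "IDENT"
    raise ValueError(f"not an identifier or keyword: {word!r}")


print(classify_word("while"))    # the keyword wins over IDENT
print(classify_word("while_1"))  # a longer word is an ordinary identifier
```

Matching the whole word first and only then consulting the keyword set keeps words such as `while_1` or `iffy` from being split into a keyword plus a suffix.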

Delimiter / Punctuation Tokens

Token Lexeme Description
LPAREN ( Left parenthesis
RPAREN ) Right parenthesis
LBRACKET [ Left square bracket
RBRACKET ] Right square bracket
COMMA , Argument / element separator
SEMICOLON ; Statement terminator / array size separator ([T; N])
LCURLY { Block / compound expression open
RCURLY } Block / compound expression close
ARROW -> Function return type separator
COLON : Type annotation separator

Expressions

Expressions produce a value. The grammar defines them through a hierarchy of precedence levels. The table below lists the levels from lowest precedence (level 1, binds least tightly) to highest precedence (level 10, binds most tightly).

Operator Precedence Table

Level Operators Associativity Description
1 or left Logical OR (lowest)
2 and left Logical AND
3 | left Bitwise OR
4 ^ left Bitwise XOR
5 & left Bitwise AND
6 + - left Addition, subtraction
7 * / % left Multiplication, division, modulo
8 ! ~ - * & right (unary) Prefix unary operators
9 . […] (…) left (postfix) Member access, index, call
10 literals, identifiers, () Primary expressions (highest)

Operator Descriptions

Binary Operators

Operator Name Example Notes
or Logical OR a or b Short-circuits; both operands must be bool
and Logical AND a and b Short-circuits; both operands must be bool
| Bitwise OR a | b Integer types
^ Bitwise XOR a ^ b Integer types
& Bitwise AND a & b Integer types (binary context)
+ Addition a + b
- Subtraction a - b
* Multiplication a * b Binary context (both operands are values)
/ Division a / b Integer division truncates toward zero
% Modulo a % b Sign follows the dividend
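
The notes on / and % matter for negative operands. The Python sketch below models the stated behaviour (quotient truncated toward zero, remainder taking the sign of the dividend). The names `flux_div` and `flux_mod` are hypothetical. Python's built-in `//` and `%` are deliberately not used directly on signed values, because `//` rounds toward negative infinity and `%` follows the sign of the divisor.

```python
def flux_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def flux_mod(a: int, b: int) -> int:
    """Remainder whose sign follows the dividend a."""
    return a - b * flux_div(a, b)


for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(a, b, flux_div(a, b), flux_mod(a, b))
```

With these definitions, `a == b * flux_div(a, b) + flux_mod(a, b)` holds for every non-zero `b`.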

Unary Prefix Operators

Operator Name Example Notes
! Logical NOT !cond Operand must be bool
~ Bitwise NOT ~mask Bitwise complement; integer types
- Negation -x Arithmetic negation
* Dereference *ptr Unary context; operand must be a pointer type
& Address-of &x Unary context; produces a pointer to the operand

Postfix Operators

Operator Name Example Notes
. Member access obj.field Accesses a named field or method of a struct/type
[…] Subscript arr[i] Indexes into an array, slice, or map
(…) Call f(a, b) Invokes a function or closure

Disambiguation: * and & are context-sensitive. When appearing as the first token of a unary_expr they are unary (dereference / address-of). When appearing between two unary_expr sub-trees inside multiplicative_expr or bitand_expr they are binary (multiplication / bitwise AND). The parser resolves this purely from grammatical position — no look-ahead beyond 1 token is required.
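
To make the precedence table and the disambiguation rule concrete, here is a minimal recursive-descent sketch in Python that returns a fully parenthesised form of an expression. It is not the Flux parser: the class name `Parser`, the helper `parenthesise`, and the simplified regex tokenizer (which silently skips characters it does not recognise) are assumptions for illustration. Only the operators listed in the table are handled.

```python
import re

TOKEN_RE = re.compile(r"\d+(?:\.\d+)?|\w+|[-+*/%&|^!~.\[\](),]")

# Binary levels 1-7 of the precedence table, lowest precedence first.
BINARY_LEVELS = [["or"], ["and"], ["|"], ["^"], ["&"], ["+", "-"], ["*", "/", "%"]]
UNARY_OPS = {"!", "~", "-", "*", "&"}


class Parser:
    def __init__(self, source: str) -> None:
        self.tokens = TOKEN_RE.findall(source)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self) -> str:
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expect(self, lexeme: str) -> None:
        token = self.advance()
        if token != lexeme:
            raise SyntaxError(f"expected {lexeme!r}, got {token!r}")

    def parse_expr(self) -> str:
        return self.parse_binary(0)

    def parse_binary(self, level: int) -> str:
        if level == len(BINARY_LEVELS):
            return self.parse_unary()
        left = self.parse_binary(level + 1)
        # Left-associative: fold each further operand into the left side.
        while self.peek() in BINARY_LEVELS[level]:
            op = self.advance()
            right = self.parse_binary(level + 1)
            left = f"({left} {op} {right})"
        return left

    def parse_unary(self) -> str:
        # At the start of a unary_expr, * and & mean dereference / address-of.
        if self.peek() in UNARY_OPS:
            op = self.advance()
            return f"({op}{self.parse_unary()})"
        return self.parse_postfix()

    def parse_postfix(self) -> str:
        node = self.parse_primary()
        while True:
            if self.peek() == ".":
                self.advance()
                node = f"{node}.{self.advance()}"
            elif self.peek() == "[":
                self.advance()
                index = self.parse_expr()
                self.expect("]")
                node = f"{node}[{index}]"
            elif self.peek() == "(":
                self.advance()
                args = []
                if self.peek() != ")":
                    args.append(self.parse_expr())
                    while self.peek() == ",":
                        self.advance()
                        args.append(self.parse_expr())
                self.expect(")")
                node = f"{node}({', '.join(args)})"
            else:
                return node

    def parse_primary(self) -> str:
        if self.peek() == "(":
            self.advance()
            inner = self.parse_expr()
            self.expect(")")
            return inner
        token = self.advance()
        if not re.fullmatch(r"\w+|\d+\.\d+", token):
            raise SyntaxError(f"unexpected token {token!r}")
        return token  # identifier or literal


def parenthesise(source: str) -> str:
    parser = Parser(source)
    result = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result


print(parenthesise("a + b * c - d % 2"))  # ((a + (b * c)) - (d % 2))
print(parenthesise("a * *p"))             # (a * (*p))
```

In `a * *p` the first `*` is consumed by the loop in `parse_binary` (binary position) and the second by `parse_unary` (prefix position), so one token of look-ahead is enough.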

Parenthesised Expressions

Any expression may be wrapped in parentheses to override default precedence:

(a + b) * c

Function Call Argument List

Arguments are comma-separated expressions. A trailing comma is not permitted at this grammar level.

f()
f(x)
f(x, y, z)

Examples

// Arithmetic
a + b * c - d % 2

// Bitwise
flags & MASK | extra ^ toggle

// Logical
ready and not_done or fallback

// Mixed unary / postfix (postfix binds tighter than prefix)
*ptr.field      // parsed as *(ptr.field)
&arr[i]         // parsed as &(arr[i])
!cond

// Chained postfix
obj.method(arg1, arg2)[0].name

// Explicit precedence override
(a or b) and c

Types

Types describe the shape and interpretation of values. All type positions in the grammar reference the type non-terminal.

Primitive Types

Primitive types are single-keyword types built into the language.

Type Kind Width Range / Notes
u8 Unsigned integer 8-bit 0 … 255
u16 Unsigned integer 16-bit 0 … 65 535
u32 Unsigned integer 32-bit 0 … 4 294 967 295
u64 Unsigned integer 64-bit 0 … 2⁶⁴ − 1
i8 Signed integer 8-bit −128 … 127
i16 Signed integer 16-bit −32 768 … 32 767
i32 Signed integer 32-bit −2 147 483 648 … 2 147 483 647
i64 Signed integer 64-bit −2⁶³ … 2⁶³ − 1
f32 Floating-point 32-bit IEEE 754 single precision
f64 Floating-point 64-bit IEEE 754 double precision
bool Boolean 1 byte true or false
char Unicode scalar 32-bit Any Unicode scalar value (not a surrogate)
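
The integer ranges in the table follow directly from the bit width. The short Python sketch below recomputes them. The helper name `int_range` is hypothetical, and the signed case assumes a two's-complement representation, which is what the asymmetric ranges above imply.

```python
def int_range(bits: int, signed: bool) -> tuple[int, int]:
    """Return (min, max) for an unsigned or two's-complement integer type."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for name, bits, signed in [("u8", 8, False), ("i8", 8, True),
                           ("i32", 32, True), ("u64", 64, False)]:
    low, high = int_range(bits, signed)
    print(f"{name}: {low} to {high}")
```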

Named Types

A named type is any user-defined type referenced by its identifier — typically a struct name. Because all primitive-type keywords (u8, bool, etc.) are reserved, an IDENT in type position is always a named type, never a primitive.

Point        // struct Point { x: f32, y: f32 }
Node         // struct Node { value: i64, next: *Node }
*Point       // pointer to a named type
[Node; 8]    // array of a named type

Pointer Types

A pointer type is written with a leading *.

Syntax Description
*T Typed pointer — points to a value of type T
*opaque Opaque pointer — no compile-time pointee type information; equivalent to C's void *

Pointer types may be nested: **u8 is a pointer to a pointer to u8.

*u8          // pointer to u8
**i32        // pointer to pointer to i32
*opaque      // untyped pointer
**opaque     // pointer to untyped pointer

Array Types

Arrays have a fixed size known at compile time.

[ <element-type> ; <size> ]

<size> must be a non-negative integer literal (INT_LIT). The element type may itself be any type, including pointers or nested arrays.

[u8; 256]          // array of 256 u8 values
[*u8; 4]           // array of 4 pointers to u8
[[f32; 3]; 3]      // 3×3 matrix of f32 (array of arrays)
[*opaque; 8]       // array of 8 opaque pointers

Type Grammar Summary

type           = primitive_type | named_type | pointer_type | array_type ;
primitive_type = "u8" | "u16" | "u32" | "u64"
               | "i8" | "i16" | "i32" | "i64"
               | "f32" | "f64" | "bool" | "char" ;
named_type     = IDENT ;
pointer_type   = "*" , ( "opaque" | type ) ;
array_type     = "[" , type , ";" , INT_LIT , "]" ;
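
The type grammar is small enough to parse with one recursive function. The Python sketch below follows the four alternatives of `type` and returns a plain-English description. The names `parse_type` and `describe_type` are hypothetical, the tokenizer is a simplified regex, and array sizes are restricted to decimal digits for brevity even though INT_LIT also allows prefixed forms. A real lexer would already have tagged keywords, so the named-type check here is only an approximation.

```python
import re

TOKEN_RE = re.compile(r"\w+|[*\[\];]")
PRIMITIVES = {"u8", "u16", "u32", "u64", "i8", "i16", "i32", "i64",
              "f32", "f64", "bool", "char"}


def parse_type(tokens: list[str], pos: int = 0) -> tuple[str, int]:
    """Parse one `type` starting at tokens[pos]; return (description, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input in type")
    token = tokens[pos]
    if token == "*":                                   # pointer_type
        if pos + 1 < len(tokens) and tokens[pos + 1] == "opaque":
            return "opaque pointer", pos + 2
        inner, pos = parse_type(tokens, pos + 1)
        return f"pointer to {inner}", pos
    if token == "[":                                   # array_type
        element, pos = parse_type(tokens, pos + 1)
        tail = tokens[pos:pos + 3]                     # expect ; INT_LIT ]
        if len(tail) != 3 or tail[0] != ";" or not tail[1].isdigit() or tail[2] != "]":
            raise SyntaxError("expected '; <size> ]' after array element type")
        return f"array of {tail[1]} {element}", pos + 3
    if token in PRIMITIVES:                            # primitive_type
        return token, pos + 1
    if re.fullmatch(r"[^\W\d]\w*", token) and token != "opaque":
        return f"named type {token}", pos + 1          # named_type
    raise SyntaxError(f"unexpected token {token!r} in type")


def describe_type(source: str) -> str:
    tokens = TOKEN_RE.findall(source)
    description, pos = parse_type(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r} after type")
    return description


print(describe_type("**opaque"))       # pointer to opaque pointer
print(describe_type("[[f32; 3]; 3]"))  # array of 3 array of 3 f32
```

Each alternative starts with a distinct token (`*`, `[`, a primitive keyword, or an identifier), which is why a single token of look-ahead selects the branch.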

Struct Literals

A struct literal constructs a value of a named struct type by providing values for each field.

<TypeName> { <field>: <expr>, ... }

Fields may appear in any order and need not match the declaration order. No trailing comma is permitted.

Examples

let p = Point { x: 1.0, y: 2.0 };

let n = Node {
    value: 42,
    next: get_next()
};

// Nested struct literal
let outer = Rect {
    origin: Point { x: 0.0, y: 0.0 },
    size: Point { x: 10.0, y: 5.0 }
};

// Empty struct
let u = Unit {};

Struct Literals in Conditions

Struct literals are not permitted as the outermost expression in if and while conditions. This restriction exists because { after the condition is ambiguous — it could start a struct literal body or the statement block.

// ERROR — ambiguous: is `{` a struct body or the if block?
if Flags { verbose: true } { ... }

// OK — parentheses resolve the ambiguity
if (Flags { verbose: true }).verbose { ... }

The grammar enforces this through the expr_ns (no-struct) hierarchy used in condition positions. Struct literals remain valid everywhere else: let, return, function arguments, field values, etc.

Struct Literal Grammar Summary

primary_expr      = IDENT , [ struct_lit_body ] | INT_LIT | FLOAT_LIT
                  | STRING_LIT | CHAR_LIT | "true" | "false"
                  | "(" , expr , ")" ;
struct_lit_body   = "{" , struct_field_list , "}" ;
struct_field_list = [ struct_field , { "," , struct_field } ] ;
struct_field      = IDENT , ":" , expr ;

No-Struct Expression (expr_ns)

expr_ns is a parallel expression hierarchy identical to expr except its primary level (primary_expr_ns) does not allow the struct_lit_body suffix after an IDENT. Struct literals are still permitted when enclosed in parentheses ("(" , expr , ")"), because the ( unambiguously marks the start of a grouped expression.

if_stmt and while_stmt use expr_ns for their condition; all other expression positions use the full expr.
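
One way to picture the expr / expr_ns split is a single primary-expression routine with an `allow_struct` flag that is switched back on inside parentheses. The Python sketch below shows only that decision. It is an assumption-laden illustration, not the grammar's actual structure: the function name `parse_primary` is hypothetical, tokens are pre-split strings, a parenthesised group holds just one primary, and a struct body is assumed to contain no nested braces.

```python
def parse_primary(tokens: list[str], pos: int, allow_struct: bool) -> tuple[str, int]:
    """Parse one primary expression; return (text, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        # Inside parentheses the full expr hierarchy applies again.
        inner, pos = parse_primary(tokens, pos + 1, allow_struct=True)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return f"({inner})", pos + 1
    if allow_struct and pos + 1 < len(tokens) and tokens[pos + 1] == "{":
        close = tokens.index("}", pos + 1)   # assumes no nested braces
        return " ".join(tokens[pos:close + 1]), close + 1
    return token, pos + 1                    # "{" is left for the caller


condition = "Flags { verbose : true } { }".split()
print(parse_primary(condition, 0, allow_struct=False))  # ('Flags', 1)
print(parse_primary(condition, 0, allow_struct=True))   # ('Flags { verbose : true }', 6)
```

In no-struct mode the routine stops after `Flags`, so the following `{` is read as the start of the statement block, which is the behaviour the restriction is meant to guarantee.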


Statements

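Statements perform an action and do not produce a value. let, return, break, continue, and expression statements are terminated by a semicolon ;. if, while, loop, and block statements end with the closing brace of their block and take no semicolon.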

Let Statement

Introduces a new named binding in the current scope.

let [mut] <name> [: <type>] [= <expr>] ;

Part Required Description
mut no Makes the binding mutable; omit for immutable
<name> yes The identifier being bound
: <type> no Explicit type annotation
= <expr> no Initialiser expression
; yes Statement terminator

Bindings are immutable by default. Attempting to assign to a binding declared without mut is a compile-time error.

At least one of the type annotation or the initialiser must be present so the compiler can determine the binding's type. This is a semantic constraint, not a syntactic one — the grammar permits bare let x; and the type checker rejects it if no type can be inferred from context.

Examples

// Immutable, type inferred from initialiser
let x = 42;

// Immutable, explicit type
let y: f64 = 3.14;

// Mutable, type inferred
let mut count = 0;

// Mutable, explicit type, no initialiser (must be assigned before use)
let mut buf: [u8; 128];

// Mutable binding holding a pointer to u32
let mut ptr: *u32 = &value;

// Shadowing a previous binding is allowed
let x = "hello";   // x is now a string, previous x is gone

Return Statement

Exits the enclosing function immediately, optionally producing a return value.

return [<expr>] ;

return; (no expression) is used when the function's return type is the unit type (). return <expr>; returns the value of the expression.

Because a block does not produce a value (see Block Statement), a function returns a value only through an explicit return statement. A function whose return type is () may omit return and simply run to the end of its body.

return;               // unit return
return 42;            // return an integer
return x * 2 + 1;    // return an expression

Expression Statement

Evaluates an expression for its side effects; the resulting value is discarded. A semicolon is required.

<expr> ;

do_something(x);    // call for side effects
count + 1;          // legal but silly — value discarded

Statement Grammar Summary

stmt          = let_stmt | return_stmt | if_stmt
              | while_stmt | loop_stmt | break_stmt | continue_stmt
              | block_stmt | expr_stmt ;
let_stmt      = "let" , [ "mut" ] , IDENT , [ ":" , type ] , [ "=" , expr ] , ";" ;
return_stmt   = "return" , [ expr ] , ";" ;
if_stmt       = "if" , expr_ns , block_stmt , [ "else" , else_branch ] ;
else_branch   = if_stmt | block_stmt ;
while_stmt    = "while" , expr_ns , block_stmt ;
loop_stmt     = "loop" , block_stmt ;
break_stmt    = "break" , ";" ;
continue_stmt = "continue" , ";" ;
block_stmt    = "{" , { stmt } , "}" ;
expr_stmt     = expr , ";" ;

If Statement

Conditionally executes a block based on a boolean expression.

if <cond> <block> [else <else-branch>]

The condition <cond> must be an expression of type bool. The body is always a block_stmt — braces are mandatory.

Else Branch

The optional else branch is either a plain block or another if statement, enabling else if chains of arbitrary length.

if x > 0 {
    pos();
}

if x > 0 {
    pos();
} else {
    non_pos();
}

if x > 0 {
    pos();
} else if x < 0 {
    neg();
} else {
    zero();
}

If Statement Grammar Summary

if_stmt     = "if" , expr_ns , block_stmt , [ "else" , else_branch ] ;
else_branch = if_stmt | block_stmt ;

While Loop

Repeatedly executes a block as long as a boolean condition holds. The condition is tested before each iteration; if it is false on entry, the body never runs.

while <cond> <block>

let mut i = 0;
while i < 10 {
    process(i);
    i = i + 1;
}

While Loop Grammar Summary

while_stmt = "while" , expr_ns , block_stmt ;

Loop

Executes a block unconditionally and indefinitely. The loop runs until a break or return inside the body transfers control out.

loop <block>

loop {
    let msg = recv();
    if msg.is_quit() {
        break;
    }
    handle(msg);
}

Loop Grammar Summary

loop_stmt = "loop" , block_stmt ;

Break and Continue

break and continue are only valid inside the body of a while or loop. The compiler enforces this as a semantic rule.

Statement Effect
break ; Exits the innermost enclosing loop immediately
continue ; Skips the rest of the current iteration; jumps to the next one

For while, continue jumps back to the condition check. For loop, continue jumps back to the top of the body.

let mut i = 0;
while i < 20 {
    i = i + 1;
    if i % 2 == 0 {
        continue;   // skip even numbers
    }
    if i > 15 {
        break;      // stop after 15
    }
    process(i);
}
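
The control flow of the example above can be traced with a direct Python transliteration, since Python's `while`, `break`, and `continue` behave the way this section describes. The list `processed` is a stand-in (an assumption for illustration) for the calls to `process(i)`.

```python
processed = []   # stands in for the calls to process(i)

i = 0
while i < 20:
    i = i + 1
    if i % 2 == 0:
        continue      # back to the condition check `i < 20`
    if i > 15:
        break         # leaves the loop when i reaches 17
    processed.append(i)

print(processed)  # [1, 3, 5, 7, 9, 11, 13, 15]
```

Because the increment comes before the `continue`, the loop still makes progress on even values. Placing `i = i + 1` after the `continue` would loop forever on the first even number.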

Break / Continue Grammar Summary

break_stmt    = "break" , ";" ;
continue_stmt = "continue" , ";" ;

Block Statement

A block groups zero or more statements into a single statement and introduces a new lexical scope. Blocks do not produce a value.

{ <stmt>* }

Scoping

Bindings declared inside a block are not visible outside it. A binding in an inner scope may shadow a name from an outer scope without affecting it.

let x = 1;
{
    let x = 2;   // shadows outer x inside this block only
    f(x);        // uses 2
}
// x is still 1 here
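
A common way to model these scoping rules is a stack of name tables, one per open block. The Python sketch below replays the example above. The class `Scopes` and its method names are hypothetical and say nothing about how a Flux compiler actually stores bindings.

```python
class Scopes:
    """A stack of dictionaries; the last entry is the innermost block."""

    def __init__(self) -> None:
        self.stack = [{}]

    def enter_block(self) -> None:
        self.stack.append({})

    def exit_block(self) -> None:
        self.stack.pop()

    def declare(self, name: str, value) -> None:
        self.stack[-1][name] = value        # a new `let` in the current block

    def lookup(self, name: str):
        for scope in reversed(self.stack):  # the innermost binding wins
            if name in scope:
                return scope[name]
        raise NameError(name)


scopes = Scopes()
scopes.declare("x", 1)        # let x = 1;
scopes.enter_block()          # {
scopes.declare("x", 2)        #     let x = 2;
inner = scopes.lookup("x")    #     f(x) sees 2
scopes.exit_block()           # }
outer = scopes.lookup("x")    # x is 1 again
print(inner, outer)           # 2 1
```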

Nesting

Blocks may be nested freely to any depth.

{
    let a = compute_a();
    {
        let b = compute_b();
        use(a, b);
    }
    // b is no longer in scope here
}

Block Grammar Summary

block_stmt = "{" , { stmt } , "}" ;

Top-Level Definitions

A Flux source file is a sequence of top-level definitions.

program       = { top_level_def } ;
top_level_def = func_def | struct_def ;

The leading token unambiguously selects the definition kind: fn → function, struct → struct.


Function Definition

Defines a named, callable function.

fn <name> ( [<params>] ) [-> <return-type>] <block>

Part Required Description
<name> yes The function's identifier
( [<params>] ) yes Comma-separated parameter list, may be empty
-> <return-type> no Return type; omitting it means the function returns ()
<block> yes Function body — a block_stmt

Parameters

Each parameter is a name with a mandatory type annotation. Parameters are immutable by default; mut makes the local binding mutable within the body.

[mut] <name> : <type>

fn add(a: i32, b: i32) -> i32 {
    return a + b;
}

fn greet(name: *u8) {
    print(name);
}

fn increment(mut x: i32) -> i32 {
    x = x + 1;
    return x;
}

fn apply(f: *opaque, mut buf: [u8; 64]) -> bool {
    return call(f, &buf);
}

Return Type

If -> is omitted the return type is implicitly () (the unit type). An explicit -> () is also permitted but redundant.

fn do_work() {          // returns ()
    side_effect();
}

fn get_value() -> i64 { // returns i64
    return 42;
}

Function Definition Grammar Summary

func_def   = "fn" , IDENT , "(" , param_list , ")" , [ "->" , type ] , block_stmt ;
param_list = [ param , { "," , param } ] ;
param      = [ "mut" ] , IDENT , ":" , type ;
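
The `param_list` production allows an empty list or comma-separated parameters with no trailing comma. A minimal Python sketch of that shape is shown below. The function name `parse_param_list` is hypothetical, tokens are assumed to be pre-split strings, and each type is assumed to be a single token, so types such as `[u8; 64]` are out of scope for this illustration.

```python
def parse_param_list(tokens: list[str]) -> list[tuple[str, str, bool]]:
    """Parse `( param , ... )` into (name, type, is_mut) triples."""
    pos = 0

    def take(expected=None) -> str:
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        if expected is not None and token != expected:
            raise SyntaxError(f"expected {expected!r}, got {token!r}")
        pos += 1
        return token

    def parse_param() -> tuple[str, str, bool]:
        is_mut = pos < len(tokens) and tokens[pos] == "mut"
        if is_mut:
            take("mut")
        name = take()
        if not name.isidentifier():
            raise SyntaxError(f"expected parameter name, got {name!r}")
        take(":")
        return name, take(), is_mut   # assumption: the type is one token

    take("(")
    params = []
    if pos < len(tokens) and tokens[pos] != ")":
        params.append(parse_param())
        while pos < len(tokens) and tokens[pos] == ",":
            take(",")
            params.append(parse_param())   # a param must follow each comma
    take(")")
    return params


print(parse_param_list("( a : i32 , mut b : i32 )".split()))
```

A trailing comma fails in `parse_param`, because the token after the comma is `)` rather than a parameter name.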

Struct Definition

Defines a named product type with zero or more typed fields.

struct <name> {
    <field>: <type>,
    ...
}

Fields are separated by commas. No trailing comma is permitted. An empty struct (zero fields) is valid.

Fields

Each field is a name and a type. Fields may be of any type including pointers, arrays, and other structs. Field names must be unique within the struct.

struct Point {
    x: f32,
    y: f32
}

struct Node {
    value: i64,
    next: *Node
}

struct Buffer {
    data: *u8,
    len: u64,
    cap: u64
}

struct Unit {}

Member Access

Fields of a struct value are accessed with the . operator (defined in the expression grammar). If the value is behind a pointer, dereference it first with *.

let p: Point = make_point();
let x = p.x;

let ptr: *Point = get_point_ptr();
let y = (*ptr).y;

Struct Definition Grammar Summary

struct_def = "struct" , IDENT , "{" , field_list , "}" ;
field_list = [ field , { "," , field } ] ;
field      = IDENT , ":" , type ;