Jooris Hadeler 73e36fac71 Initial Flux language specification

Add the LL(1) context-free grammar (GRAMMAR.ebnf), token and syntax
reference (SYNTAX.md), LL(1) verification tool (ll1_check.py), and a
fibonacci example demonstrating the language.

2026-03-10 14:41:54 +01:00

Flux Language Syntax Reference

Lexical Tokens

All tokens listed here are produced by the lexer (lexical analysis phase) and appear as UPPERCASE terminals in GRAMMAR.ebnf.

Literals

Token	Description	Examples
`INT_LIT`	Integer literal (decimal, hex `0x`, octal `0o`, binary `0b`)	`42`, `0xFF`, `0o77`, `0b1010`
`FLOAT_LIT`	Floating-point literal	`3.14`, `1.0e-9`, `0.5`
`STRING_LIT`	Double-quoted UTF-8 string, supports `\n \t \\ \"` escape sequences	`"hello\nworld"`
`CHAR_LIT`	Single-quoted Unicode scalar value	`'a'`, `'\n'`, `'\u{1F600}'`
`TRUE`	Boolean true literal	`true`
`FALSE`	Boolean false literal	`false`
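
As an illustration of the four `INT_LIT` forms in the table, here is a minimal Python sketch of how a lexer might turn an already-matched lexeme into its numeric value. The helper name `int_lit_value` is hypothetical and not part of the Flux specification, and the sketch assumes the prefixes are lowercase exactly as shown above.

```python
def int_lit_value(lexeme: str) -> int:
    """Return the numeric value of an INT_LIT lexeme such as 42, 0xFF, 0o77 or 0b1010."""
    # Assumption: only the lowercase prefixes listed in the table are recognised.
    bases = {"0x": 16, "0o": 8, "0b": 2}
    prefix = lexeme[:2]
    if prefix in bases:
        # int() raises ValueError if the digits are missing or invalid for the base.
        return int(lexeme[2:], bases[prefix])
    return int(lexeme, 10)
```

Python's `int()` is more permissive than a real lexer would be (it tolerates underscores and surrounding whitespace, for example), so this only shows the base selection, not the full token rule.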

Identifier

Token	Description
`IDENT`	Identifier: starts with a letter or `_`, followed by letters, digits, or `_`. Unicode letters are permitted.

Operator Tokens

Token	Lexeme	Description
`PLUS`	`+`	Addition (unary plus is not part of the grammar)
`MINUS`	`-`	Subtraction / unary negation
`STAR`	`*`	Multiplication / pointer dereference
`SLASH`	`/`	Division
`PERCENT`	`%`	Modulo (remainder)
`AMP`	`&`	Bitwise AND / address-of
`PIPE`	`\|`	Bitwise OR
`CARET`	`^`	Bitwise XOR
`BANG`	`!`	Logical NOT
`TILDE`	`~`	Bitwise NOT
`DOT`	`.`	Member access

Keyword Tokens

Operator Keywords

Lexeme	Description
`and`	Logical AND
`or`	Logical OR

Boolean Literals

Lexeme	Description
`true`	Boolean true value
`false`	Boolean false value

Primitive Type Keywords

Lexeme	Description
`u8`	Unsigned 8-bit integer
`u16`	Unsigned 16-bit integer
`u32`	Unsigned 32-bit integer
`u64`	Unsigned 64-bit integer
`i8`	Signed 8-bit integer
`i16`	Signed 16-bit integer
`i32`	Signed 32-bit integer
`i64`	Signed 64-bit integer
`f32`	32-bit IEEE 754 floating-point
`f64`	64-bit IEEE 754 floating-point
`bool`	Boolean (`true` or `false`)
`char`	Unicode scalar value (32-bit)

Pointer Keyword

Lexeme	Description
`opaque`	Used in `*opaque` to denote a pointer with no type info

Statement Keywords

Lexeme	Description
`let`	Introduces a variable binding
`mut`	Marks a binding or pointer as mutable
`return`	Exits the enclosing function
`if`	Conditional statement
`else`	Alternative branch of an `if`
`while`	Condition-controlled loop
`loop`	Infinite loop
`break`	Exit the immediately enclosing loop
`continue`	Skip to the next iteration of a loop

Definition Keywords

Lexeme	Description
`fn`	Introduces a function definition
`struct`	Introduces a struct definition

Lexer note: All keywords above are reserved and must be recognised before the general IDENT rule. An identifier may not shadow any keyword.
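
One common way to satisfy the lexer note is to scan a whole word first and then look it up in a keyword set, so that `while` is a keyword but `while_` is an identifier. The Python sketch below assumes that approach; `classify_word` and the `"KEYWORD"` tag are hypothetical names chosen for illustration, and `\w` only approximates the "letters, digits, or `_`" rule.

```python
import re

KEYWORDS = {
    "and", "or", "true", "false",
    "u8", "u16", "u32", "u64", "i8", "i16", "i32", "i64",
    "f32", "f64", "bool", "char",
    "opaque", "let", "mut", "return", "if", "else",
    "while", "loop", "break", "continue", "fn", "struct",
}

# For str patterns, \w matches Unicode letters, digits and "_";
# the negative lookahead rejects a leading digit.
WORD = re.compile(r"(?!\d)\w+")

def classify_word(text: str) -> tuple[str, str]:
    """Classify a complete word as a keyword or an IDENT token."""
    if WORD.fullmatch(text) is None:
        raise ValueError(f"not a word: {text!r}")
    # The keyword check runs before falling back to IDENT.
    if text in KEYWORDS:
        return ("KEYWORD", text)
    return ("IDENT", text)
```

Because the lookup happens on the complete word, a keyword that is merely a prefix of a longer name never matches by accident.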

Delimiter / Punctuation Tokens

Token	Lexeme	Description
`LPAREN`	`(`	Left parenthesis
`RPAREN`	`)`	Right parenthesis
`LBRACKET`	`[`	Left square bracket
`RBRACKET`	`]`	Right square bracket
`COMMA`	`,`	Argument / element separator
`SEMICOLON`	`;`	Statement terminator / array size separator (`[T; N]`)
`LCURLY`	`{`	Block / compound expression open
`RCURLY`	`}`	Block / compound expression close
`ARROW`	`->`	Function return type separator
`COLON`	`:`	Type annotation separator

Expressions

Expressions produce a value. The grammar defines them through a hierarchy of precedence levels: a lower level number means lower precedence (binds less tightly), so level 1 binds least tightly and level 10 most tightly.

Operator Precedence Table

Level	Operators	Associativity	Description
1	`or`	left	Logical OR (lowest)
2	`and`	left	Logical AND
3	`\|`	left	Bitwise OR
4	`^`	left	Bitwise XOR
5	`&`	left	Bitwise AND
6	`+` `-`	left	Addition, subtraction
7	`*` `/` `%`	left	Multiplication, division, modulo
8	`!` `~` `-` `*` `&`	right (unary)	Prefix unary operators
9	`.` `[…]` `(…)`	left (postfix)	Member access, index, call
10	literals, identifiers, `()`	—	Primary expressions (highest)

Operator Descriptions

Binary Operators

Operator	Name	Example	Notes
`or`	Logical OR	`a or b`	Short-circuits; both operands must be `bool`
`and`	Logical AND	`a and b`	Short-circuits; both operands must be `bool`
`\|`	Bitwise OR	`a \| b`	Integer types
`^`	Bitwise XOR	`a ^ b`	Integer types
`&`	Bitwise AND	`a & b`	Integer types (binary context)
`+`	Addition	`a + b`
`-`	Subtraction	`a - b`
`*`	Multiplication	`a * b`	Binary context (both operands are values)
`/`	Division	`a / b`	Integer division truncates toward zero
`%`	Modulo	`a % b`	Sign follows the dividend
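
The notes on `/` and `%` matter for negative operands. The Python sketch below models the stated rules (division truncates toward zero, remainder takes the sign of the dividend) so they can be compared with Python's own `//` and `%`, which round toward negative infinity. The names `flux_div` and `flux_rem` are hypothetical.

```python
def flux_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero."""
    q = abs(a) // abs(b)  # raises ZeroDivisionError when b == 0
    return q if (a < 0) == (b < 0) else -q

def flux_rem(a: int, b: int) -> int:
    """Remainder whose sign follows the dividend a."""
    return a - b * flux_div(a, b)
```

Under these rules `-7 / 2` is `-3` and `-7 % 2` is `-1`, whereas Python's `-7 // 2` is `-4` and `-7 % 2` is `1`. The identity `a == b * (a / b) + (a % b)` holds in both conventions.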

Unary Prefix Operators

Operator	Name	Example	Notes
`!`	Logical NOT	`!cond`	Operand must be `bool`
`~`	Bitwise NOT	`~mask`	Bitwise complement; integer types
`-`	Negation	`-x`	Arithmetic negation
`*`	Dereference	`*ptr`	Unary context; operand must be a pointer type
`&`	Address-of	`&x`	Unary context; produces a pointer to the operand

Postfix Operators

Operator	Name	Example	Notes
`.`	Member access	`obj.field`	Accesses a named field or method of a struct/type
`[…]`	Subscript	`arr[i]`	Indexes into an array, slice, or map
`(…)`	Call	`f(a, b)`	Invokes a function or closure

Disambiguation: `*` and `&` are context-sensitive. When one appears as the first token of a `unary_expr`, it is unary (dereference / address-of). When it appears between two `unary_expr` sub-trees inside `multiplicative_expr` or `bitand_expr`, it is binary (multiplication / bitwise AND). The parser resolves this purely from grammatical position; no look-ahead beyond one token is required.

Parenthesised Expressions

Any expression may be wrapped in parentheses to override default precedence:

(a + b) * c

Function Call Argument List

Arguments are comma-separated expressions. A trailing comma is not permitted at this grammar level.

f()
f(x)
f(x, y, z)

Examples

// Arithmetic
a + b * c - d % 2

// Bitwise
flags & MASK | extra ^ toggle

// Logical
ready and not_done or fallback

// Mixed unary / postfix
*ptr.field
&arr[i]
!cond

// Chained postfix
obj.method(arg1, arg2)[0].name

// Explicit precedence override
(a or b) and c
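
To see how the precedence table groups the examples above, here is a small recursive-descent sketch in Python that prints an expression fully parenthesised. It is not the Flux parser: it covers only the operators in the precedence table, treats every word or number as an opaque primary, and all names (`Parser`, `group`, `BINARY_LEVELS`) are hypothetical.

```python
import re

# Binary operators by precedence level, lowest first (levels 1 to 7 of the table).
BINARY_LEVELS = [["or"], ["and"], ["|"], ["^"], ["&"], ["+", "-"], ["*", "/", "%"]]
UNARY_OPS = {"!", "~", "-", "*", "&"}
TOKEN = re.compile(r"\s*(\w+|[-+*/%&|^!~.()\[\],])")

def tokenize(src: str) -> list[str]:
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, src: str):
        self.toks = tokenize(src) + ["<eof>"]
        self.i = 0

    def peek(self) -> str:
        return self.toks[self.i]

    def take(self) -> str:
        tok = self.toks[self.i]
        self.i += 1
        return tok

    def expect(self, tok: str) -> None:
        if self.take() != tok:
            raise SyntaxError(f"expected {tok!r}")

    def binary(self, level: int = 0) -> str:
        if level == len(BINARY_LEVELS):
            return self.unary()
        left = self.binary(level + 1)
        # Looping instead of recursing on the right gives left associativity.
        while self.peek() in BINARY_LEVELS[level]:
            op = self.take()
            right = self.binary(level + 1)
            left = f"({left} {op} {right})"
        return left

    def unary(self) -> str:
        # An operator seen at the start of an operand is a prefix operator.
        if self.peek() in UNARY_OPS:
            op = self.take()
            return f"({op}{self.unary()})"
        return self.postfix()

    def postfix(self) -> str:
        node = self.primary()
        while self.peek() in (".", "[", "("):
            tok = self.take()
            if tok == ".":
                node = f"{node}.{self.take()}"
            elif tok == "[":
                node = f"{node}[{self.binary()}]"
                self.expect("]")
            else:
                args = []
                if self.peek() != ")":
                    args.append(self.binary())
                    while self.peek() == ",":
                        self.take()
                        args.append(self.binary())
                self.expect(")")
                node = f"{node}({', '.join(args)})"
        return node

    def primary(self) -> str:
        tok = self.take()
        if tok == "(":
            inner = self.binary()
            self.expect(")")
            return inner
        if re.fullmatch(r"\w+", tok) and tok not in ("or", "and"):
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

def group(src: str) -> str:
    """Return src with every operator application wrapped in parentheses."""
    parser = Parser(src)
    result = parser.binary()
    parser.expect("<eof>")
    return result
```

With this sketch, `group("a + b * c - d % 2")` returns `((a + (b * c)) - (d % 2))`, `group("flags & MASK | extra ^ toggle")` returns `((flags & MASK) | (extra ^ toggle))`, and `group("*ptr.field")` returns `(*ptr.field)`, showing that the postfix `.` applies before the unary `*`. In `group("a * *p")` the first `*` is binary and the second is unary, decided only by where each one appears.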

Types

Types describe the shape and interpretation of values. All type positions in the grammar reference the type non-terminal.

Primitive Types

Primitive types are single-keyword types built into the language.

Type	Kind	Width	Range / Notes
`u8`	Unsigned integer	8-bit	0 … 255
`u16`	Unsigned integer	16-bit	0 … 65 535
`u32`	Unsigned integer	32-bit	0 … 4 294 967 295
`u64`	Unsigned integer	64-bit	0 … 2⁶⁴ − 1
`i8`	Signed integer	8-bit	−128 … 127
`i16`	Signed integer	16-bit	−32 768 … 32 767
`i32`	Signed integer	32-bit	−2 147 483 648 … 2 147 483 647
`i64`	Signed integer	64-bit	−2⁶³ … 2⁶³ − 1
`f32`	Floating-point	32-bit	IEEE 754 single precision
`f64`	Floating-point	64-bit	IEEE 754 double precision
`bool`	Boolean	1 byte	`true` or `false`
`char`	Unicode scalar	32-bit	Any Unicode scalar value (not a surrogate)
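
The integer ranges in the table follow directly from the bit width and signedness. A short Python sketch, assuming two's-complement representation for the signed types (which the listed ranges imply); `INT_TYPES`, `int_range` and `fits` are hypothetical names.

```python
# (bit width, signed?) for each integer primitive listed in the table.
INT_TYPES = {
    "u8": (8, False), "u16": (16, False), "u32": (32, False), "u64": (64, False),
    "i8": (8, True), "i16": (16, True), "i32": (32, True), "i64": (64, True),
}

def int_range(type_name: str) -> tuple[int, int]:
    """Return the inclusive (min, max) range of an integer primitive type."""
    bits, signed = INT_TYPES[type_name]
    if signed:
        return (-(1 << (bits - 1)), (1 << (bits - 1)) - 1)
    return (0, (1 << bits) - 1)

def fits(value: int, type_name: str) -> bool:
    """Check whether value is representable in the given integer type."""
    lo, hi = int_range(type_name)
    return lo <= value <= hi
```

For example, `int_range("i16")` gives `(-32768, 32767)`, matching the table row for `i16`.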

Named Types

A named type is any user-defined type referenced by its identifier — typically a struct name. Because all primitive-type keywords (u8, bool, etc.) are reserved, an IDENT in type position is always a named type, never a primitive.

Point        // struct Point { x: f32, y: f32 }
Node         // struct Node { value: i64, next: *Node }
*Point       // pointer to a named type
[Node; 8]    // array of a named type

Pointer Types

A pointer type is written with a leading *.

Syntax	Description
`*T`	Typed pointer — points to a value of type `T`
`*opaque`	Opaque pointer — no compile-time pointee type information; equivalent to C's `void *`

Pointer types may be nested: **u8 is a pointer to a pointer to u8.

*u8          // pointer to u8
**i32        // pointer to pointer to i32
*opaque      // untyped pointer
**opaque     // pointer to untyped pointer

Array Types

Arrays have a fixed size known at compile time.

[ <element-type> ; <size> ]

<size> must be a non-negative integer literal (INT_LIT). The element type may itself be any type, including pointers or nested arrays.

[u8; 256]          // array of 256 u8 values
[*u8; 4]           // array of 4 pointers to u8
[[f32; 3]; 3]      // 3×3 matrix of f32 (array of arrays)
[*opaque; 8]       // array of 8 opaque pointers

Type Grammar Summary

type           = primitive_type | named_type | pointer_type | array_type ;
primitive_type = "u8" | "u16" | "u32" | "u64"
               | "i8" | "i16" | "i32" | "i64"
               | "f32" | "f64" | "bool" | "char" ;
named_type     = IDENT ;
pointer_type   = "*" , ( "opaque" | type ) ;
array_type     = "[" , type , ";" , INT_LIT , "]" ;
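
The type grammar is small enough that a direct recursive-descent reading fits in a few lines. The Python sketch below follows the four productions above and returns a nested tuple. It is an illustration under assumptions: the tuple shapes and the name `parse_type` are hypothetical, array sizes are restricted to decimal literals, and keywords other than the primitives and `opaque` are not rejected as type names.

```python
import re

PRIMITIVES = {"u8", "u16", "u32", "u64", "i8", "i16", "i32", "i64",
              "f32", "f64", "bool", "char"}
TYPE_TOKEN = re.compile(r"\s*(\w+|[*\[\];])")

def parse_type(src: str):
    """Parse a type and return a nested tuple describing it."""
    tokens = TYPE_TOKEN.findall(src)
    # findall silently skips unknown characters, so compare against the input.
    if "".join(tokens) != "".join(src.split()):
        raise SyntaxError("unexpected character in type")
    pos = 0

    def take() -> str:
        nonlocal pos
        if pos == len(tokens):
            raise SyntaxError("unexpected end of type")
        pos += 1
        return tokens[pos - 1]

    def parse():
        tok = take()
        if tok == "*":
            # pointer_type = "*" , ( "opaque" | type )
            if pos < len(tokens) and tokens[pos] == "opaque":
                take()
                return ("ptr", "opaque")
            return ("ptr", parse())
        if tok == "[":
            # array_type = "[" , type , ";" , INT_LIT , "]"
            elem = parse()
            if take() != ";":
                raise SyntaxError("expected ';'")
            size = take()
            if not size.isdecimal():  # sketch: decimal sizes only
                raise SyntaxError("expected array size")
            if take() != "]":
                raise SyntaxError("expected ']'")
            return ("array", elem, int(size))
        if tok in PRIMITIVES:
            return ("prim", tok)
        if re.fullmatch(r"(?!\d)\w+", tok) and tok != "opaque":
            return ("named", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    result = parse()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after type")
    return result
```

For instance, `parse_type("[[f32; 3]; 3]")` returns `("array", ("array", ("prim", "f32"), 3), 3)`. Each alternative is selected by its first token (`*`, `[`, a primitive keyword, or an identifier), which is what makes the production set suitable for LL(1) parsing.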

Struct Literals

A struct literal constructs a value of a named struct type by providing values for each field.

<TypeName> { <field>: <expr>, ... }

Fields may appear in any order and need not match the declaration order. No trailing comma is permitted.

Examples

let p = Point { x: 1.0, y: 2.0 };

let n = Node {
    value: 42,
    next: get_next()
};

// Nested struct literal
let outer = Rect {
    origin: Point { x: 0.0, y: 0.0 },
    size: Point { x: 10.0, y: 5.0 }
};

// Empty struct
let u = Unit {};

Struct Literals in Conditions

Struct literals are not permitted as the outermost expression in `if` and `while` conditions. This restriction exists because a `{` after the condition is ambiguous: it could start a struct literal body or the statement block.

// ERROR — ambiguous: is `{` a struct body or the if block?
if Flags { verbose: true } { ... }

// OK — parentheses resolve the ambiguity
if (Flags { verbose: true }).verbose { ... }

The grammar enforces this through the expr_ns (no-struct) hierarchy used in condition positions. Struct literals remain valid everywhere else: let, return, function arguments, field values, etc.

Struct Literal Grammar Summary

primary_expr      = IDENT , [ struct_lit_body ] | INT_LIT | FLOAT_LIT
                  | STRING_LIT | CHAR_LIT | "true" | "false"
                  | "(" , expr , ")" ;
struct_lit_body   = "{" , struct_field_list , "}" ;
struct_field_list = [ struct_field , { "," , struct_field } ] ;
struct_field      = IDENT , ":" , expr ;

No-Struct Expression (`expr_ns`)

expr_ns is a parallel expression hierarchy identical to expr except its primary level (primary_expr_ns) does not allow the struct_lit_body suffix after an IDENT. Struct literals are still permitted when enclosed in parentheses ("(" , expr , ")"), because the ( unambiguously marks the start of a grouped expression.

if_stmt and while_stmt use expr_ns for their condition; all other expression positions use the full expr.
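
In a hand-written parser, the parallel `expr_ns` hierarchy is often realised as a single set of functions with an "allow struct literal" flag rather than duplicated code. The Python sketch below assumes that approach and handles only primaries (atoms, parenthesised primaries, and struct literals whose field values are primaries). The names `parse_primary` and `parse_condition`, and the tuple shapes, are hypothetical.

```python
PUNCT = {"(", ")", "{", "}", ":", ",", "<eof>"}

def parse_primary(toks: list[str], pos: int, allow_struct: bool):
    """Parse one primary expression; return (node, index of the next token)."""
    tok = toks[pos]
    if tok == "(":
        # Parentheses switch back to the full hierarchy, so struct literals are allowed.
        node, pos = parse_primary(toks, pos + 1, True)
        if toks[pos] != ")":
            raise SyntaxError("expected ')'")
        return node, pos + 1
    if tok in PUNCT:
        raise SyntaxError(f"unexpected token {tok!r}")
    pos += 1
    if not (allow_struct and tok.isidentifier() and toks[pos] == "{"):
        return ("atom", tok), pos
    # struct_lit_body = "{" , struct_field_list , "}"
    fields = {}
    pos += 1
    while toks[pos] != "}":
        name = toks[pos]
        if name in PUNCT or toks[pos + 1] != ":":
            raise SyntaxError("expected '<field>: <expr>'")
        fields[name], pos = parse_primary(toks, pos + 2, True)
        if toks[pos] == ",":
            pos += 1
            if toks[pos] == "}":
                raise SyntaxError("trailing comma is not permitted")
        elif toks[pos] != "}":
            raise SyntaxError("expected ',' or '}'")
    return ("struct", tok, fields), pos + 1

def parse_condition(toks: list[str]):
    """Parse an if/while condition; return (node, index of the block's '{')."""
    toks = toks + ["<eof>"]
    node, pos = parse_primary(toks, 0, allow_struct=False)
    if toks[pos] != "{":
        raise SyntaxError("expected block after condition")
    return node, pos
```

Given the tokens of `Flags { verbose: true } { }`, `parse_condition` stops at the first `{` and reports `Flags` as the whole condition, so the would-be struct body is read as the statement block. Given the tokens of `(Flags { verbose: true }) { }`, the struct literal is parsed inside the parentheses and the block starts at the second `{`.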

Statements

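Statements perform an action and do not produce a value. Simple statements (`let`, `return`, `break`, `continue`, and expression statements) are terminated by a semicolon `;`. Compound statements (`if`, `while`, `loop`, and blocks) end with their closing brace and take no semicolon.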

Let Statement

Introduces a new named binding in the current scope.

let [mut] <name> [: <type>] [= <expr>] ;

Part	Required	Description
`mut`	no	Makes the binding mutable; omit for immutable
`<name>`	yes	The identifier being bound
`: <type>`	no	Explicit type annotation
`= <expr>`	no	Initialiser expression
`;`	yes	Statement terminator

Bindings are immutable by default. Attempting to assign to a binding declared without mut is a compile-time error.

At least one of the type annotation or the initialiser must be present so the compiler can determine the binding's type. This is a semantic constraint, not a syntactic one — the grammar permits bare let x; and the type checker rejects it if no type can be inferred from context.

Examples

// Immutable, type inferred from initialiser
let x = 42;

// Immutable, explicit type
let y: f64 = 3.14;

// Mutable, type inferred
let mut count = 0;

// Mutable, explicit type, no initialiser (must be assigned before use)
let mut buf: [u8; 128];

// Mutable pointer to u32
let mut ptr: *u32 = &value;

// Shadowing a previous binding is allowed
let x = "hello";   // x is now a string, previous x is gone

Return Statement

Exits the enclosing function immediately, optionally producing a return value.

return [<expr>] ;

`return;` (no expression) is used when the function's return type is the unit type `()`. `return <expr>;` returns the value of the expression.

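Because blocks do not produce a value (see Block Statement), a function that returns a value does so with an explicit `return`, as in the function examples throughout this reference. A function returning `()` may simply run to the end of its body.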

return;               // unit return
return 42;            // return an integer
return x * 2 + 1;    // return an expression

Expression Statement

Evaluates an expression for its side effects; the resulting value is discarded. A semicolon is required.

<expr> ;

do_something(x);    // call for side effects
count + 1;          // legal but silly — value discarded

Statement Grammar Summary

stmt          = let_stmt | return_stmt | if_stmt
              | while_stmt | loop_stmt | break_stmt | continue_stmt
              | block_stmt | expr_stmt ;
let_stmt      = "let" , [ "mut" ] , IDENT , [ ":" , type ] , [ "=" , expr ] , ";" ;
return_stmt   = "return" , [ expr ] , ";" ;
if_stmt       = "if" , expr_ns , block_stmt , [ "else" , else_branch ] ;
else_branch   = if_stmt | block_stmt ;
while_stmt    = "while" , expr_ns , block_stmt ;
loop_stmt     = "loop" , block_stmt ;
break_stmt    = "break" , ";" ;
continue_stmt = "continue" , ";" ;
block_stmt    = "{" , { stmt } , "}" ;
expr_stmt     = expr , ";" ;

If Statement

Conditionally executes a block based on a boolean expression.

if <cond> <block> [else <else-branch>]

The condition <cond> must be an expression of type bool. The body is always a block_stmt — braces are mandatory.

Else Branch

The optional else branch is either a plain block or another if statement, enabling else if chains of arbitrary length.

if x > 0 {
    pos();
}

if x > 0 {
    pos();
} else {
    non_pos();
}

if x > 0 {
    pos();
} else if x < 0 {
    neg();
} else {
    zero();
}

If Statement Grammar Summary

if_stmt     = "if" , expr_ns , block_stmt , [ "else" , else_branch ] ;
else_branch = if_stmt | block_stmt ;

While Loop

Repeatedly executes a block as long as a boolean condition holds. The condition is tested before each iteration; if it is false on entry, the body never runs.

while <cond> <block>

let mut i = 0;
while i < 10 {
    process(i);
    i = i + 1;
}

While Loop Grammar Summary

while_stmt = "while" , expr_ns , block_stmt ;

Loop

Executes a block unconditionally and indefinitely. The loop runs until a break or return inside the body transfers control out.

loop <block>

loop {
    let msg = recv();
    if msg.is_quit() {
        break;
    }
    handle(msg);
}

Loop Grammar Summary

loop_stmt = "loop" , block_stmt ;

Break and Continue

`break` and `continue` are only valid inside the body of a `while` or `loop`. The compiler enforces this as a semantic rule.

Statement	Effect
`break ;`	Exits the innermost enclosing loop immediately
`continue ;`	Skips the rest of the current iteration; jumps to the next one

For `while`, `continue` jumps back to the condition check. For `loop`, `continue` jumps back to the top of the body.

let mut i = 0;
while i < 20 {
    i = i + 1;
    if i % 2 == 0 {
        continue;   // skip even numbers
    }
    if i > 15 {
        break;      // stop after 15
    }
    process(i);
}
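
Python's `while`, `continue` and `break` have the same control flow as described here, so the example above can be traced by transliterating it. In this sketch `process(i)` is replaced by appending to a list, and the function name `trace` is hypothetical.

```python
def trace() -> list[int]:
    """Collect the values that the Flux example would pass to process()."""
    processed = []
    i = 0
    while i < 20:
        i = i + 1
        if i % 2 == 0:
            continue  # jumps back to the `i < 20` check
        if i > 15:
            break     # leaves the loop entirely
        processed.append(i)  # stands in for process(i)
    return processed
```

Calling `trace()` returns `[1, 3, 5, 7, 9, 11, 13, 15]`: even values are skipped by `continue`, and the loop ends at `i == 17`, the first odd value above 15.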

Break / Continue Grammar Summary

break_stmt    = "break" , ";" ;
continue_stmt = "continue" , ";" ;

Block Statement

A block groups zero or more statements into a single statement and introduces a new lexical scope. Blocks do not produce a value.

{ <stmt>* }

Scoping

Bindings declared inside a block are not visible outside it. A binding in an inner scope may shadow a name from an outer scope without affecting it.

let x = 1;
{
    let x = 2;   // shadows outer x inside this block only
    f(x);        // uses 2
}
// x is still 1 here
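
A compiler typically models these scoping rules with a stack of symbol tables: entering a block pushes a table, leaving it pops the table, and lookup searches from the innermost table outward. A minimal Python sketch of that idea follows; the class `Scopes` and its method names are hypothetical.

```python
class Scopes:
    """A stack of name-to-value tables, innermost scope last."""

    def __init__(self):
        self.stack = [{}]

    def push(self) -> None:
        # Entering a block opens a new, empty scope.
        self.stack.append({})

    def pop(self) -> None:
        # Leaving a block discards its bindings.
        self.stack.pop()

    def declare(self, name: str, value) -> None:
        # A new binding goes into the innermost scope and may shadow outer ones.
        self.stack[-1][name] = value

    def lookup(self, name: str):
        for scope in reversed(self.stack):
            if name in scope:
                return scope[name]
        raise NameError(name)
```

Replaying the example above: after `declare("x", 1)`, `push()`, `declare("x", 2)`, a `lookup("x")` gives `2`. After `pop()`, `lookup("x")` gives `1` again, because the inner binding never touched the outer table.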

Nesting

Blocks may be nested freely to any depth.

{
    let a = compute_a();
    {
        let b = compute_b();
        use(a, b);
    }
    // b is no longer in scope here
}

Block Grammar Summary

block_stmt = "{" , { stmt } , "}" ;

Top-Level Definitions

A Flux source file is a sequence of top-level definitions.

program       = { top_level_def } ;
top_level_def = func_def | struct_def ;

The leading token unambiguously selects the definition kind: fn → function, struct → struct.

Function Definition

Defines a named, callable function.

fn <name> ( [<params>] ) [-> <return-type>] <block>

Part	Required	Description
`<name>`	yes	The function's identifier
`( [<params>] )`	yes	Comma-separated parameter list, may be empty
`-> <return-type>`	no	Return type; omitting it means the function returns `()`
`<block>`	yes	Function body — a `block_stmt`

Parameters

Each parameter is a name with a mandatory type annotation. Parameters are immutable by default; mut makes the local binding mutable within the body.

[mut] <name> : <type>

fn add(a: i32, b: i32) -> i32 {
    return a + b;
}

fn greet(name: *u8) {
    print(name);
}

fn increment(mut x: i32) -> i32 {
    x = x + 1;
    return x;
}

fn apply(f: *opaque, mut buf: [u8; 64]) -> bool {
    return call(f, &buf);
}

Return Type

If -> is omitted the return type is implicitly () (the unit type). An explicit -> () is also permitted but redundant.

fn do_work() {          // returns ()
    side_effect();
}

fn get_value() -> i64 { // returns i64
    return 42;
}

Function Definition Grammar Summary

func_def   = "fn" , IDENT , "(" , param_list , ")" , [ "->" , type ] , block_stmt ;
param_list = [ param , { "," , param } ] ;
param      = [ "mut" ] , IDENT , ":" , type ;

Struct Definition

Defines a named product type with zero or more typed fields.

struct <name> {
    <field>: <type>,
    ...
}

Fields are separated by commas. No trailing comma is permitted. An empty struct (zero fields) is valid.

Fields

Each field is a name and a type. Fields may be of any type including pointers, arrays, and other structs. Field names must be unique within the struct.

struct Point {
    x: f32,
    y: f32
}

struct Node {
    value: i64,
    next: *Node
}

struct Buffer {
    data: *u8,
    len: u64,
    cap: u64
}

struct Unit {}
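
The rule that field names must be unique is not expressed by the grammar, so it would be checked after parsing. A short Python sketch of such a check, assuming the parser hands over the field names in declaration order; `duplicate_fields` is a hypothetical helper name.

```python
def duplicate_fields(field_names: list[str]) -> list[str]:
    """Return each field name that appears more than once, in first-repeat order."""
    seen: set[str] = set()
    duplicates: list[str] = []
    for name in field_names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates
```

For `Buffer` above, `duplicate_fields(["data", "len", "cap"])` returns an empty list. A declaration such as `struct Bad { x: f32, x: f32 }` would yield `["x"]` and could then be reported as an error.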

Member Access

Fields of a struct value are accessed with the . operator (defined in the expression grammar). If the value is behind a pointer, dereference it first with *.

let p: Point = make_point();
let x = p.x;

let ptr: *Point = get_point_ptr();
let y = (*ptr).y;

Struct Definition Grammar Summary

struct_def = "struct" , IDENT , "{" , field_list , "}" ;
field_list = [ field , { "," , field } ] ;
field      = IDENT , ":" , type ;
