Add the LL(1) context-free grammar (GRAMMAR.ebnf), token and syntax reference (SYNTAX.md), LL(1) verification tool (ll1_check.py), and a fibonacci example demonstrating the language.
Flux Language Syntax Reference
Lexical Tokens
All tokens listed here are produced by the lexer (lexical analysis phase) and
appear as UPPERCASE terminals in GRAMMAR.ebnf.
Literals
| Token | Description | Examples |
|---|---|---|
| INT_LIT | Integer literal (decimal, hex 0x, octal 0o, binary 0b) | 42, 0xFF, 0o77, 0b1010 |
| FLOAT_LIT | Floating-point literal | 3.14, 1.0e-9, 0.5 |
| STRING_LIT | Double-quoted UTF-8 string, supports \n \t \\ \" escape sequences | "hello\nworld" |
| CHAR_LIT | Single-quoted Unicode scalar value | 'a', '\n', '\u{1F600}' |
| TRUE | Boolean true literal | true |
| FALSE | Boolean false literal | false |
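The four INT_LIT radix forms can be recognised with a single regular expression. The following is a minimal Python sketch, not the Flux lexer itself: the pattern, the names `INT_LIT_RE` and `parse_int_lit`, and the decision to accept upper- and lower-case hex digits are assumptions made for illustration.

```python
import re

# Hypothetical pattern: the prefixed forms are listed before plain decimal.
INT_LIT_RE = re.compile(r"0x[0-9A-Fa-f]+|0o[0-7]+|0b[01]+|[0-9]+")

def parse_int_lit(lexeme: str) -> int:
    """Return the value of an INT_LIT lexeme, or raise ValueError."""
    if not INT_LIT_RE.fullmatch(lexeme):
        raise ValueError(f"not an integer literal: {lexeme!r}")
    prefixes = {"0x": 16, "0o": 8, "0b": 2}
    base = prefixes.get(lexeme[:2], 10)
    # Strip the two-character prefix for non-decimal literals.
    digits = lexeme[2:] if base != 10 else lexeme
    return int(digits, base)
```

Applied to the examples in the table, `parse_int_lit` yields 42, 255, 63, and 10 respectively.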
Identifier
| Token | Description |
|---|---|
| IDENT | Identifier: starts with a letter or _, followed by letters, digits, or _. Unicode letters are permitted. |
Operator Tokens
| Token | Lexeme | Description |
|---|---|---|
| PLUS | + | Addition / unary plus (not in grammar) |
| MINUS | - | Subtraction / unary negation |
| STAR | * | Multiplication / pointer dereference |
| SLASH | / | Division |
| PERCENT | % | Modulo (remainder) |
| AMP | & | Bitwise AND / address-of |
| PIPE | \| | Bitwise OR |
| CARET | ^ | Bitwise XOR |
| BANG | ! | Logical NOT |
| TILDE | ~ | Bitwise NOT |
| DOT | . | Member access |
Keyword Tokens
Operator Keywords
| Lexeme | Description |
|---|---|
| and | Logical AND |
| or | Logical OR |
Boolean Literals
| Lexeme | Description |
|---|---|
| true | Boolean true value |
| false | Boolean false value |
Primitive Type Keywords
| Lexeme | Description |
|---|---|
| u8 | Unsigned 8-bit integer |
| u16 | Unsigned 16-bit integer |
| u32 | Unsigned 32-bit integer |
| u64 | Unsigned 64-bit integer |
| i8 | Signed 8-bit integer |
| i16 | Signed 16-bit integer |
| i32 | Signed 32-bit integer |
| i64 | Signed 64-bit integer |
| f32 | 32-bit IEEE 754 floating-point |
| f64 | 64-bit IEEE 754 floating-point |
| bool | Boolean (true or false) |
| char | Unicode scalar value (32-bit) |
Pointer Keyword
| Lexeme | Description |
|---|---|
| opaque | Used in *opaque to denote a pointer with no type info |
Statement Keywords
| Lexeme | Description |
|---|---|
| let | Introduces a variable binding |
| mut | Marks a let binding or function parameter as mutable |
| return | Exits the enclosing function |
| if | Conditional statement |
| else | Alternative branch of an if |
| while | Condition-controlled loop |
| loop | Infinite loop |
| break | Exit the immediately enclosing loop |
| continue | Skip to the next iteration of a loop |
Definition Keywords
| Lexeme | Description |
|---|---|
| fn | Introduces a function definition |
| struct | Introduces a struct definition |
Lexer note: All keywords above are reserved and must be recognised before the general
IDENT rule. An identifier may not shadow any keyword.
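One way to honour the "keywords before IDENT" rule is to scan a whole word first and then look it up in a reserved-word set. The Python sketch below illustrates that idea under stated assumptions: `KEYWORDS` is collected from the keyword tables above, `classify_word` is a hypothetical helper, and `str.isalpha` / `str.isdecimal` only approximate what the lexer counts as a letter or digit.

```python
# Reserved words collected from the keyword tables in this section.
KEYWORDS = frozenset({
    "and", "or", "true", "false",
    "u8", "u16", "u32", "u64", "i8", "i16", "i32", "i64",
    "f32", "f64", "bool", "char", "opaque",
    "let", "mut", "return", "if", "else", "while", "loop",
    "break", "continue", "fn", "struct",
})

def classify_word(word: str) -> str:
    """Classify a scanned word as "KEYWORD" or "IDENT"; raise ValueError otherwise."""
    if word in KEYWORDS:
        # Keywords take priority over the general IDENT rule.
        return "KEYWORD"
    starts_ok = word[:1] == "_" or word[:1].isalpha()
    rest_ok = all(c == "_" or c.isalpha() or c.isdecimal() for c in word[1:])
    if starts_ok and rest_ok:
        return "IDENT"
    raise ValueError(f"not an identifier: {word!r}")
```

Because the lookup happens on the complete word, `whilex` is an IDENT even though it begins with the keyword `while`.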
Delimiter / Punctuation Tokens
| Token | Lexeme | Description |
|---|---|---|
| LPAREN | ( | Left parenthesis |
| RPAREN | ) | Right parenthesis |
| LBRACKET | [ | Left square bracket |
| RBRACKET | ] | Right square bracket |
| COMMA | , | Argument / element separator |
| SEMICOLON | ; | Statement terminator / array size separator ([T; N]) |
| LCURLY | { | Block / compound expression open |
| RCURLY | } | Block / compound expression close |
| ARROW | -> | Function return type separator |
| COLON | : | Type annotation separator |
Expressions
Expressions produce a value. The grammar defines them through a hierarchy of precedence levels: level 1, at the top of the table, has the lowest precedence (binds least tightly), and level 10 has the highest.
Operator Precedence Table
| Level | Operators | Associativity | Description |
|---|---|---|---|
| 1 | or | left | Logical OR (lowest) |
| 2 | and | left | Logical AND |
| 3 | \| | left | Bitwise OR |
| 4 | ^ | left | Bitwise XOR |
| 5 | & | left | Bitwise AND |
| 6 | + - | left | Addition, subtraction |
| 7 | * / % | left | Multiplication, division, modulo |
| 8 | ! ~ - * & | right (unary) | Prefix unary operators |
| 9 | . […] (…) | left (postfix) | Member access, index, call |
| 10 | literals, identifiers, () | n/a | Primary expressions (highest) |
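To see how levels 1 to 7 group a flat expression, the Python sketch below fully parenthesises a token list. It uses precedence climbing, which is a different technique from the one-non-terminal-per-level layout of GRAMMAR.ebnf but produces the same grouping for left-associative binary operators. `BINARY_LEVEL` and `parenthesise` are hypothetical names, and each operand is assumed to be a single token.

```python
# Binary levels 1-7 from the table; a larger number binds more tightly.
BINARY_LEVEL = {
    "or": 1, "and": 2, "|": 3, "^": 4, "&": 5,
    "+": 6, "-": 6, "*": 7, "/": 7, "%": 7,
}

def parenthesise(tokens: list[str]) -> str:
    """Fully parenthesise a flat list of operands and binary operators."""
    pos = 0

    def parse(min_level: int) -> str:
        nonlocal pos
        left = tokens[pos]  # operand: identifier or literal
        pos += 1
        while pos < len(tokens) and BINARY_LEVEL.get(tokens[pos], 0) >= min_level:
            op = tokens[pos]
            pos += 1
            # Left associativity: the right operand may only absorb
            # operators that bind strictly tighter than op.
            right = parse(BINARY_LEVEL[op] + 1)
            left = f"({left} {op} {right})"
        return left

    return parse(1)
```

For example, `parenthesise("a + b * c - d % 2".split())` returns `((a + (b * c)) - (d % 2))`.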
Operator Descriptions
Binary Operators
| Operator | Name | Example | Notes |
|---|---|---|---|
| or | Logical OR | a or b | Short-circuits; both operands must be bool |
| and | Logical AND | a and b | Short-circuits; both operands must be bool |
| \| | Bitwise OR | a \| b | Integer types |
| ^ | Bitwise XOR | a ^ b | Integer types |
| & | Bitwise AND | a & b | Integer types (binary context) |
| + | Addition | a + b | |
| - | Subtraction | a - b | |
| * | Multiplication | a * b | Binary context (both operands are values) |
| / | Division | a / b | Integer division truncates toward zero |
| % | Modulo | a % b | Sign follows the dividend |
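The rounding rules for / and % matter for negative operands. The Python sketch below models them on unbounded integers; `flux_div` and `flux_mod` are hypothetical helper names, and fixed-width overflow is deliberately ignored. Python's own `//` and `%` floor toward negative infinity, so they cannot be used directly.

```python
def flux_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero."""
    q = abs(a) // abs(b)
    # The quotient is negative exactly when the operand signs differ.
    return q if (a < 0) == (b < 0) else -q

def flux_mod(a: int, b: int) -> int:
    """Remainder whose sign follows the dividend a."""
    return a - b * flux_div(a, b)
```

With these definitions `flux_div(-7, 2)` is -3 and `flux_mod(-7, 2)` is -1, whereas Python's `-7 // 2` is -4 and `-7 % 2` is 1.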
Unary Prefix Operators
| Operator | Name | Example | Notes |
|---|---|---|---|
| ! | Logical NOT | !cond | Operand must be bool |
| ~ | Bitwise NOT | ~mask | Bitwise complement; integer types |
| - | Negation | -x | Arithmetic negation |
| * | Dereference | *ptr | Unary context; operand must be a pointer type |
| & | Address-of | &x | Unary context; produces a pointer to the operand |
Postfix Operators
| Operator | Name | Example | Notes |
|---|---|---|---|
| . | Member access | obj.field | Accesses a named field or method of a struct/type |
| […] | Subscript | arr[i] | Indexes into an array, slice, or map |
| (…) | Call | f(a, b) | Invokes a function or closure |
Disambiguation:
* and & are context-sensitive. When appearing as the first token of a unary_expr they are unary (dereference / address-of). When appearing between two unary_expr sub-trees inside multiplicative_expr or bitand_expr they are binary (multiplication / bitwise AND). The parser resolves this purely from grammatical position; no look-ahead beyond 1 token is required.
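The positional rule can be illustrated with a tiny recursive-descent parser restricted to identifiers, prefix operators, and binary `*`. This is a sketch under assumptions: the function name, the tuple-based tree, and the labels in `UNARY_NAMES` are invented for the example and do not come from the Flux implementation.

```python
UNARY_NAMES = {"*": "deref", "&": "addr_of", "-": "neg", "!": "not", "~": "bit_not"}

def parse_multiplicative(tokens: list[str]):
    """Parse operands joined by binary '*'; each operand may carry prefix operators."""
    pos = 0

    def unary():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok in UNARY_NAMES:
            # An operator at the start of a unary_expr is a prefix operator.
            return (UNARY_NAMES[tok], unary())
        return tok  # assumed to be an identifier

    node = unary()
    # A '*' seen after a complete operand is the binary multiplication.
    while pos < len(tokens) and tokens[pos] == "*":
        pos += 1
        node = ("mul", node, unary())
    return node
```

The token list `a * * p` therefore parses as `("mul", "a", ("deref", "p"))`: the first `*` follows a complete operand, the second starts a new one.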
Parenthesised Expressions
Any expression may be wrapped in parentheses to override default precedence:
(a + b) * c
Function Call Argument List
Arguments are comma-separated expressions. A trailing comma is not permitted at this grammar level.
f()
f(x)
f(x, y, z)
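The "no trailing comma" rule follows from the shape `[ expr , { "," , expr } ]`: every comma must be followed by another argument. A minimal Python sketch, assuming each argument is a single token and using the hypothetical name `parse_arg_list`:

```python
def parse_arg_list(tokens: list[str]) -> list[str]:
    """Parse the tokens between '(' and ')' of a call into a list of arguments."""

    def argument(pos: int) -> str:
        if pos >= len(tokens):
            raise SyntaxError("trailing comma is not permitted")
        if tokens[pos] == ",":
            raise SyntaxError("expected an argument, found ','")
        return tokens[pos]

    if not tokens:  # f()
        return []
    args = [argument(0)]
    pos = 1
    while pos < len(tokens):
        if tokens[pos] != ",":
            raise SyntaxError(f"expected ',', found {tokens[pos]!r}")
        # Every comma must introduce one more argument.
        args.append(argument(pos + 1))
        pos += 2
    return args
```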
Examples
// Arithmetic
a + b * c - d % 2
// Bitwise
flags & MASK | extra ^ toggle
// Logical
ready and not_done or fallback
// Mixed unary / postfix
*ptr.field
&arr[i]
!cond
// Chained postfix
obj.method(arg1, arg2)[0].name
// Explicit precedence override
(a or b) and c
Types
Types describe the shape and interpretation of values. All type positions in
the grammar reference the type non-terminal.
Primitive Types
Primitive types are single-keyword types built into the language.
| Type | Kind | Width | Range / Notes |
|---|---|---|---|
| u8 | Unsigned integer | 8-bit | 0 … 255 |
| u16 | Unsigned integer | 16-bit | 0 … 65 535 |
| u32 | Unsigned integer | 32-bit | 0 … 4 294 967 295 |
| u64 | Unsigned integer | 64-bit | 0 … 2⁶⁴ − 1 |
| i8 | Signed integer | 8-bit | −128 … 127 |
| i16 | Signed integer | 16-bit | −32 768 … 32 767 |
| i32 | Signed integer | 32-bit | −2 147 483 648 … 2 147 483 647 |
| i64 | Signed integer | 64-bit | −2⁶³ … 2⁶³ − 1 |
| f32 | Floating-point | 32-bit | IEEE 754 single precision |
| f64 | Floating-point | 64-bit | IEEE 754 double precision |
| bool | Boolean | 1 byte | true or false |
| char | Unicode scalar | 32-bit | Any Unicode scalar value (not a surrogate) |
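The integer ranges in the table follow directly from the bit width, and the char range from the definition of a Unicode scalar value. The Python sketch below reproduces both; `int_range` and `is_unicode_scalar` are hypothetical helper names, and two's-complement representation is assumed for the signed types.

```python
def int_range(bits: int, signed: bool) -> tuple[int, int]:
    """Return (min, max) for an integer type of the given width."""
    if signed:
        # Two's complement (assumed): one more negative value than positive.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1

def is_unicode_scalar(code_point: int) -> bool:
    """True for any code point in 0..0x10FFFF that is not a surrogate."""
    return 0 <= code_point <= 0x10FFFF and not 0xD800 <= code_point <= 0xDFFF
```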
Named Types
A named type is any user-defined type referenced by its identifier — typically a struct name. Because all primitive-type keywords (u8, bool, etc.) are reserved, an IDENT in type position is always a named type, never a primitive.
Point // struct Point { x: f32, y: f32 }
Node // struct Node { value: i64, next: *Node }
*Point // pointer to a named type
[Node; 8] // array of a named type
Pointer Types
A pointer type is written with a leading *.
| Syntax | Description |
|---|---|
| *T | Typed pointer: points to a value of type T |
| *opaque | Opaque pointer: no compile-time pointee type information; equivalent to C's void * |
Pointer types may be nested: **u8 is a pointer to a pointer to u8.
*u8 // pointer to u8
**i32 // pointer to pointer to i32
*opaque // untyped pointer
**opaque // pointer to untyped pointer
Array Types
Arrays have a fixed size known at compile time.
[ <element-type> ; <size> ]
<size> must be a non-negative integer literal (INT_LIT). The element type
may itself be any type, including pointers or nested arrays.
[u8; 256] // array of 256 u8 values
[*u8; 4] // array of 4 pointers to u8
[[f32; 3]; 3] // 3×3 matrix of f32 (array of arrays)
[*opaque; 8] // array of 8 opaque pointers
Type Grammar Summary
type = primitive_type | named_type | pointer_type | array_type ;
primitive_type = "u8" | "u16" | "u32" | "u64"
| "i8" | "i16" | "i32" | "i64"
| "f32" | "f64" | "bool" | "char" ;
named_type = IDENT ;
pointer_type = "*" , ( "opaque" | type ) ;
array_type = "[" , type , ";" , INT_LIT , "]" ;
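Each alternative of `type` begins with a distinct token, so a recursive-descent parser can choose a branch from one token of look-ahead. The Python sketch below follows the summary above on a pre-tokenised list. Its names and tuple-shaped result are assumptions, it accepts only decimal sizes for brevity, and it treats any remaining word other than `opaque` as an IDENT, relying on the lexer to have rejected other keywords.

```python
PRIMITIVES = {"u8", "u16", "u32", "u64", "i8", "i16", "i32", "i64",
              "f32", "f64", "bool", "char"}

def parse_type(tokens: list[str], pos: int = 0):
    """Parse one type starting at pos; return (tree, next_pos)."""
    tok = tokens[pos]
    if tok in PRIMITIVES:
        return ("prim", tok), pos + 1
    if tok == "*":
        if tokens[pos + 1] == "opaque":
            return ("ptr", "opaque"), pos + 2
        inner, pos = parse_type(tokens, pos + 1)
        return ("ptr", inner), pos
    if tok == "[":
        elem, pos = parse_type(tokens, pos + 1)
        if tokens[pos] != ";":
            raise SyntaxError("expected ';' in array type")
        size = tokens[pos + 1]
        if not size.isdigit():  # simplification: decimal INT_LIT only
            raise SyntaxError("array size must be an integer literal")
        if tokens[pos + 2] != "]":
            raise SyntaxError("expected ']' to close array type")
        return ("array", elem, int(size)), pos + 3
    if tok != "opaque" and (tok[0].isalpha() or tok[0] == "_"):
        return ("named", tok), pos + 1
    raise SyntaxError(f"unexpected token in type position: {tok!r}")
```

For instance, the tokens of `[[f32; 3]; 3]` produce `("array", ("array", ("prim", "f32"), 3), 3)`.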
Struct Literals
A struct literal constructs a value of a named struct type by providing values for each field.
<TypeName> { <field>: <expr>, ... }
Fields may appear in any order and need not match the declaration order. No trailing comma is permitted.
Examples
let p = Point { x: 1.0, y: 2.0 };
let n = Node {
value: 42,
next: get_next()
};
// Nested struct literal
let outer = Rect {
origin: Point { x: 0.0, y: 0.0 },
size: Point { x: 10.0, y: 5.0 }
};
// Empty struct
let u = Unit {};
Struct Literals in Conditions
Struct literals are not permitted as the outermost expression in if and while conditions. This restriction exists because { after the condition is ambiguous — it could start a struct literal body or the statement block.
// ERROR — ambiguous: is `{` a struct body or the if block?
if Flags { verbose: true } { ... }
// OK — parentheses resolve the ambiguity
if (Flags { verbose: true }).verbose { ... }
The grammar enforces this through the expr_ns (no-struct) hierarchy used in condition positions. Struct literals remain valid everywhere else: let, return, function arguments, field values, etc.
Struct Literal Grammar Summary
primary_expr = IDENT , [ struct_lit_body ] | INT_LIT | FLOAT_LIT
| STRING_LIT | CHAR_LIT | "true" | "false"
| "(" , expr , ")" ;
struct_lit_body = "{" , struct_field_list , "}" ;
struct_field_list = [ struct_field , { "," , struct_field } ] ;
struct_field = IDENT , ":" , expr ;
No-Struct Expression (expr_ns)
expr_ns is a parallel expression hierarchy identical to expr except its primary level (primary_expr_ns) does not allow the struct_lit_body suffix after an IDENT. Struct literals are still permitted when enclosed in parentheses ("(" , expr , ")"), because the ( unambiguously marks the start of a grouped expression.
if_stmt and while_stmt use expr_ns for their condition; all other expression positions use the full expr.
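The practical effect of expr_ns is that, in a condition, the first `{` outside parentheses opens the statement block. The Python sketch below shows only that effect; it is not how an LL(1) parser is structured, `split_condition` is a hypothetical helper, and it assumes parentheses are the only construct that re-enables struct literals inside a condition.

```python
def split_condition(tokens: list[str]) -> tuple[list[str], list[str]]:
    """Split the tokens after 'if' / 'while' into (condition, rest)."""
    depth = 0  # current nesting depth of parentheses
    for i, tok in enumerate(tokens):
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        elif tok == "{" and depth == 0:
            # Outside parentheses a '{' can only start the block.
            return tokens[:i], tokens[i:]
    raise SyntaxError("missing block after condition")
```

Given the tokens of `Flags { verbose: true } { }`, the condition comes back as just `Flags` and the field list is mistaken for the block, which is why the unparenthesised form is rejected.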
Statements
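Statements perform an action and do not produce a value. let, return, break,
continue, and expression statements are terminated by a semicolon ;. if, while,
loop, and block statements end with the closing brace of their block.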
Let Statement
Introduces a new named binding in the current scope.
let [mut] <name> [: <type>] [= <expr>] ;
| Part | Required | Description |
|---|---|---|
| mut | no | Makes the binding mutable; omit for immutable |
| <name> | yes | The identifier being bound |
| : <type> | no | Explicit type annotation |
| = <expr> | no | Initialiser expression |
| ; | yes | Statement terminator |
Bindings are immutable by default. Attempting to assign to a binding
declared without mut is a compile-time error.
At least one of the type annotation or the initialiser must be present so the
compiler can determine the binding's type. This is a semantic constraint, not a
syntactic one — the grammar permits bare let x; and the type checker rejects
it if no type can be inferred from context.
Examples
// Immutable, type inferred from initialiser
let x = 42;
// Immutable, explicit type
let y: f64 = 3.14;
// Mutable, type inferred
let mut count = 0;
// Mutable, explicit type, no initialiser (must be assigned before use)
let mut buf: [u8; 128];
// Mutable binding holding a pointer to u32
let mut ptr: *u32 = &value;
// Shadowing a previous binding is allowed
let x = "hello"; // x is now a string; the earlier x is shadowed
Return Statement
Exits the enclosing function immediately, optionally producing a return value.
return [<expr>] ;
return; (no expression) is used when the function's return type is the unit
type (). return <expr>; returns the value of the expression.
Blocks do not produce a value, so a function returns a value only through an
explicit return statement, as every example in this reference does. A return may
also appear before the end of the body as an early exit.
return; // unit return
return 42; // return an integer
return x * 2 + 1; // return an expression
Expression Statement
Evaluates an expression for its side effects; the resulting value is discarded. A semicolon is required.
<expr> ;
do_something(x); // call for side effects
count + 1; // legal but silly — value discarded
Statement Grammar Summary
stmt = let_stmt | return_stmt | if_stmt
| while_stmt | loop_stmt | break_stmt | continue_stmt
| block_stmt | expr_stmt ;
let_stmt = "let" , [ "mut" ] , IDENT , [ ":" , type ] , [ "=" , expr ] , ";" ;
return_stmt = "return" , [ expr ] , ";" ;
if_stmt = "if" , expr_ns , block_stmt , [ "else" , else_branch ] ;
else_branch = if_stmt | block_stmt ;
while_stmt = "while" , expr_ns , block_stmt ;
loop_stmt = "loop" , block_stmt ;
break_stmt = "break" , ";" ;
continue_stmt = "continue" , ";" ;
block_stmt = "{" , { stmt } , "}" ;
expr_stmt = expr , ";" ;
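Every alternative of `stmt` except `expr_stmt` starts with its own keyword or with `{`, so the parser can pick a production from a single look-ahead token. A minimal Python sketch of that dispatch, with the hypothetical names `STMT_BY_FIRST` and `select_stmt`:

```python
# First token of each statement kind, taken from the summary above.
STMT_BY_FIRST = {
    "let": "let_stmt",
    "return": "return_stmt",
    "if": "if_stmt",
    "while": "while_stmt",
    "loop": "loop_stmt",
    "break": "break_stmt",
    "continue": "continue_stmt",
    "{": "block_stmt",
}

def select_stmt(lookahead: str) -> str:
    """Choose the stmt alternative from one token of look-ahead."""
    # Anything else must begin an expression, hence an expr_stmt.
    return STMT_BY_FIRST.get(lookahead, "expr_stmt")
```

The fallback is safe because, per the primary_expr summary, no expression begins with `{` or with a statement keyword.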
If Statement
Conditionally executes a block based on a boolean expression.
if <cond> <block> [else <else-branch>]
The condition <cond> must be an expression of type bool. The body is
always a block_stmt — braces are mandatory.
Else Branch
The optional else branch is either a plain block or another if statement,
enabling else if chains of arbitrary length.
if x > 0 {
pos();
}
if x > 0 {
pos();
} else {
non_pos();
}
if x > 0 {
pos();
} else if x < 0 {
neg();
} else {
zero();
}
If Statement Grammar Summary
if_stmt = "if" , expr_ns , block_stmt , [ "else" , else_branch ] ;
else_branch = if_stmt | block_stmt ;
While Loop
Repeatedly executes a block as long as a boolean condition holds. The condition is tested before each iteration; if it is false on entry, the body never runs.
while <cond> <block>
let mut i = 0;
while i < 10 {
process(i);
i = i + 1;
}
While Loop Grammar Summary
while_stmt = "while" , expr_ns , block_stmt ;
Loop
Executes a block unconditionally and indefinitely. The loop runs until a
break or return inside the body transfers control out.
loop <block>
loop {
let msg = recv();
if msg.is_quit() {
break;
}
handle(msg);
}
Loop Grammar Summary
loop_stmt = "loop" , block_stmt ;
Break and Continue
break and continue are only valid inside the body of a while or loop.
The compiler enforces this as a semantic rule.
| Statement | Effect |
|---|---|
| break ; | Exits the innermost enclosing loop immediately |
| continue ; | Skips the rest of the current iteration; jumps to the next one |
For while, continue jumps back to the condition check. For loop,
continue jumps back to the top of the body.
let mut i = 0;
while i < 20 {
i = i + 1;
if i % 2 == 0 {
continue; // skip even numbers
}
if i > 15 {
break; // stop after 15
}
process(i);
}
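To check which values reach `process`, the loop above can be transliterated into Python, whose `while`, `continue`, and `break` behave as described here. The list `processed` stands in for the calls to `process(i)` and is an assumption of this sketch.

```python
def processed_values() -> list[int]:
    """Mirror the Flux loop and record each value passed to process()."""
    processed = []
    i = 0
    while i < 20:
        i = i + 1
        if i % 2 == 0:
            continue  # back to the condition check
        if i > 15:
            break     # leaves the loop when i reaches 17
        processed.append(i)
    return processed
```

The function returns `[1, 3, 5, 7, 9, 11, 13, 15]`: the odd numbers up to 15.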
Break / Continue Grammar Summary
break_stmt = "break" , ";" ;
continue_stmt = "continue" , ";" ;
Block Statement
A block groups zero or more statements into a single statement and introduces a new lexical scope. Blocks do not produce a value.
{ <stmt>* }
Scoping
Bindings declared inside a block are not visible outside it. A binding in an inner scope may shadow a name from an outer scope without affecting it.
let x = 1;
{
let x = 2; // shadows outer x inside this block only
f(x); // uses 2
}
// x is still 1 here
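Shadowing of this kind is commonly modelled as a chain of scopes, where lookup walks outward from the innermost block. The Python sketch below is one such model; the `Scope` class and its methods are hypothetical and say nothing about how the Flux compiler stores bindings.

```python
class Scope:
    """One lexical scope with a link to the enclosing scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self.bindings = {}

    def declare(self, name, value):
        # A new binding hides any outer binding of the same name.
        self.bindings[name] = value

    def lookup(self, name):
        scope = self
        while scope is not None:  # walk from innermost to outermost
            if name in scope.bindings:
                return scope.bindings[name]
            scope = scope.parent
        raise NameError(name)

outer = Scope()
outer.declare("x", 1)
inner = Scope(parent=outer)   # entering the block
inner.declare("x", 2)         # shadows the outer x
inner_x = inner.lookup("x")   # 2 inside the block
outer_x = outer.lookup("x")   # still 1 after the block
```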
Nesting
Blocks may be nested freely to any depth.
{
let a = compute_a();
{
let b = compute_b();
use(a, b);
}
// b is no longer in scope here
}
Block Grammar Summary
block_stmt = "{" , { stmt } , "}" ;
Top-Level Definitions
A Flux source file is a sequence of top-level definitions.
program = { top_level_def } ;
top_level_def = func_def | struct_def ;
The leading token unambiguously selects the definition kind: fn → function,
struct → struct.
Function Definition
Defines a named, callable function.
fn <name> ( [<params>] ) [-> <return-type>] <block>
| Part | Required | Description |
|---|---|---|
| <name> | yes | The function's identifier |
| ( [<params>] ) | yes | Comma-separated parameter list, may be empty |
| -> <return-type> | no | Return type; omitting it means the function returns () |
| <block> | yes | Function body (a block_stmt) |
Parameters
Each parameter is a name with a mandatory type annotation. Parameters are
immutable by default; mut makes the local binding mutable within the body.
[mut] <name> : <type>
fn add(a: i32, b: i32) -> i32 {
return a + b;
}
fn greet(name: *u8) {
print(name);
}
fn increment(mut x: i32) -> i32 {
x = x + 1;
return x;
}
fn apply(f: *opaque, mut buf: [u8; 64]) -> bool {
return call(f, &buf);
}
Return Type
If -> is omitted the return type is implicitly () (the unit type). An
explicit -> () is also permitted but redundant.
fn do_work() { // returns ()
side_effect();
}
fn get_value() -> i64 { // returns i64
return 42;
}
Function Definition Grammar Summary
func_def = "fn" , IDENT , "(" , param_list , ")" , [ "->" , type ] , block_stmt ;
param_list = [ param , { "," , param } ] ;
param = [ "mut" ] , IDENT , ":" , type ;
Struct Definition
Defines a named product type with zero or more typed fields.
struct <name> {
<field>: <type>,
...
}
Fields are separated by commas. No trailing comma is permitted. An empty struct (zero fields) is valid.
Fields
Each field is a name and a type. Fields may be of any type including pointers, arrays, and other structs. Field names must be unique within the struct.
struct Point {
x: f32,
y: f32
}
struct Node {
value: i64,
next: *Node
}
struct Buffer {
data: *u8,
len: u64,
cap: u64
}
struct Unit {}
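Field-name uniqueness is a semantic rule rather than a grammatical one, so it would be checked after parsing. A minimal Python sketch of such a check, using the hypothetical helper `duplicate_fields` on a list of field names in declaration order:

```python
def duplicate_fields(field_names: list[str]) -> list[str]:
    """Return each field name that is declared more than once, in order found."""
    seen = set()
    duplicates = []
    for name in field_names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates
```

An empty result means the struct definition passes the check; `struct Unit {}` passes trivially.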
Member Access
Fields of a struct value are accessed with the . operator (defined in the
expression grammar). If the value is behind a pointer, dereference it first
with *.
let p: Point = make_point();
let x = p.x;
let ptr: *Point = get_point_ptr();
let y = (*ptr).y;
Struct Definition Grammar Summary
struct_def = "struct" , IDENT , "{" , field_list , "}" ;
field_list = [ field , { "," , field } ] ;
field = IDENT , ":" , type ;