# Compiler Roadmap

A Rust-flavored, C-targeting language - built pipeline-first.

**Implementation language:** Rust

**Code generation target:** Native object files (`.o`) via Cranelift JIT/AOT

## Phase 1 - Lexer

- [x] Define token enum (int literal, bool literal, ident, keywords, operators, punctuation)
- [x] Implement character-by-character scanner loop
- [x] Handle whitespace & single-line comments (`//`)
- [x] Produce source spans (file, line, col) on every token
- [x] Unit-test: known inputs → expected token streams

## Phase 2 - Parser

- [x] Write grammar for the base subset
  - `fn` declarations, `return`, int/bool literals
  - Arithmetic (`+`, `-`, `*`, `/`)
- [x] Implement recursive-descent parser
- [x] Attach source spans to every AST node
- [x] Emit structured parse errors with span info
- [x] Unit-test: parse valid snippets, expect correct AST shapes

## Phase 3 - Semantic Analysis

- [x] Implement scope-aware symbol table (environment)
- [x] Name resolution pass - resolve all `Ident` nodes to their declarations
- [x] Hindley-Milner type inference (unification, type variables, occurs check)
- [x] Integer literal sizing and unary minus type promotion logic
- [x] Translate untyped AST directly into a fully-typed AST (Typed AST)
- [x] Validate function return types match declared signature
- [x] Error on use-before-declaration, undeclared symbols, and type mismatches
- [x] Unit-test: HM unification, type mappings, and ill-typed program diagnostics

## Phase 4 - Code Generation via Cranelift

- [x] Integrate `cranelift-codegen`, `cranelift-frontend`, and `cranelift-object`
- [x] Implement CLI with `clap` (`--emit-ir` flag, input/output files)
- [x] Map Typed AST types (`Ty`) to Cranelift IR types
- [x] Lower functions, parameters, and variable definitions to Cranelift IR
- [x] Codegen for arithmetic, unary operations, and `return` statements
- [x] Run built-in optimization passes (constant folding, e-graphs, DCE)
- [x] Output System V AMD64 ABI-compliant `.o` machine code files
- [x] End-to-end test: compile a simple `fn` → link via `gcc` → run → correct exit code

## Planned Features (Backlog)

### Control flow

- [x] Booleans and comparison operators
- [ ] `if` / `else` branching
- [ ] `while` loops

### Types & memory

- [ ] Typed pointers (`*T`)
- [ ] Opaque pointers (`*void` / `*opaque`)
- [ ] Raw pointer arithmetic & dereference
- [ ] Fixed-size arrays (`[T; N]`)
- [ ] Slices (`&[T]` / `[]T`)

### Composite types

- [ ] Structs & field access
- [ ] Enums (C-style tagged unions)
- [ ] Pattern matching (`match` / `switch`)

### Strings & interop

- [ ] String literals & `*u8` handling
- [ ] Variadic functions (for `printf` interop)
- [ ] `extern` / FFI declarations

### Tooling & backend

- [ ] Proper register allocator
- [ ] Debug info (DWARF)
- [ ] Standard library bootstrap (`print`, `malloc` wrapper)
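
For readers new to the pipeline, the Phase 1 items (token enum, scanner loop, `//` comments, spans on every token) can be illustrated with a minimal sketch. This is not the project's actual lexer: the names `Span`, `TokenKind`, `Token`, and `lex` are hypothetical, the span omits the file component for brevity, and errors are plain strings rather than structured diagnostics.

```rust
/// Source location of a token's first character (1-based).
/// Hypothetical type; the roadmap's spans also carry a file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub line: u32,
    pub col: u32,
}

/// Hypothetical token enum covering the base subset from Phase 2.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TokenKind {
    Int(i64),
    Bool(bool),
    Ident(String),
    Fn,
    Return,
    Plus,
    Minus,
    Star,
    Slash,
    LParen,
    RParen,
    LBrace,
    RBrace,
    Semi,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub span: Span,
}

/// Character-by-character scanner: skips whitespace and `//` comments,
/// and records the starting line/column of every token.
pub fn lex(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let (mut i, mut line, mut col) = (0usize, 1u32, 1u32);

    while i < chars.len() {
        let c = chars[i];
        // A newline starts a new line and resets the column.
        if c == '\n' {
            i += 1;
            line += 1;
            col = 1;
            continue;
        }
        if c.is_whitespace() {
            i += 1;
            col += 1;
            continue;
        }
        // Single-line comment: skip up to (not including) the newline.
        if c == '/' && chars.get(i + 1) == Some(&'/') {
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
                col += 1;
            }
            continue;
        }

        let span = Span { line, col };
        let start = i;
        let kind = if c.is_ascii_digit() {
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            let value = text
                .parse::<i64>()
                .map_err(|e| format!("{}:{}: bad integer `{}`: {}", line, col, text, e))?;
            TokenKind::Int(value)
        } else if c.is_ascii_alphabetic() || c == '_' {
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            // Keywords and bool literals are identifiers with reserved spellings.
            match text.as_str() {
                "fn" => TokenKind::Fn,
                "return" => TokenKind::Return,
                "true" => TokenKind::Bool(true),
                "false" => TokenKind::Bool(false),
                _ => TokenKind::Ident(text),
            }
        } else {
            i += 1;
            match c {
                '+' => TokenKind::Plus,
                '-' => TokenKind::Minus,
                '*' => TokenKind::Star,
                '/' => TokenKind::Slash,
                '(' => TokenKind::LParen,
                ')' => TokenKind::RParen,
                '{' => TokenKind::LBrace,
                '}' => TokenKind::RBrace,
                ';' => TokenKind::Semi,
                other => {
                    return Err(format!("{}:{}: unexpected character `{}`", line, col, other))
                }
            }
        };
        // Advance the column by the number of characters consumed.
        col += (i - start) as u32;
        tokens.push(Token { kind, span });
    }
    Ok(tokens)
}
```

Calling `lex("fn main() { return 1 + 2; }")` yields one `Token` per lexeme, each tagged with the line and column where it starts, which is the information later phases need to attach spans to AST nodes and diagnostics.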