# Compiler Roadmap

A Rust-flavored, C-targeting language - built pipeline-first.

**Implementation language:** Rust

**Code generation target:** x86-64 (AT&T / Intel syntax `.s` → assembled via GAS or NASM)

## Phase 1 - Lexer

- [x] Define token enum (int literal, bool literal, ident, keywords, operators, punctuation)
- [x] Implement character-by-character scanner loop
- [x] Handle whitespace & single-line comments (`//`)
- [x] Produce source spans (file, line, col) on every token
- [x] Unit-test: known inputs → expected token streams

## Phase 2 - Parser

- [ ] Write grammar for the base subset
  - `fn` declarations, `return`, `let`, int/bool literals
  - Arithmetic (`+`, `-`, `*`, `/`), comparison (`==`, `!=`, `<`, `>`)
  - Function call expressions
- [ ] Implement recursive-descent parser
- [ ] Build typed AST: `FnDecl`, `Block`, `ReturnStmt`, `LetStmt`, `BinExpr`, `CallExpr`, `Literal`, `Ident`
- [ ] Attach source spans to every AST node
- [ ] Emit structured parse errors with span info
- [ ] Unit-test: parse valid snippets, expect correct AST shapes

## Phase 3 - Semantic Analysis

- [ ] Implement scope-aware symbol table
- [ ] Name resolution pass - resolve all `Ident` nodes to their declarations
- [ ] Type inference / checking for `int` and `bool`
- [ ] Validate function return types match declared signature
- [ ] Error on use-before-declaration and undeclared symbols
- [ ] Unit-test: ill-typed programs produce correct diagnostics

## Phase 4 - x86-64 Code Generation

- [ ] Design a simple intermediate representation (linear IR, or use AST directly)
- [ ] Implement stack-frame layout for local variables
- [ ] Emit System V AMD64 ABI-compliant function prologues / epilogues
- [ ] Codegen for arithmetic & comparison expressions
- [ ] Codegen for function calls (argument passing via registers)
- [ ] Codegen for `return` statements
- [ ] Output `.s` file, assemble with NASM / GAS
- [ ] End-to-end test: compile a simple `fn` → run → correct exit code

## Planned Features (Backlog)

### Control flow

- [ ] `if` / `else` branching
- [ ] `while` loops

### Types & memory

- [ ] Typed pointers (`*T`)
- [ ] Opaque pointers (`*void` / `*opaque`)
- [ ] Raw pointer arithmetic & dereference
- [ ] Fixed-size arrays (`[T; N]`)
- [ ] Slices (`&[T]` / `[]T`)

### Composite types

- [ ] Structs & field access
- [ ] Enums (C-style tagged unions)
- [ ] Pattern matching (`match` / `switch`)

### Strings & interop

- [ ] String literals & `*u8` handling
- [ ] Variadic functions (for `printf` interop)
- [ ] `extern` / FFI declarations

### Tooling & backend

- [ ] Proper register allocator
- [ ] Debug info (DWARF)
- [ ] Standard library bootstrap (`print`, `malloc` wrapper)
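
For reference, the Phase 1 checklist (token enum, character-by-character scanner loop, `//` comment skipping, spans on every token) could look roughly like the sketch below. This is not the project's actual lexer: the names `TokenKind`, `Span`, `Token`, and `lex` are hypothetical, the keyword and punctuation sets are guessed from the Phase 2 grammar, and the span omits the file component for brevity.

```rust
// Illustrative sketch only; all names here are assumptions, not the real API.

#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Int(i64),
    Bool(bool),
    Ident(String),
    Keyword(&'static str), // assumed set: fn, let, return
    Op(&'static str),      // + - * / == != < > =
    Punct(char),           // assumed set: ( ) { } ; , :
}

/// 1-based line and column of a token's first character (file omitted here).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub line: u32,
    pub col: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub kind: TokenKind,
    pub span: Span,
}

pub fn lex(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let (mut i, mut line, mut col) = (0usize, 1u32, 1u32);
    let mut out = Vec::new();

    while i < chars.len() {
        let c = chars[i];
        // Newlines reset the column; other whitespace just advances it.
        if c == '\n' {
            i += 1;
            line += 1;
            col = 1;
            continue;
        }
        if c.is_whitespace() {
            i += 1;
            col += 1;
            continue;
        }
        // Single-line comment: skip up to (not including) the newline.
        if c == '/' && chars.get(i + 1) == Some(&'/') {
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
            }
            continue;
        }

        let span = Span { line, col };
        let start = i;
        let kind = if c.is_ascii_digit() {
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            let value = text
                .parse::<i64>()
                .map_err(|_| format!("{}:{}: integer literal out of range", line, col))?;
            TokenKind::Int(value)
        } else if c.is_ascii_alphabetic() || c == '_' {
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            match text.as_str() {
                "true" => TokenKind::Bool(true),
                "false" => TokenKind::Bool(false),
                "fn" => TokenKind::Keyword("fn"),
                "let" => TokenKind::Keyword("let"),
                "return" => TokenKind::Keyword("return"),
                other => TokenKind::Ident(other.to_string()),
            }
        } else if (c == '=' || c == '!') && chars.get(i + 1) == Some(&'=') {
            // Two-character operators are tried before single-character ones.
            i += 2;
            TokenKind::Op(if c == '=' { "==" } else { "!=" })
        } else {
            i += 1;
            match c {
                '+' => TokenKind::Op("+"),
                '-' => TokenKind::Op("-"),
                '*' => TokenKind::Op("*"),
                '/' => TokenKind::Op("/"),
                '<' => TokenKind::Op("<"),
                '>' => TokenKind::Op(">"),
                '=' => TokenKind::Op("="),
                '(' | ')' | '{' | '}' | ';' | ',' | ':' => TokenKind::Punct(c),
                _ => return Err(format!("{}:{}: unexpected character {:?}", line, col, c)),
            }
        };
        col += (i - start) as u32;
        out.push(Token { kind, span });
    }
    Ok(out)
}
```

Returning `Result` with a position-prefixed message keeps the sketch short; a real implementation would more likely return a structured error carrying the span, in line with the structured diagnostics planned for Phase 2.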