commit 46028f9072776be61dcd9c4eba675a6bfc406e8c
Author: Jooris Hadeler
Date:   Mon Apr 20 18:11:14 2026 +0200

    init: add compiler roadmap and project planning

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..ea8c4bf
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+/target
diff --git a/Cargo.lock b/Cargo.lock
new file mode 100644
index 0000000..2245ad6
--- /dev/null
+++ b/Cargo.lock
@@ -0,0 +1,7 @@
+# This file is automatically @generated by Cargo.
+# It is not intended for manual editing.
+version = 4
+
+[[package]]
+name = "compiler"
+version = "0.1.0"
diff --git a/Cargo.toml b/Cargo.toml
new file mode 100644
index 0000000..1170539
--- /dev/null
+++ b/Cargo.toml
@@ -0,0 +1,6 @@
+[package]
+name = "compiler"
+version = "0.1.0"
+edition = "2024"
+
+[dependencies]
diff --git a/PLAN.md b/PLAN.md
new file mode 100644
index 0000000..d543010
--- /dev/null
+++ b/PLAN.md
@@ -0,0 +1,74 @@
+# Compiler Roadmap
+
+A Rust-flavored, C-targeting language - built pipeline-first.
+
+**Implementation language:** Rust
+**Code generation target:** x86-64 (AT&T / Intel syntax `.s` → assembled via GAS or NASM)
+
+## Phase 1 - Lexer
+
+- [ ] Define token enum (int literal, bool literal, ident, keywords, operators, punctuation)
+- [ ] Implement character-by-character scanner loop
+- [ ] Handle whitespace & single-line comments (`//`)
+- [ ] Produce source spans (file, line, col) on every token
+- [ ] Unit-test: known inputs → expected token streams
+
+## Phase 2 - Parser
+
+- [ ] Write grammar for the base subset
+  - `fn` declarations, `return`, `let`, int/bool literals
+  - Arithmetic (`+`, `-`, `*`, `/`), comparison (`==`, `!=`, `<`, `>`)
+  - Function call expressions
+- [ ] Implement recursive-descent parser
+- [ ] Build typed AST: `FnDecl`, `Block`, `ReturnStmt`, `LetStmt`, `BinExpr`, `CallExpr`, `Literal`, `Ident`
+- [ ] Attach source spans to every AST node
+- [ ] Emit structured parse errors with span info
+- [ ] Unit-test: parse valid snippets, expect correct AST shapes
+
+## Phase 3 - Semantic Analysis
+
+- [ ] Implement scope-aware symbol table
+- [ ] Name resolution pass - resolve all `Ident` nodes to their declarations
+- [ ] Type inference / checking for `int` and `bool`
+- [ ] Validate function return types match declared signature
+- [ ] Error on use-before-declaration and undeclared symbols
+- [ ] Unit-test: ill-typed programs produce correct diagnostics
+
+## Phase 4 - x86-64 Code Generation
+
+- [ ] Design a simple intermediate representation (linear IR, or use AST directly)
+- [ ] Implement stack-frame layout for local variables
+- [ ] Emit System V AMD64 ABI-compliant function prologues / epilogues
+- [ ] Codegen for arithmetic & comparison expressions
+- [ ] Codegen for function calls (argument passing via registers)
+- [ ] Codegen for `return` statements
+- [ ] Output `.s` file, assemble with NASM / GAS
+- [ ] End-to-end test: compile a simple `fn` → run → correct exit code
+
+## Planned Features (Backlog)
+
+### Control flow
+- [ ] `if` / `else` branching
+- [ ] `while` loops
+
+### Types & memory
+- [ ] Typed pointers (`*T`)
+- [ ] Opaque pointers (`*void` / `*opaque`)
+- [ ] Raw pointer arithmetic & dereference
+- [ ] Fixed-size arrays (`[T; N]`)
+- [ ] Slices (`&[T]` / `[]T`)
+
+### Composite types
+- [ ] Structs & field access
+- [ ] Enums (C-style tagged unions)
+- [ ] Pattern matching (`match` / `switch`)
+
+### Strings & interop
+- [ ] String literals & `*u8` handling
+- [ ] Variadic functions (for `printf` interop)
+- [ ] `extern` / FFI declarations
+
+### Tooling & backend
+- [ ] Proper register allocator
+- [ ] Debug info (DWARF)
+- [ ] Standard library bootstrap (`print`, `malloc` wrapper)
\ No newline at end of file
diff --git a/examples/simple.src b/examples/simple.src
new file mode 100644
index 0000000..98f5b50
--- /dev/null
+++ b/examples/simple.src
@@ -0,0 +1,3 @@
+fn main() -> i32 {
+    return 0;
+}
\ No newline at end of file
diff --git a/src/main.rs b/src/main.rs
new file mode 100644
index 0000000..e7a11a9
--- /dev/null
+++ b/src/main.rs
@@ -0,0 +1,3 @@
+fn main() {
+    println!("Hello, world!");
+}
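
Note on Phase 1 of PLAN.md: the commit only adds the roadmap, so no lexer exists yet. The following is a minimal sketch of what the token enum, span tracking, and scanner loop from that checklist could look like. Every name here (`TokenKind`, `Span`, `Token`, `lex`) is hypothetical and not taken from the repository, the keyword and operator sets are assumptions based on the Phase 2 grammar and `examples/simple.src`, and spans carry only line and column (the file component from the plan is left out).

```rust
/// Hypothetical token kinds for the base subset listed in Phase 1.
#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Int(i64),
    Bool(bool),
    Ident(String),
    Keyword(&'static str), // assumed set: fn, let, return
    Op(&'static str),      // assumed set: + - * / = == != < > ->
    Punct(char),           // assumed set: ( ) { } ; , :
}

/// 1-based position of a token's first character.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub line: u32,
    pub col: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub kind: TokenKind,
    pub span: Span,
}

/// Character-by-character scanner; returns an error string on unknown input.
pub fn lex(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let (mut i, mut line, mut col) = (0usize, 1u32, 1u32);

    while i < chars.len() {
        let c = chars[i];
        // Newlines advance the line counter and reset the column.
        if c == '\n' {
            i += 1;
            line += 1;
            col = 1;
            continue;
        }
        if c.is_whitespace() {
            i += 1;
            col += 1;
            continue;
        }
        // Single-line comment: skip up to (not including) the newline.
        if c == '/' && chars.get(i + 1) == Some(&'/') {
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
            }
            continue;
        }

        let span = Span { line, col };
        let start = i;
        let kind = if c.is_ascii_digit() {
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            let value = text
                .parse::<i64>()
                .map_err(|e| format!("{line}:{col}: {e}"))?;
            TokenKind::Int(value)
        } else if c.is_ascii_alphabetic() || c == '_' {
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            match text.as_str() {
                "true" => TokenKind::Bool(true),
                "false" => TokenKind::Bool(false),
                "fn" => TokenKind::Keyword("fn"),
                "let" => TokenKind::Keyword("let"),
                "return" => TokenKind::Keyword("return"),
                _ => TokenKind::Ident(text),
            }
        } else {
            // Try two-character operators before single-character ones.
            let two = match (c, chars.get(i + 1).copied()) {
                ('=', Some('=')) => Some("=="),
                ('!', Some('=')) => Some("!="),
                ('-', Some('>')) => Some("->"),
                _ => None,
            };
            if let Some(op) = two {
                i += 2;
                TokenKind::Op(op)
            } else {
                i += 1;
                match c {
                    '+' => TokenKind::Op("+"),
                    '-' => TokenKind::Op("-"),
                    '*' => TokenKind::Op("*"),
                    '/' => TokenKind::Op("/"),
                    '<' => TokenKind::Op("<"),
                    '>' => TokenKind::Op(">"),
                    '=' => TokenKind::Op("="),
                    '(' | ')' | '{' | '}' | ';' | ',' | ':' => TokenKind::Punct(c),
                    _ => return Err(format!("{line}:{col}: unexpected character {c:?}")),
                }
            }
        };
        col += (i - start) as u32;
        tokens.push(Token { kind, span });
    }
    Ok(tokens)
}
```

Calling `lex` on the contents of `examples/simple.src` should yield the tokens for `fn`, `main`, `(`, `)`, `->`, `i32`, `{`, `return`, `0`, `;`, `}`, each with its line and column. Collecting the input into a `Vec<char>` keeps the lookahead simple at the cost of an extra allocation; a real implementation might prefer a peekable iterator over byte offsets.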