Compiler Roadmap

A Rust-flavored, C-targeting language - built pipeline-first.

Implementation language: Rust
Code generation target: Native object files (.o) via Cranelift (ahead-of-time)

Phase 1 - Lexer

  • Define token enum (int literal, bool literal, ident, keywords, operators, punctuation)
  • Implement character-by-character scanner loop
  • Handle whitespace & single-line comments (//)
  • Produce source spans (file, line, col) on every token
  • Unit-test: known inputs → expected token streams
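
A minimal sketch of the scanner loop described above, using only the standard library. The names TokenKind, Span, Token, and lex are illustrative assumptions rather than the project's actual types, the span carries only line and column (a file id would be added alongside), and the operator and punctuation sets are a guess at the base subset.

```rust
/// Hypothetical token kinds for the base subset (names are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Int(i64),
    Bool(bool),
    Ident(String),
    Fn,
    Return,
    Op(char),    // one of + - * /
    Punct(char), // one of ( ) { } ; , :
}

/// 1-based source position of a token's first character.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub line: u32,
    pub col: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub kind: TokenKind,
    pub span: Span,
}

/// Character-by-character scanner: skips whitespace and `//` comments,
/// and records a span on every token it produces.
pub fn lex(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let (mut i, mut line, mut col) = (0usize, 1u32, 1u32);
    let mut out = Vec::new();
    while i < chars.len() {
        let c = chars[i];
        let span = Span { line, col };
        if c == '\n' {
            i += 1;
            line += 1;
            col = 1;
        } else if c.is_whitespace() {
            i += 1;
            col += 1;
        } else if c == '/' && chars.get(i + 1) == Some(&'/') {
            // Single-line comment: skip up to (not including) the newline.
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
                col += 1;
            }
        } else if c.is_ascii_digit() {
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            let value = text
                .parse::<i64>()
                .map_err(|e| format!("{line}:{col}: {e}"))?;
            col += (i - start) as u32;
            out.push(Token { kind: TokenKind::Int(value), span });
        } else if c.is_ascii_alphabetic() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            col += (i - start) as u32;
            // Keywords and bool literals are recognized after scanning the word.
            let kind = match text.as_str() {
                "fn" => TokenKind::Fn,
                "return" => TokenKind::Return,
                "true" => TokenKind::Bool(true),
                "false" => TokenKind::Bool(false),
                _ => TokenKind::Ident(text),
            };
            out.push(Token { kind, span });
        } else if "+-*/".contains(c) {
            out.push(Token { kind: TokenKind::Op(c), span });
            i += 1;
            col += 1;
        } else if "(){};,:".contains(c) {
            out.push(Token { kind: TokenKind::Punct(c), span });
            i += 1;
            col += 1;
        } else {
            return Err(format!("{line}:{col}: unexpected character {c:?}"));
        }
    }
    Ok(out)
}
```

The comment check runs before the operator check so that `//` is never lexed as two division operators. Multi-character tokens such as `->` are left out of this sketch.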

Phase 2 - Parser

  • Write grammar for the base subset
    • fn declarations, return, int/bool literals
    • Arithmetic (+, -, *, /)
  • Implement recursive-descent parser
  • Attach source spans to every AST node
  • Emit structured parse errors with span info
  • Unit-test: parse valid snippets, expect correct AST shapes
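
As an illustration of the recursive-descent approach, the sketch below parses arithmetic expressions with the usual precedence (`*` and `/` bind tighter than `+` and `-`). Tok, Expr, and Parser are hypothetical names, the token type is a simplified stand-in for the lexer's output, and spans are omitted to keep the example short.

```rust
/// Simplified stand-in for lexer output (hypothetical; spans omitted).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tok {
    Int(i64),
    Op(char),
    LParen,
    RParen,
}

/// Minimal expression AST for the arithmetic subset.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Int(i64),
    Binary(Box<Expr>, char, Box<Expr>),
}

pub struct Parser<'a> {
    toks: &'a [Tok],
    pos: usize,
}

impl<'a> Parser<'a> {
    pub fn new(toks: &'a [Tok]) -> Self {
        Parser { toks, pos: 0 }
    }

    fn peek(&self) -> Option<Tok> {
        self.toks.get(self.pos).copied()
    }

    /// expr := term (('+' | '-') term)*
    pub fn expr(&mut self) -> Result<Expr, String> {
        let mut lhs = self.term()?;
        while let Some(Tok::Op(op @ ('+' | '-'))) = self.peek() {
            self.pos += 1;
            let rhs = self.term()?;
            lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        }
        Ok(lhs)
    }

    /// term := atom (('*' | '/') atom)*
    fn term(&mut self) -> Result<Expr, String> {
        let mut lhs = self.atom()?;
        while let Some(Tok::Op(op @ ('*' | '/'))) = self.peek() {
            self.pos += 1;
            let rhs = self.atom()?;
            lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        }
        Ok(lhs)
    }

    /// atom := INT | '(' expr ')'
    fn atom(&mut self) -> Result<Expr, String> {
        match self.peek() {
            Some(Tok::Int(n)) => {
                self.pos += 1;
                Ok(Expr::Int(n))
            }
            Some(Tok::LParen) => {
                self.pos += 1;
                let inner = self.expr()?;
                if self.peek() == Some(Tok::RParen) {
                    self.pos += 1;
                    Ok(inner)
                } else {
                    Err(format!("expected ')' at token {}", self.pos))
                }
            }
            other => Err(format!("unexpected {:?} at token {}", other, self.pos)),
        }
    }
}
```

Each grammar rule becomes one method, and the loops make the binary operators left-associative. A fuller version would report errors by source span instead of token index and would reject trailing tokens after the expression.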

Phase 3 - Semantic Analysis

  • Implement scope-aware symbol table (environment)
  • Name resolution pass - resolve all Ident nodes to their declarations
  • Hindley-Milner type inference (unification, type variables, occurs check)
  • Integer literal sizing and unary minus type promotion logic
  • Translate the untyped AST into a fully typed AST (Typed AST)
  • Validate function return types match declared signature
  • Error on use-before-declaration, undeclared symbols, and type mismatches
  • Unit-test: HM unification, type mappings, and ill-typed program diagnostics
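
The unification step with an occurs check can be sketched as follows. This covers only the unification core, not full Hindley-Milner inference (there is no generalization or instantiation), and the Ty variants and the Subst type are assumptions made for illustration, not the project's actual definitions.

```rust
use std::collections::HashMap;

/// Hypothetical type representation: two base types, type variables, functions.
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    Int,
    Bool,
    Var(u32),
    Fn(Vec<Ty>, Box<Ty>),
}

/// Substitution from type variables to types, built up during unification.
#[derive(Default)]
pub struct Subst {
    map: HashMap<u32, Ty>,
}

impl Subst {
    /// Replace every bound variable in `ty` by what it is bound to.
    pub fn resolve(&self, ty: &Ty) -> Ty {
        match ty {
            Ty::Var(v) => match self.map.get(v) {
                Some(bound) => self.resolve(bound),
                None => ty.clone(),
            },
            Ty::Fn(params, ret) => Ty::Fn(
                params.iter().map(|p| self.resolve(p)).collect(),
                Box::new(self.resolve(ret)),
            ),
            _ => ty.clone(),
        }
    }

    /// Occurs check: does variable `v` appear inside `ty`?
    fn occurs(&self, v: u32, ty: &Ty) -> bool {
        match self.resolve(ty) {
            Ty::Var(w) => v == w,
            Ty::Fn(params, ret) => {
                params.iter().any(|p| self.occurs(v, p)) || self.occurs(v, &ret)
            }
            _ => false,
        }
    }

    /// Make `a` and `b` equal by binding variables, or report a mismatch.
    pub fn unify(&mut self, a: &Ty, b: &Ty) -> Result<(), String> {
        match (self.resolve(a), self.resolve(b)) {
            (Ty::Int, Ty::Int) | (Ty::Bool, Ty::Bool) => Ok(()),
            (Ty::Var(x), Ty::Var(y)) if x == y => Ok(()),
            (Ty::Var(v), t) | (t, Ty::Var(v)) => {
                if self.occurs(v, &t) {
                    return Err(format!("infinite type: ?{} occurs in {:?}", v, t));
                }
                self.map.insert(v, t);
                Ok(())
            }
            (Ty::Fn(p1, r1), Ty::Fn(p2, r2)) if p1.len() == p2.len() => {
                for (x, y) in p1.iter().zip(p2.iter()) {
                    self.unify(x, y)?;
                }
                self.unify(&r1, &r2)
            }
            (x, y) => Err(format!("type mismatch: {:?} vs {:?}", x, y)),
        }
    }
}
```

Without the occurs check, unifying a variable with a function type that mentions the same variable would create a cyclic binding and make resolve loop forever, which is why the check runs before every insertion.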

Phase 4 - Code Generation via Cranelift

  • Integrate cranelift-codegen, cranelift-frontend, and cranelift-object
  • Implement CLI with clap (--emit-ir flag, input/output files)
  • Map Typed AST types (Ty) to Cranelift IR types
  • Lower functions, parameters, and variable definitions to Cranelift IR
  • Codegen for arithmetic, unary operations, and return statements
  • Enable Cranelift's built-in optimizations (e-graph-based mid-end: constant folding, DCE)
  • Output System V AMD64 ABI-compliant .o machine code files
  • End-to-end test: compile a simple fn → link via gcc → run → correct exit code
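
The end-to-end step (link via gcc, run, check the exit code) could be driven from a Rust test with std::process::Command, roughly as sketched here. The helper names link_command and link_and_run are hypothetical, and the sketch assumes gcc is on PATH and that the compiler has already written the object file.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Build the `gcc <obj> -o <exe>` invocation (hypothetical helper).
pub fn link_command(obj: &Path, exe: &Path) -> Command {
    let mut cmd = Command::new("gcc");
    cmd.arg(obj).arg("-o").arg(exe);
    cmd
}

/// Link the object file, run the resulting executable, and return its
/// exit code. The code is `None` if the process was killed by a signal.
pub fn link_and_run(obj: &Path, exe: &Path) -> io::Result<Option<i32>> {
    let linked = link_command(obj, exe).status()?;
    if !linked.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "gcc failed to link"));
    }
    // `exe` should include a directory component (e.g. "./out"); a bare
    // file name would be looked up on PATH instead of the current directory.
    Ok(Command::new(exe).status()?.code())
}
```

A test for a program whose main returns 42 would then assert that link_and_run yields Some(42). Note that on Unix only the low 8 bits of the return value survive as the exit code.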

Phase 5 - The Ray Tracer Milestone (Current)

Writing a simple ray tracer requires floating-point math, data structures, and I/O. The following steps establish these prerequisites:

  • Floating-Point Support: Add f32/f64 types, decimal literals, and Cranelift lowering for fadd, fmul, etc.
  • FFI & Interop: Implement extern fn declarations to bind C standard library functions (like putchar or printf) for .ppm image output.
  • Type Casting: Add the as operator to convert floating-point color values in [0.0, 1.0] to integer bytes in [0, 255].
  • Pointers: Add pointer types (*T), address-of (&), and dereference (*) operators.
  • Structs: Add struct definitions, initializers, and field access (ray.origin.x) to represent 3D vectors, rays, and spheres.
  • Arrays: Add fixed-size arrays ([T; N]) or heap allocations for the scene and framebuffers.
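
To make the target concrete, here is a small Rust reference for what the compiled language will eventually need to express: a struct for 3D values, a float-to-byte cast, and plain-text PPM (P3) output. It is written in the implementation language rather than the new language, and Vec3, to_byte, and render_ppm are illustrative names. The gradient stands in for real ray tracing.

```rust
/// A 3D vector, reused here as an RGB color with channels in [0.0, 1.0].
#[derive(Debug, Clone, Copy)]
pub struct Vec3 {
    pub x: f64,
    pub y: f64,
    pub z: f64,
}

/// Map a color channel in [0.0, 1.0] to a byte in [0, 255].
/// Out-of-range inputs are clamped before the cast.
pub fn to_byte(c: f64) -> u8 {
    (c.clamp(0.0, 1.0) * 255.0).round() as u8
}

/// Render a simple gradient as a plain-text PPM (P3) image.
pub fn render_ppm(width: usize, height: usize) -> String {
    let mut out = format!("P3\n{width} {height}\n255\n");
    // Avoid dividing by zero for 1-pixel-wide or 1-pixel-tall images.
    let max_x = width.saturating_sub(1).max(1) as f64;
    let max_y = height.saturating_sub(1).max(1) as f64;
    for y in 0..height {
        for x in 0..width {
            let color = Vec3 {
                x: x as f64 / max_x,
                y: y as f64 / max_y,
                z: 0.25,
            };
            out.push_str(&format!(
                "{} {} {}\n",
                to_byte(color.x),
                to_byte(color.y),
                to_byte(color.z)
            ));
        }
    }
    out
}
```

Each item in the list above maps onto a line of this sketch: f64 arithmetic, the as cast, struct field access, and output that the new language would perform through extern C functions such as putchar or printf.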

Planned Features (Backlog)

Control flow & variables

  • Booleans and comparison operators
  • if / else branching
  • while loops, break, continue
  • let bindings and variable assignments

Types & memory

  • Opaque pointers (*void / *opaque)
  • Raw pointer arithmetic
  • Slices (&[T] / []T)

Composite types

  • Enums (C-style tagged unions)
  • Pattern matching (match / switch)

Strings & interop

  • String literals
  • Variadic functions (for printf interop)

Tooling & backend

  • Proper register allocator
  • Debug info (DWARF)
  • Standard library bootstrap (print, malloc wrapper)