Design: per-file symbol namespaces for parallel compilation #610

@antonsynd

Background

The Sharpy compiler currently processes all files sequentially, sharing a single SymbolTable instance, which prevents parallelizing compilation across files. This item comes from the 2026-04-27 compiler health audit (W2).

Current State

SymbolTable threading issues

  • SymbolTable uses non-concurrent Dictionary<string, Scope> for _moduleScopes and Stack<Scope> for _scopeStack
  • ProjectCompiler creates one _symbolTableBacking instance shared across all files
  • Comment at ProjectCompiler.cs:25-27 documents the sequential constraint

Mutation phases

Shared SymbolTable is mutated during:

  1. Phase 3 — Name resolution: NameResolver.ResolveDeclarations() adds symbols to global/module scopes
  2. Phase 4 — Import resolution: ImportResolver registers imported symbols, requires cross-file visibility
  3. Phase 4b — Inheritance resolution: NameResolver.ResolveInheritance() sets BaseType/Interfaces on TypeSymbols
  4. Phase 5 — Type checking: TypeChecker.CheckModule() sets Type on VariableSymbol, records SemanticInfo

Key constraint

Import resolution (Phase 4) requires cross-file visibility: file A must see symbols from file B to resolve "from B import X". This creates an ordering dependency that prevents naive per-file parallelism.

Proposed Approach

Per-file SymbolTable with merge

  1. Phase 3 (parallel): Each file gets its own SymbolTable. NameResolver populates file-local scopes independently.
  2. Merge point: After all files complete Phase 3, merge per-file symbol tables into a shared read-only GlobalSymbolTable. This is the synchronization barrier.
  3. Phase 4 (sequential or parallel with concurrent reads): Import resolution reads from the merged GlobalSymbolTable. Could be parallelized if imports only read from global and write to file-local scope.
  4. Phase 4b (sequential): Inheritance resolution modifies TypeSymbol properties — likely must stay sequential unless symbols are immutable-until-frozen.
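  5. Phase 5 (parallel): Type checking reads from global symbols and writes to per-file SemanticInfo. Because SemanticInfo is already per-file, this phase could run in parallel, provided the Type that TypeChecker sets on each VariableSymbol is only written for symbols owned by the file being checked.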
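
The first three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual Sharpy code: Symbol, FileSymbolTable, GlobalSymbolTable, and the declarations/imports dictionaries are hypothetical stand-ins, and the real SymbolTable carries scopes rather than a flat name map. The sketch only shows the shape of the idea: file-local writes in parallel, a merge that keeps the same Symbol instances, then concurrent reads from the merged table.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical input: module name -> top-level names it declares.
var declarations = new Dictionary<string, string[]>
{
    ["a"] = new[] { "Foo" },
    ["b"] = new[] { "Bar" },
};
// Hypothetical imports: module "a" contains "from b import Bar".
var imports = new Dictionary<string, (string, string)[]>
{
    ["a"] = new[] { ("b", "Bar") },
    ["b"] = Array.Empty<(string, string)>(),
};

// Phase 3 (parallel): each file fills its own table; nothing is shared.
FileSymbolTable[] perFile = declarations
    .AsParallel()
    .Select(kv =>
    {
        var table = new FileSymbolTable(kv.Key);
        foreach (var name in kv.Value) table.Declare(name);
        return table;
    })
    .ToArray(); // acts as the barrier: every file is done before the merge

// Merge point: build the read-only global view.
GlobalSymbolTable global = GlobalSymbolTable.Merge(perFile);

// Phase 4: reads go to the merged table, writes stay in the file's own table.
Parallel.ForEach(perFile, table =>
{
    foreach (var (fromModule, name) in imports[table.Module])
        if (global.TryLookup(fromModule, name, out var symbol))
            table.Import(symbol);
});

var a = perFile.Single(t => t.Module == "a");
global.TryLookup("b", "Bar", out var bar);
Console.WriteLine(ReferenceEquals(a.Imports["Bar"], bar)); // True

public sealed class Symbol
{
    public Symbol(string module, string name) { Module = module; Name = name; }
    public string Module { get; }
    public string Name { get; }
}

public sealed class FileSymbolTable
{
    private readonly Dictionary<string, Symbol> _declared = new();
    private readonly Dictionary<string, Symbol> _imports = new();
    public FileSymbolTable(string module) => Module = module;
    public string Module { get; }
    public IReadOnlyDictionary<string, Symbol> Declared => _declared;
    public IReadOnlyDictionary<string, Symbol> Imports => _imports;
    public void Declare(string name) => _declared.Add(name, new Symbol(Module, name));
    public void Import(Symbol symbol) => _imports[symbol.Name] = symbol;
}

public sealed class GlobalSymbolTable
{
    private readonly Dictionary<(string, string), Symbol> _symbols;
    private GlobalSymbolTable(Dictionary<(string, string), Symbol> symbols) => _symbols = symbols;

    public static GlobalSymbolTable Merge(IEnumerable<FileSymbolTable> tables)
    {
        var merged = new Dictionary<(string, string), Symbol>();
        foreach (var table in tables)
            foreach (var (name, symbol) in table.Declared)
                merged.Add((table.Module, name), symbol); // same instance, not a copy
        return new GlobalSymbolTable(merged);
    }

    // Never mutated after Merge, so concurrent lookups are safe.
    public bool TryLookup(string module, string name, [NotNullWhen(true)] out Symbol? symbol) =>
        _symbols.TryGetValue((module, name), out symbol);
}
```

The merged dictionary is never written after Merge returns, which is what makes the concurrent lookups in the import step safe without locks.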

Alternative: Concurrent SymbolTable

Replace Dictionary/Stack with ConcurrentDictionary/ConcurrentStack. This is simpler to implement, but it introduces lock contention and invites subtle ordering bugs.

Recommendation: Per-file + merge approach. More work upfront but cleaner threading model and easier to reason about correctness.
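
One kind of ordering bug the concurrent alternative invites can be shown without any real threads. The sketch below writes out one possible interleaving of two file traversals by hand, so the result is deterministic. The scope names are made up for illustration, and strings stand in for Scope objects.

```csharp
using System;
using System.Collections.Concurrent;

// A single stack shared by two traversals. Every individual call is thread-safe.
var shared = new ConcurrentStack<string>();

shared.Push("scope:a.Foo"); // traversal of file a enters Foo
shared.Push("scope:b.Bar"); // traversal of file b enters Bar before a looks again

// File a now asks for its current scope and gets file b's scope instead.
shared.TryPeek(out var currentForA);
Console.WriteLine(currentForA); // scope:b.Bar
```

Each operation is atomic, but the "current scope" is only meaningful relative to one traversal, so making the container thread-safe does not make the shared stack correct.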

Key Design Decisions Needed

  1. Scope stack: The _scopeStack is inherently sequential (push/pop during tree traversal). Each parallel file needs its own stack — this argues for per-file SymbolTable instances.
  2. Symbol identity: Symbols use reference equality. After merge, all files must reference the same Symbol instances for shared types. The merge must canonicalize, not copy.
  3. Incremental compilation interaction: The IncrementalCompilationCache serializes symbols per-file. Per-file symbol tables align naturally with this.
  4. Error accumulation: DiagnosticBag is not thread-safe. Each file needs its own bag, merged after completion.
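
For the error accumulation point, one possible shape is sketched below, with a plain List<string> standing in for DiagnosticBag. The file names, the diagnostic text, and the choice to merge in sorted file order are all assumptions for illustration. Sorting is just one way to keep output independent of thread scheduling.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

string[] files = { "c", "a", "b" };          // hypothetical file names
var bags = new List<string>[files.Length];   // one bag slot per file

Parallel.For(0, files.Length, i =>
{
    var bag = new List<string>();            // touched by this iteration only
    bag.Add($"{files[i]}: example diagnostic");
    bags[i] = bag;                           // each iteration writes its own slot
});

// Merge after all files complete, in a fixed order.
List<string> merged = files
    .Select((file, i) => (file, bag: bags[i]))
    .OrderBy(x => x.file, StringComparer.Ordinal)
    .SelectMany(x => x.bag)
    .ToList();

Console.WriteLine(string.Join(" | ", merged));
// a: example diagnostic | b: example diagnostic | c: example diagnostic
```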

Dependencies

  • No blocking dependencies
  • Incremental compilation cache format (SymbolSerializer) may need schema version bump
  • LSP LanguageService assumes sequential compilation — needs update

Estimated Effort

3-4 weeks for a senior engineer:

  • Week 1: Per-file SymbolTable refactor + merge logic
  • Week 2: Import resolution with cross-file reads
  • Week 3: Parallel dispatch + DiagnosticBag per-file
  • Week 4: Testing, incremental cache update, LSP integration

Out of Scope

  • Actual Parallel.ForEach / Task.WhenAll dispatch (implementation)
  • Benchmark harness for measuring speedup
  • Thread-safe DiagnosticBag implementation
