Dogfood: Add deterministic post-processing repair pass #365

@antonsynd

Summary

The dogfood pipeline currently relies entirely on LLM prompting to generate valid Sharpy code. Despite extensive prompt engineering, ~30% of failures are mechanical mistakes by the LLM (wrong decorator syntax, missing self, instance fields in static classes, etc.). Prompting has hit diminishing returns: the LLM "knows" the rules but applies them inconsistently.

Proposal: Add a deterministic repair pass between LLM generation and compilation that mechanically fixes known error patterns using the Sharpy parser itself.

Design

Pipeline change

Current:  LLM generates → compile → run → compare output
Proposed: LLM generates → PARSE WITH SHARPY → REPAIR AST → compile → run → compare output

Repair rules (ordered by dogfood impact)

| Rule | Pattern | Fix | Dogfood items |
|------|---------|-----|---------------|
| Decorator line split | `@abstract class X:` on one line | Split to `@abstract\nclass X:` | 5+ |
| Python decorator rename | `@staticmethod` | Replace with `@static` | 1+ |
| Static class field fix | Plain field in `@static` class | Add `@static` decorator | 4 |
| Missing self on property | `property get foo() -> T:` | Add `self` parameter | 1+ |
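
As a rough illustration of how three of these rules could be written as regex transforms, here is a minimal sketch. The function names and the `repair` driver are hypothetical, and the Sharpy syntax is inferred only from the patterns in the table, not from the language reference. The static class field fix is left out because it needs to know which class a field belongs to, which is more naturally answered from the parsed AST than from a line-level regex.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical repair rules: the Sharpy syntax matched here is inferred
# from the issue's table, not from the language reference.

def split_inline_decorator(src: str) -> str:
    """Move `@decorator class X:` onto two lines, keeping the indentation."""
    return re.sub(
        r"^([ \t]*)(@\w+)[ \t]+(class\b)",
        r"\1\2\n\1\3",
        src,
        flags=re.MULTILINE,
    )

def rename_staticmethod(src: str) -> str:
    """Replace the Python spelling `@staticmethod` with `@static`."""
    return re.sub(
        r"^([ \t]*)@staticmethod\b", r"\1@static", src, flags=re.MULTILINE
    )

def add_self_to_property(src: str) -> str:
    """Insert `self` into a property getter declared with no parameters."""
    return re.sub(
        r"^([ \t]*property[ \t]+get[ \t]+\w+)\(\s*\)",
        r"\1(self)",
        src,
        flags=re.MULTILINE,
    )

REPAIRS: List[Callable[[str], str]] = [
    split_inline_decorator,
    rename_staticmethod,
    add_self_to_property,
]

def repair(src: str) -> Tuple[str, List[str]]:
    """Apply every rule in order; return the new source and the rules that fired."""
    applied: List[str] = []
    for rule in REPAIRS:
        new_src = rule(src)
        if new_src != src:
            applied.append(rule.__name__)
            src = new_src
    return src, applied
```

Returning the names of the rules that fired makes it possible to log which repairs each dogfood item needed, which in turn shows whether the impact ordering in the table holds up in practice.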

Expected output validation

For C2 failures (wrong expected output), run the generated code through Python first where semantics overlap. Use Python's stdout as the expected output instead of trusting the LLM's prediction. This eliminates float formatting mismatches (42.0 vs 42) and computation errors.
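
A minimal sketch of such an oracle follows, assuming the generated program is also valid Python (the "semantics overlap" case). The function name `python_expected_output` is hypothetical, and `sys.executable` stands in for the `python3` binary. When Python rejects the source or the run fails, the function returns None so the caller can fall back to the LLM's predicted output.

```python
import subprocess
import sys
from typing import Optional

def python_expected_output(source: str, timeout_s: float = 10.0) -> Optional[str]:
    """Run `source` under Python and return its stdout, or None on any failure.

    A None result means the program is not usable as an oracle, for example
    because it uses Sharpy-only syntax that Python cannot parse.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-"],  # "-" makes Python read the script from stdin
            input=source,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None
    return result.stdout
```

For example, `python_expected_output("print(84 / 2)")` yields `42.0` followed by a newline, which is the kind of float formatting detail an LLM prediction tends to get wrong.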

Early skip

If the generated code fails to parse even after repair, skip immediately instead of wasting 3 retry attempts.
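
The control flow could look roughly like the sketch below. All names are hypothetical: `repair`, `parses`, and `try_once` are injected callables standing in for the repair pass, the Sharpy parse check, and a single compile/run/compare attempt (which in the real pipeline may also re-prompt the LLM).

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Outcome:
    status: str    # "skipped", "passed", or "failed"
    attempts: int  # compile/run attempts actually spent

def run_item(
    source: str,
    repair: Callable[[str], str],
    parses: Callable[[str], bool],
    try_once: Callable[[str], bool],
    max_attempts: int = 3,
) -> Outcome:
    """Repair once, then skip without retries if the result still does not parse."""
    repaired = repair(source)
    if not parses(repaired):
        # Early skip: no retry budget is spent on unparseable code.
        return Outcome("skipped", 0)
    for attempt in range(1, max_attempts + 1):
        if try_once(repaired):
            return Outcome("passed", attempt)
    return Outcome("failed", max_attempts)
```

Keeping "skipped" distinct from "failed" lets the pipeline report unparseable generations separately from items that compiled but produced the wrong result.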

Impact

  • Eliminates ~60% of C1 (prompting) failures mechanically
  • Eliminates most C2 (output mismatch) failures via Python oracle
  • Improves signal-to-noise ratio: the share of dogfood failures that are real compiler bugs rises from ~30% to ~60%+
  • Meta-dogfooding: uses the Sharpy parser as part of the pipeline

Implementation

  • Location: build_tools/ Python pipeline
  • Can call sharpyc emit ast to parse and detect issues
  • Repair rules are simple string/regex transforms on known patterns
  • Python oracle is a subprocess call to python3
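
The parse check mentioned above might be wrapped as in the following sketch. It rests on several assumptions not confirmed by this issue: that the parser command accepts a source file path as its last argument, that it exits non-zero on a parse failure, and that `.spy` is an acceptable file suffix (a placeholder). The command is passed in as a list, for example `["sharpyc", "emit", "ast"]`, so the exact invocation stays configurable.

```python
import os
import subprocess
import tempfile
from typing import Sequence

def parses_with(
    cmd: Sequence[str],
    source: str,
    suffix: str = ".spy",  # placeholder suffix, not a confirmed Sharpy extension
    timeout_s: float = 30.0,
) -> bool:
    """Write `source` to a temp file and report whether `cmd <file>` exits with 0.

    Assumes the command takes the file path as its final argument and signals
    a parse failure through a non-zero exit code.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(source)
        try:
            result = subprocess.run(
                [*cmd, path],
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except (OSError, subprocess.TimeoutExpired):
            # Missing binary or a hung parser both count as "did not parse".
            return False
        return result.returncode == 0
    finally:
        os.unlink(path)
```

Treating a missing binary as a parse failure keeps the pipeline from crashing, though a real implementation would probably want to surface that case as a configuration error instead.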

Discovered via

Dogfood analysis session 2026-03-10 — observed that prompting has hit diminishing returns for code generation quality.
