Problem
The dogfood pipeline identifies correct rejections (C4 category) — cases where the compiler rightfully rejects invalid code — but discards them. These are valuable negative test cases that could be automatically converted into .spy + .error test fixtures.
Current State
- ~95+ hand-crafted .error test fixtures exist in src/Sharpy.Compiler.Tests/Integration/TestFixtures/errors/
- The dogfood pipeline classifies correct rejections but doesn't persist them as test fixtures
- Example: skip_dunder_comparison_0004 correctly rejected __hash__ without __eq__(self, other: object). This pattern already has a hand-crafted test (dunder_hash_without_eq_object.spy), but many other correct rejections may cover untested error paths
Proposed Feature
When the dogfood pipeline classifies a failure as a correct rejection (C4), automatically:
- Extract the minimal .spy source that triggers the error
- Extract the expected error substring from the compiler output
- Write a .spy + .error pair to a staging directory (e.g., dogfood_output/generated_fixtures/)
- Optionally deduplicate against existing fixtures (check if the error code + pattern is already covered)
This would grow the negative test suite organically from real-world code patterns the AI generates.
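The write and deduplicate steps above could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names (is_covered, write_fixture_pair) are hypothetical, the .error file is assumed to hold just the expected error substring, and "already covered" is approximated by a plain substring match against existing .error files rather than a real error code + pattern comparison.

```python
from pathlib import Path
from typing import Optional


def is_covered(error_substring: str, existing_dir: Path) -> bool:
    """Return True if an existing .error fixture already contains the substring.

    Assumption: a plain substring match is a good enough proxy for
    "this error pattern is already tested".
    """
    needle = error_substring.strip()
    if not needle or not existing_dir.is_dir():
        return False
    for fixture in existing_dir.rglob("*.error"):
        if needle in fixture.read_text(encoding="utf-8"):
            return True
    return False


def write_fixture_pair(
    name: str,
    source: str,
    error_substring: str,
    staging_dir: Path,
    existing_dir: Optional[Path] = None,
) -> Optional[Path]:
    """Write <name>.spy and <name>.error into staging_dir.

    Returns the path of the .spy file, or None when the error is
    already covered by a fixture under existing_dir.
    """
    if not error_substring.strip():
        raise ValueError("expected error substring must not be empty")
    if existing_dir is not None and is_covered(error_substring, existing_dir):
        return None

    staging_dir.mkdir(parents=True, exist_ok=True)
    spy_path = staging_dir / f"{name}.spy"
    spy_path.write_text(source, encoding="utf-8")
    # Assumption: the .error file holds only the expected error substring.
    error_path = staging_dir / f"{name}.error"
    error_path.write_text(error_substring.strip() + "\n", encoding="utf-8")
    return spy_path
```

Writing to a staging directory rather than directly into TestFixtures/errors/ keeps a human review step before a generated pair becomes part of the test suite.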
Component
- Affected: Dogfood tooling (build_tools/sharpy_dogfood/)
- Priority: Low — nice-to-have for test coverage growth
Discovered
Dogfood analysis 2026-02-17