
docs(instructions): ground before skipping — a failing test/example is a signal#33

Merged
mobileskyfi merged 1 commit into main from docs/grounding-and-anti-skip-instructions on Jun 27, 2026

@mobileskyfi

Contributor

Why

In the #22 close-out I mistook a single Extended Verification run for a proven QEMU limitation ("snapshots don't work on aarch64") and "fixed" it by arch-gating the rollback example — without ever reproducing it locally, even though quickchr runs arm64 CHR under TCG on this Intel host. That masked a real signal: the example was doing its job (a canary catching what unit/integration tests miss) and I silenced it, then spread the unproven cause across DESIGN.md, API docs, BACKLOG, an issue, and nearly into a shared SKILL.

That masking PR is on hold (#32, draft); actually grounding the snapshot behavior is #31. This PR fixes the instructions so the class of error is harder to repeat.

What was missing

The anti-masking discipline already existed (testing.instructions.md: "Never increase a timeout as a first fix. It masks the root cause") — but it was siloed to test/**, framed only around timeout-bumps, and never named the most destructive masking move: skip / os-gate / arch-gate, which removes the signal permanently. Nor did anything require reproducing a claimed platform limitation locally before recording it.
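Why a gate removes the signal can be sketched with a toy runner. This is a minimal illustration only, not quickchr's or Bun's test API: `Case`, `runCase`, and `rollbackWorks` are hypothetical names, and the arm64-only failure is simulated.

```typescript
// Hypothetical mini-runner for illustration; not quickchr's or Bun's API.
type Outcome = "pass" | "fail" | "skipped";

interface Case {
  name: string;
  // Optional gate: returning false skips the case on that architecture.
  gate?: (arch: string) => boolean;
  // The actual check: true means the behavior is correct.
  run: (arch: string) => boolean;
}

function runCase(c: Case, arch: string): Outcome {
  // A gated case returns before run() is ever called, so nothing is observed.
  if (c.gate && !c.gate(arch)) return "skipped";
  return c.run(arch) ? "pass" : "fail";
}

// Simulated regression (an assumption for the sketch): broken on arm64 only.
const rollbackWorks = (arch: string): boolean => arch !== "arm64";

const ungated: Case = { name: "rollback example", run: rollbackWorks };
const archGated: Case = { ...ungated, gate: (arch: string) => arch !== "arm64" };

// The ungated canary still reports the failure.
const ungatedResult: Outcome = runCase(ungated, "arm64");
// The arch-gated one reports nothing, whether or not the bug is real.
const gatedResult: Outcome = runCase(archGated, "arm64");

console.log(ungatedResult, gatedResult);
```

In this sketch the gated case yields the same "skipped" outcome whether the underlying bug exists or not, which is the sense in which a gate removes the signal.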

Changes (instructions only — no code)

  • examples.instructions.md — new section: examples are canaries; a failing example is a quickchr bug until a local repro proves otherwise; never gate a failure as the first move; don't write an unproven cause into DESIGN/API docs/BACKLOG/issues/skills.
  • testing.instructions.md — new subsection beside the timeout rule: skip/gating is the worst masking move (a bumped timeout still runs the test; a skip deletes it — and filing an issue alongside doesn't make it safe); "platform limitation" is a hypothesis until reproduced locally (we boot arm64 under TCG here); one CI failure must not cascade.
  • ci.instructions.md — top-of-file principle: one CI run is a signal, not a fact — reproduce locally, don't let one red run cascade.

The three files cross-reference each other. bun run check is green (cspell, markdownlint).
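The "hypothesis until reproduced locally" rule in the bullets above can be read as a simple check before anything is written into docs. The sketch below is one possible encoding, not part of the instruction files: the `Finding` shape and the `status` and `mayRecordInDocs` helpers are hypothetical, and treating a local repro as the only promoting evidence is a simplifying assumption.

```typescript
// Hypothetical model for illustration; none of these names exist in quickchr.
interface Finding {
  claim: string;
  redCiRuns: number;          // how many CI runs showed the failure
  reproducedLocally: boolean; // did a local run reproduce it?
}

type Status = "hypothesis" | "grounded";

function status(f: Finding): Status {
  // Assumption of this sketch: red CI runs alone never promote a claim.
  return f.reproducedLocally ? "grounded" : "hypothesis";
}

function mayRecordInDocs(f: Finding): boolean {
  // Only grounded claims may go into DESIGN, API docs, BACKLOG, issues, or skills.
  return status(f) === "grounded";
}

// The claim from this PR's "Why" section, as it stood after one red CI run.
const snapshotClaim: Finding = {
  claim: "snapshots don't work on aarch64",
  redCiRuns: 1,
  reproducedLocally: false,
};

const allowed: boolean = mayRecordInDocs(snapshotClaim);
console.log(status(snapshotClaim), allowed);
```

With a single red run and no local repro, the claim stays a hypothesis and the check refuses to let it cascade into documentation.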

Companion changes (your call, per the scope question): a general agentic rule in ~/CLAUDE.md (not version-controlled — applied as a local edit), and a "don't record CI-only/unverified findings in a SKILL" rule in tikoci/routeros-skills (separate PR).

🤖 Generated with Claude Code

…e skipping

Codifies the discipline that was missing when a single Extended Verification run
got mistaken for a proven arm64 QEMU limitation and "fixed" by arch-gating the
rollback example (held in #32, real grounding tracked in #31).

The anti-masking rules already existed but were siloed in testing.instructions.md
and framed only around timeout-bumps; the most dangerous masking move — skip /
os-gate / arch-gate — was never named, and nothing required reproducing a claimed
platform limitation locally (which we can: arm64 CHR runs under TCG on Intel).

- examples.instructions.md: examples are canaries; a failing example is a quickchr
  bug until a LOCAL repro proves otherwise; never gate a failure as the first move;
  don't record an unproven cause in DESIGN/API docs/BACKLOG/issues/skills.
- testing.instructions.md: name skip/gating as the worst masking move (worse than a
  timeout-bump); "platform limitation" is a hypothesis until reproduced locally; one
  CI failure must not cascade into doc/skill edits.
- ci.instructions.md: one CI run is a signal, not a fact — reproduce locally, don't
  let one red run cascade.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 27, 2026 15:10
@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Contributor

Warning

Review limit reached

@mobileskyfi, we couldn't start this review because you've reached your PR review rate limit.

📥 Commits

Reviewing files that changed from the base of the PR and between 3f57daf and 914d284.

📒 Files selected for processing (3)
  • .github/instructions/ci.instructions.md
  • .github/instructions/examples.instructions.md
  • .github/instructions/testing.instructions.md

Copilot AI left a comment


Pull request overview

Updates the repository’s contribution instructions to treat failing tests/examples and red CI runs as signals to investigate, explicitly discouraging “masking” fixes (skip / OS-gating / arch-gating) until a local reproduction confirms an unfixable platform limitation.

Changes:

  • Adds a new “failing test is a signal” subsection to testing instructions, extending the existing “don’t bump timeouts first” discipline to skips/gates.
  • Adds an “examples are canaries” section to examples instructions, framing example failures as potential quickchr bugs until locally reproduced otherwise.
  • Adds a top-level CI principle: a single red run is a signal, and should not trigger cascading code/doc/skill edits or masking gates without grounding.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| .github/instructions/testing.instructions.md | Extends anti-masking guidance from timeout bumps to skips/gates; emphasizes local repro before calling something a platform limit. |
| .github/instructions/examples.instructions.md | Establishes examples as canaries; forbids skip/os/arch gating as a first reaction to failures. |
| .github/instructions/ci.instructions.md | Adds a top-of-file principle to treat one CI run as a signal and avoid cascaded, ungrounded changes. |

@mobileskyfi merged commit 2eb5866 into main on Jun 27, 2026
10 checks passed