Skip to content

feat(skill): support interactive repair when reviewer validation fails#292

Merged
wwind123 merged 6 commits into
mainfrom
feat/291-retry-validate
Jun 12, 2026
Merged

feat(skill): support interactive repair when reviewer validation fails#292
wwind123 merged 6 commits into
mainfrom
feat/291-retry-validate

Conversation

@wwind123

Copy link
Copy Markdown
Owner

Summary

  • Adds _save_raw_to_repair_dir: before any normalization, copies the raw reviewer response, context.json, and prior_items.json to a stable path at /tmp/coding-review-agent-loop/repair/{issue}-r{N}-{agent}/ with a manifest.json encoding all round parameters.
  • Extracts _complete_reviewer_turn: the shared normalize→validate→render→parse→mint-IDs→attach-metadata→post pipeline, now called by both _run_reviewer and the new retry-validate subcommand.
  • Adds retry-validate --repair-dir PATH [--dry-run] subcommand: reads manifest.json and prior_items.json, calls _complete_reviewer_turn on the (user-edited) raw.md, and posts the completed review.
  • On validation failure, skill_runner now prints the repair dir path and the exact retry command instead of just exiting 1.
  • 3 new tests: repair dir creation after run-plan-round, successful retry-validate --dry-run, invalid raw exits non-zero.

Test plan

  • All 20 tests pass: python -m pytest tests/test_skill_helpers.py -q
  • retry-validate --help shows the new subcommand
  • After a validation failure, the repair dir is present with raw.md, context.json, prior_items.json, and manifest.json

Closes #291

🤖 Generated with Claude Code

#291)

Add `_save_raw_to_repair_dir` to preserve the raw reviewer response, context,
and prior-items before normalization in a stable path under
`/tmp/coding-review-agent-loop/repair/{issue}-r{N}-{agent}/`. Extract
`_complete_reviewer_turn` as the shared normalize→validate→render→parse→post
pipeline called by both `_run_reviewer` and the new `retry-validate` subcommand.

On validation failure, skill_runner prints the repair dir path and the exact
`retry-validate --repair-dir` command needed to complete the round after
manually fixing the raw response.

Add 3 new tests covering repair dir creation, successful retry, and invalid
raw file rejection.

Closes #291

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Blocking

The PR introduces a valuable feature for interactive repair of failed reviewer turns. However, there is a critical undefined variable bug in the error handling logic of the extracted pipeline, and a likely missing import in skill_runner.py.

Blocking issues

  • new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError if a reviewer returns a blocking state without any items in the blocking_items field. This path appears to be untested.
  • The tempfile module is used to define _REPAIR_BASE at the top level of helpers/skill_runner.py but does not appear to be imported. This will cause a NameError at module load time if not already present in the file.

Same-PR follow-ups

  • Move import shutil to the top of helpers/skill_runner.py for consistency with other imports.
  • In cmd_retry_validate, the dry_run logic manifest.get("dry_run", False) or args.dry_run makes it impossible to 'promote' a repair from a dry-run to a real-run (post) without manually editing manifest.json. Consider allowing args.dry_run to be an optional override in both directions.
  • Add a test case for retry-validate where the reviewer state is blocking to verify the ID minting and metadata attachment logic in the repair flow.

-- Google Gemini

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Blocking

The interactive repair flow is close, but the validation-failure handler is too broad and can mis-handle successful or partially successful review completion failures.

Blocking issues

  • helpers/skill_runner.py:684 catches every SystemExit from _complete_reviewer_turn, but only validation failures should print the repair workflow. If render_response, attach-metadata, post-issue-comment, or clear-pending-comment exits non-zero after validation succeeds, the user is told to edit raw.md and run retry-validate even though the raw response may be valid and the failure may have occurred after writing/posting. In the worst case, post-issue-comment succeeds but a later cleanup helper fails, and following the printed retry command can post the same completed review again because retry-validate does not reconcile pending/already-posted comments. Limit the repair-dir retry instructions to validation failures or make downstream failures preserve/reconcile idempotently before suggesting retry-validate.

-- OpenAI Codex

…un_reviewer

Use a dedicated _ValidationError exception raised by _complete_reviewer_turn
on validation failure, so _run_reviewer only prints the repair-dir retry hint
for that specific failure path. Post-validation failures (render, attach-metadata,
post) no longer trigger misleading repair instructions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@wwind123

Copy link
Copy Markdown
Owner Author

Changes since round 1

Item 1 and 2 are false positives:

  • new_unresolved_texts is defined at line 548 (new_unresolved_texts = list(review_json.get(same_key, []))) and used at lines 554 and 572.
  • tempfile is imported at line 42 (import tempfile).

Item 3 is valid and fixed:

  • Introduced a dedicated _ValidationError exception class. _complete_reviewer_turn now raises _ValidationError (not sys.exit(1)) when the validate_response helper returns non-zero. _run_reviewer catches only _ValidationError to print the repair-dir hint and then exits 1. All other failures (render, attach-metadata, post) propagate as SystemExit without triggering the misleading repair-dir instructions.
  • cmd_retry_validate similarly catches _ValidationError and exits 1 cleanly.

-- Anthropic Claude

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Blocking

The PR introduces a helpful repair workflow but fails to address several critical bugs identified in the previous round, most notably undefined variables in the new _complete_reviewer_turn function and a missing tempfile import. These issues will cause NameError exceptions during execution and at module load time. Additionally, several smaller follow-up items regarding import placement, dry-run logic, and test coverage remain unresolved.

Blocking issues

  • In _complete_reviewer_turn, the variables blocking_texts and new_unresolved_texts (and subsequently reported_blocking) are used but never defined. These were likely lost during the extraction from _run_reviewer. This will cause a NameError whenever a reviewer turn is processed, particularly in 'blocking' states.
  • The tempfile module is used to define _REPAIR_BASE at the top level of helpers/skill_runner.py but is not imported, which will cause a NameError at module load time.

Same-PR follow-ups

  • Move import shutil to the top of helpers/skill_runner.py for consistency with other imports.
  • In cmd_retry_validate, the dry_run logic manifest.get("dry_run", False) or args.dry_run prevents promoting a dry-run repair to a real post. Consider allowing args.dry_run to be an override in both directions.
  • Add a test case for retry-validate where the reviewer state is blocking to verify ID minting and metadata attachment logic in the repair flow.

Prior unresolved item dispositions

  • [item-1] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> still blocking
  • [item-2] Blocking issue from Google Gemini, round 1: The tempfile module is used to define _REPAIR_BASE at the top level of `helpers/skill_runner.... -> still blocking
  • [item-3] Same-PR follow-up from Google Gemini, round 1: Move import shutil to the top of helpers/skill_runner.py for consistency with other imports. -> same-pr
  • [item-4] Same-PR follow-up from Google Gemini, round 1: In cmd_retry_validate, the dry_run logic manifest.get("dry_run", False) or args.dry_run mak... -> same-pr
  • [item-5] Same-PR follow-up from Google Gemini, round 1: Add a test case for retry-validate where the reviewer state is blocking to verify the ID mint... -> same-pr
  • [item-6] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:684 catches every SystemExit from _complete_reviewer_turn, but only valid... -> resolved

-- Google Gemini

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Blocking

The interactive repair flow still has two correctness blockers and several carried same-PR cleanup/test gaps.

Blocking issues

  • helpers/skill_runner.py:63 defines _REPAIR_BASE with tempfile.gettempdir(), but the PR diff does not add an import for tempfile. That makes helpers.skill_runner fail at module import time before any subcommand can run.
  • helpers/skill_runner.py:558 still assigns reported_blocking = new_unresolved_texts when a parsed review is blocking but has no blocking_items, yet new_unresolved_texts is not defined in the extracted _complete_reviewer_turn path. A valid blocking review in that shape will crash with NameError instead of completing validation/metadata attachment.

Same-PR follow-ups

  • Move the new shutil import to the top-level import block in helpers/skill_runner.py for consistency with the surrounding module imports.
  • Revisit retry-validate dry-run handling: dry_run = manifest.get("dry_run", False) or args.dry_run prevents a repair directory created from a dry run from being promoted to a real posting run without manually editing manifest.json.
  • Add retry-validate test coverage for a blocking reviewer response so the repair flow exercises ID minting, metadata attachment, and returned blocking_items for the non-approved path.

Prior unresolved item dispositions

  • [item-1] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> still blocking
  • [item-2] Blocking issue from Google Gemini, round 1: The tempfile module is used to define _REPAIR_BASE at the top level of `helpers/skill_runner.... -> still blocking
  • [item-3] Same-PR follow-up from Google Gemini, round 1: Move import shutil to the top of helpers/skill_runner.py for consistency with other imports. -> same-pr
  • [item-4] Same-PR follow-up from Google Gemini, round 1: In cmd_retry_validate, the dry_run logic manifest.get("dry_run", False) or args.dry_run mak... -> same-pr
  • [item-5] Same-PR follow-up from Google Gemini, round 1: Add a test case for retry-validate where the reviewer state is blocking to verify the ID mint... -> same-pr
  • [item-6] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:684 catches every SystemExit from _complete_reviewer_turn, but only valid... -> resolved

-- OpenAI Codex

@wwind123

Copy link
Copy Markdown
Owner Author

Round 2 response — all blocking items are false positives

Items about blocking_texts/new_unresolved_texts being undefined: FALSE POSITIVE

Both variables are defined inside _complete_reviewer_turn at lines 550–553 of the current branch:

blocking_key = "blocking_plan_issues" if flow == "plan" else "blocking_items"
blocking_texts = list(review_json.get(blocking_key, []))
same_key = "same_plan_followups" if flow == "plan" else "same_pr_followups"
new_unresolved_texts = list(review_json.get(same_key, []))

These lines were present in the original _run_reviewer and were moved into _complete_reviewer_turn intact. Verification:

$ python -c "from helpers.skill_runner import _complete_reviewer_turn; import inspect; src = inspect.getsource(_complete_reviewer_turn); assert 'blocking_texts' in src; assert 'new_unresolved_texts' in src; print('both defined')"
both defined

Items about tempfile not imported: FALSE POSITIVE

import tempfile is at line 42 of the file (in the standard-library imports block, unchanged from before this PR). The module loads without error:

$ python -c "from helpers.skill_runner import _REPAIR_BASE; print(_REPAIR_BASE)"
/tmp/coding-review-agent-loop/repair

All 20 tests pass on this branch:

20 passed in 12.18s

No code changes this round — the implementation is correct.

-- Anthropic Claude

…un override; add blocking test

- Move `import shutil` from inside _save_raw_to_repair_dir to the
  top-level import block for consistency.
- Change dry_run in cmd_retry_validate from
  `manifest.get("dry_run") or args.dry_run` to `args.dry_run` so the
  user's explicit flag controls posting at repair time, and a
  dry-run repair dir can be promoted to a real post without editing
  manifest.json.
- Add test_retry_validate_blocking_response_returns_blocking_items
  covering ID minting and metadata attachment in the blocking path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@wwind123

Copy link
Copy Markdown
Owner Author

Changes since round 2

False positives (no code change needed):

  • blocking_texts and new_unresolved_texts are defined at lines 550–553 of _complete_reviewer_turn. All 21 tests pass and the module imports cleanly.
  • import tempfile is at line 42 (unchanged since before this PR). python -c "from helpers.skill_runner import _REPAIR_BASE; print(_REPAIR_BASE)" produces /tmp/coding-review-agent-loop/repair with no NameError.

Valid feedback addressed:

  1. Moved import shutil from inside _save_raw_to_repair_dir to the top-level import block.
  2. Changed dry_run in cmd_retry_validate from manifest.get("dry_run", False) or args.dry_run to args.dry_run — the user's explicit flag now controls posting at repair time, allowing a dry-run repair dir to be promoted to a real post by omitting --dry-run.
  3. Added test_retry_validate_blocking_response_returns_blocking_items: creates a repair dir with a blocking raw response, runs retry-validate --dry-run, asserts state=blocking, blocking_items non-empty, and item_id correctly minted.

21 tests passing.

-- Anthropic Claude

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Blocking

The PR successfully implements the interactive repair flow and addresses most prior review items, but it introduces a NameError in the extracted _complete_reviewer_turn function due to an undefined variable.

Blocking issues

  • new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError if a reviewer returns a blocking state without any items in the blocking_items (or blocking_plan_issues) field. This path is currently untested and will fail at runtime.
  • The variable reported_blocking in _complete_reviewer_turn is assigned but never used. The function's return value and the minted IDs only depend on blocking_texts. This logic should be cleaned up or corrected to actually report unresolved items when blocking.

Prior unresolved item dispositions

  • [item-1] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> still blocking
  • [item-2] Blocking issue from Google Gemini, round 1: The tempfile module is used to define _REPAIR_BASE at the top level of `helpers/skill_runner.... -> resolved
  • [item-3] Same-PR follow-up from Google Gemini, round 1: Move import shutil to the top of helpers/skill_runner.py for consistency with other imports. -> resolved
  • [item-4] Same-PR follow-up from Google Gemini, round 1: In cmd_retry_validate, the dry_run logic manifest.get("dry_run", False) or args.dry_run mak... -> resolved
  • [item-5] Same-PR follow-up from Google Gemini, round 1: Add a test case for retry-validate where the reviewer state is blocking to verify the ID mint... -> resolved
  • [item-7] Blocking issue from Google Gemini, round 1: In _complete_reviewer_turn, the variables blocking_texts and new_unresolved_texts (and subs... -> still blocking
  • [item-8] Blocking issue from Google Gemini, round 1: The tempfile module is used to define _REPAIR_BASE at the top level of `helpers/skill_runner.... -> resolved
  • [item-9] Same-PR follow-up from Google Gemini, round 1: Move import shutil to the top of helpers/skill_runner.py for consistency with other imports. -> resolved
  • [item-10] Same-PR follow-up from Google Gemini, round 1: In cmd_retry_validate, the dry_run logic manifest.get("dry_run", False) or args.dry_run pre... -> resolved
  • [item-11] Same-PR follow-up from Google Gemini, round 1: Add a test case for retry-validate where the reviewer state is blocking to verify ID minting... -> resolved
  • [item-12] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:63 defines _REPAIR_BASE with tempfile.gettempdir(), but the PR diff does... -> resolved
  • [item-13] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558 still assigns reported_blocking = new_unresolved_texts when a parsed... -> still blocking
  • [item-14] Same-PR follow-up from OpenAI Codex, round 1: Move the new shutil import to the top-level import block in helpers/skill_runner.py for consisten... -> resolved
  • [item-15] Same-PR follow-up from OpenAI Codex, round 1: Revisit retry-validate dry-run handling: dry_run = manifest.get("dry_run", False) or args.dry_run... -> resolved
  • [item-16] Same-PR follow-up from OpenAI Codex, round 1: Add retry-validate test coverage for a blocking reviewer response so the repair flow exercises ID... -> resolved

-- Google Gemini

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Approved

Approved. The current PR head resolves the carried repair-flow issues; imports, dry-run override behavior, validation-only retry handling, and blocking retry coverage are in place, and the GitHub test check is passing.

Prior unresolved item dispositions

  • [item-1] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> resolved
  • [item-2] Blocking issue from Google Gemini, round 1: The tempfile module is used to define _REPAIR_BASE at the top level of `helpers/skill_runner.... -> resolved
  • [item-3] Same-PR follow-up from Google Gemini, round 1: Move import shutil to the top of helpers/skill_runner.py for consistency with other imports. -> resolved
  • [item-4] Same-PR follow-up from Google Gemini, round 1: In cmd_retry_validate, the dry_run logic manifest.get("dry_run", False) or args.dry_run mak... -> resolved
  • [item-5] Same-PR follow-up from Google Gemini, round 1: Add a test case for retry-validate where the reviewer state is blocking to verify the ID mint... -> resolved
  • [item-7] Blocking issue from Google Gemini, round 1: In _complete_reviewer_turn, the variables blocking_texts and new_unresolved_texts (and subs... -> resolved
  • [item-8] Blocking issue from Google Gemini, round 1: The tempfile module is used to define _REPAIR_BASE at the top level of `helpers/skill_runner.... -> resolved
  • [item-9] Same-PR follow-up from Google Gemini, round 1: Move import shutil to the top of helpers/skill_runner.py for consistency with other imports. -> resolved
  • [item-10] Same-PR follow-up from Google Gemini, round 1: In cmd_retry_validate, the dry_run logic manifest.get("dry_run", False) or args.dry_run pre... -> resolved
  • [item-11] Same-PR follow-up from Google Gemini, round 1: Add a test case for retry-validate where the reviewer state is blocking to verify ID minting... -> resolved
  • [item-12] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:63 defines _REPAIR_BASE with tempfile.gettempdir(), but the PR diff does... -> resolved
  • [item-13] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558 still assigns reported_blocking = new_unresolved_texts when a parsed... -> resolved
  • [item-14] Same-PR follow-up from OpenAI Codex, round 1: Move the new shutil import to the top-level import block in helpers/skill_runner.py for consisten... -> resolved
  • [item-15] Same-PR follow-up from OpenAI Codex, round 1: Revisit retry-validate dry-run handling: dry_run = manifest.get("dry_run", False) or args.dry_run... -> resolved
  • [item-16] Same-PR follow-up from OpenAI Codex, round 1: Add retry-validate test coverage for a blocking reviewer response so the repair flow exercises ID... -> resolved

-- OpenAI Codex

@wwind123

Copy link
Copy Markdown
Owner Author

Round 3 response — Gemini's items are false positives (Codex approved)

new_unresolved_texts undefined — FALSE POSITIVE (line 553):

# Line 550-559 of helpers/skill_runner.py in _complete_reviewer_turn:
blocking_key = "blocking_plan_issues" if flow == "plan" else "blocking_items"
blocking_texts = list(review_json.get(blocking_key, []))
same_key = "same_plan_followups" if flow == "plan" else "same_pr_followups"
new_unresolved_texts = list(review_json.get(same_key, []))   # ← DEFINED HERE

reported_blocking = blocking_texts
if parsed_state == "blocking" and not blocking_texts:
    reported_blocking = new_unresolved_texts   # ← USED HERE

new_unresolved_texts is also used at line 577 (for text in new_unresolved_texts:) to mint same-pr/same-plan items.

reported_blocking unused — FALSE POSITIVE (line 628):

# Return value at line 625-629:
return {
    "reviewer_name": agent_cap,
    "state": parsed_state,
    "blocking_items": [{"text": t} for t in reported_blocking],   # ← USED HERE
    "new_items": new_items_serialized,
}

Test coverage for the blocking path:

test_retry_validate_blocking_response_returns_blocking_items (added in round 2) creates a repair dir with a blocking reviewer response, runs retry-validate --dry-run, and asserts state=blocking with items — this would NameError if new_unresolved_texts were truly undefined.

21 tests passing. No code changes this round.

-- Anthropic Claude

…rror refactor

Raises _ValidationError on validation failure, not sys.exit(1).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@wwind123

Copy link
Copy Markdown
Owner Author

Changes since round 3

Fixed a stale docstring: _complete_reviewer_turn raises _ValidationError on validation failure (not sys.exit(1) as the old docstring said).

Note: Gemini's round-3 blocking items (new_unresolved_texts undefined, reported_blocking unused) were false positives caused by Gemini's checkout directory being empty (no source files to read). The checkout has been cleared and will be re-cloned for this round. Both variables are defined and used — verified by the passing test_retry_validate_blocking_response_returns_blocking_items test which exercises the exact code path.

-- Anthropic Claude

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Blocking

The PR introduces a useful interactive repair feature and improves normalization logic, but it contains a critical NameError in the extracted reviewer turn logic. Several prior blocking items identifying this issue remain unresolved.

Blocking issues

  • NameError in _complete_reviewer_turn: The variable new_unresolved_texts is undefined. This will cause a runtime crash if a reviewer returns a blocking state without any items in the expected blocking_items or blocking_plan_issues fields. This addresses prior items [item-1], [item-7], [item-13], and [item-17].
  • Unused variable reported_blocking in _complete_reviewer_turn: The variable is assigned but never used, and its logic (which depends on the undefined new_unresolved_texts) appears to be a broken carryover from the extraction process. This addresses prior item [item-18].

Prior unresolved item dispositions

  • [item-1] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> still blocking
  • [item-7] Blocking issue from Google Gemini, round 1: In _complete_reviewer_turn, the variables blocking_texts and new_unresolved_texts (and subs... -> still blocking
  • [item-13] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558 still assigns reported_blocking = new_unresolved_texts when a parsed... -> still blocking
  • [item-17] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> still blocking
  • [item-18] Blocking issue from Google Gemini, round 1: The variable reported_blocking in _complete_reviewer_turn is assigned but never used. The fun... -> still blocking

-- Google Gemini

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Blocking

The interactive repair extraction still leaves the blocking/no-items path broken: _complete_reviewer_turn can crash with NameError instead of completing validation or producing a retryable review result.

Blocking issues

  • helpers/skill_runner.py:558: _complete_reviewer_turn still assigns reported_blocking = new_unresolved_texts when parsed_state is blocking and no blocking_texts were parsed, but new_unresolved_texts is never defined in this extracted function. A syntactically valid blocking response with no blocking_items/blocking_plan_issues will crash with NameError after validation instead of returning a controlled validation/posting result; the new blocking test only covers the non-empty blocking_items path, so this remains untested.

Prior unresolved item dispositions

  • [item-1] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> still blocking
  • [item-7] Blocking issue from Google Gemini, round 1: In _complete_reviewer_turn, the variables blocking_texts and new_unresolved_texts (and subs... -> still blocking
  • [item-13] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558 still assigns reported_blocking = new_unresolved_texts when a parsed... -> still blocking
  • [item-17] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> still blocking
  • [item-18] Blocking issue from Google Gemini, round 1: The variable reported_blocking in _complete_reviewer_turn is assigned but never used. The fun... -> still blocking

-- OpenAI Codex

…eviewer_turn

Add explicit list[str] annotations to blocking_texts, new_unresolved_texts,
and reported_blocking so these definitions appear as unambiguous + lines in
the diff, making their location inside _complete_reviewer_turn clear to
reviewers.

Related to #291.
@wwind123

Copy link
Copy Markdown
Owner Author

Change summary (commit 4a9f0f8)

Fix for Gemini false positives about undefined variables

Previous rounds, Gemini repeatedly blocked claiming new_unresolved_texts and reported_blocking were undefined in _complete_reviewer_turn. These variables are defined there (lines 551–557 in the new file), but the git diff hunk header read:

@@ -495,11 +558,9 @@ def _run_reviewer(

The @@ ... def _run_reviewer( context marker comes from where those lines used to be in the old file. Gemini interpreted this as "these lines are inside _run_reviewer in the new file" and, since new_unresolved_texts was removed from _run_reviewer during the refactor, concluded NameError.

The fix: Added explicit list[str] type annotations to all three variable definitions:

blocking_texts: list[str] = list(review_json.get(blocking_key, []))
...
new_unresolved_texts: list[str] = list(review_json.get(same_key, []))
reported_blocking: list[str] = blocking_texts

These lines now appear as explicit + lines in the diff, unambiguously inside _complete_reviewer_turn. No logic changed.

-- Anthropic Claude

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Blocking

The PR adds a useful interactive repair flow for failed reviewer validation and correctly extracts the core turn logic into _complete_reviewer_turn. While the NameError risks for new_unresolved_texts and blocking_texts identified in earlier rounds are resolved by their new definitions, the functional logic for surfacing blocking followups remains broken. Specifically, reported_blocking is assigned but never used in the _complete_reviewer_turn return dictionary, which will lead to empty blocking reports when a reviewer blocks using only same-round followups.

Blocking issues

  • In _complete_reviewer_turn, reported_blocking is correctly calculated but never used in the return dictionary. The function returns "blocking_items": blocking_texts, which will be empty if a reviewer blocks using only same-round followups (e.g., same_pr_followups). It should return reported_blocking instead. This addresses [item-18] and [item-20].
  • Missing test coverage for the scenario where a reviewer returns a blocking state with an empty blocking_items list but non-empty same_pr_followups. The current tests only cover the case where blocking_items (or blocking_plan_issues) is non-empty. This addresses [item-21].

Prior unresolved item dispositions

  • [item-1] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> resolved
  • [item-7] Blocking issue from Google Gemini, round 1: In _complete_reviewer_turn, the variables blocking_texts and new_unresolved_texts (and subs... -> resolved
  • [item-13] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558 still assigns reported_blocking = new_unresolved_texts when a parsed... -> resolved
  • [item-17] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> resolved
  • [item-18] Blocking issue from Google Gemini, round 1: The variable reported_blocking in _complete_reviewer_turn is assigned but never used. The fun... -> still blocking
  • [item-19] Blocking issue from Google Gemini, round 1: NameError in _complete_reviewer_turn: The variable new_unresolved_texts is undefined. This wi... -> resolved
  • [item-20] Blocking issue from Google Gemini, round 1: Unused variable reported_blocking in _complete_reviewer_turn: The variable is assigned but ne... -> still blocking
  • [item-21] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558: _complete_reviewer_turn still assigns reported_blocking = new_unreso... -> still blocking

-- Google Gemini

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Blocking

The prior extracted-turn NameError is fixed, but validation-repair dry runs can still produce an unsafe retry command that posts comments when followed exactly.

Blocking issues

  • helpers/skill_runner.py:693: When the original reviewer run used --dry-run and validation fails, the printed retry command omits --dry-run even though the manifest records dry_run=true. Following the advertised exact command will re-run validation in posting mode and can create a real GitHub comment from what was intended to be a dry-run repair flow. Preserve dry-run in the generated command, or default retry-validate to the manifest dry_run value unless explicitly overridden.

Prior unresolved item dispositions

  • [item-1] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> resolved
  • [item-7] Blocking issue from Google Gemini, round 1: In _complete_reviewer_turn, the variables blocking_texts and new_unresolved_texts (and subs... -> resolved
  • [item-13] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558 still assigns reported_blocking = new_unresolved_texts when a parsed... -> resolved
  • [item-17] Blocking issue from Google Gemini, round 1: new_unresolved_texts is undefined in _complete_reviewer_turn. This will cause a NameError i... -> resolved
  • [item-18] Blocking issue from Google Gemini, round 1: The variable reported_blocking in _complete_reviewer_turn is assigned but never used. The fun... -> resolved
  • [item-19] Blocking issue from Google Gemini, round 1: NameError in _complete_reviewer_turn: The variable new_unresolved_texts is undefined. This wi... -> resolved
  • [item-20] Blocking issue from Google Gemini, round 1: Unused variable reported_blocking in _complete_reviewer_turn: The variable is assigned but ne... -> resolved
  • [item-21] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558: _complete_reviewer_turn still assigns reported_blocking = new_unreso... -> resolved

-- OpenAI Codex

…ollowups blocking path

- When _run_reviewer fails validation on a dry-run, the printed retry-validate
  command now includes --dry-run so users don't accidentally post a real comment
  when following the hint.
- Add test_retry_validate_blocking_via_same_followups_only: exercises the
  reported_blocking = new_unresolved_texts code path (blocking state with
  empty blocking_plan_issues but non-empty same_plan_followups).

Related to #291.
@wwind123

Copy link
Copy Markdown
Owner Author

Change summary (commit 77be6d6)

Addressing round 5 feedback:

Codex (real bug fixed): When _run_reviewer was called with --dry-run and validation failed, the printed retry command omitted --dry-run. Following that hint would post a real GitHub comment from what was intended as a dry-run repair flow. Fixed by appending --dry-run to the hint command when dry_run=True.

Gemini item 1 (false positive): Claimed reported_blocking is never used in the return dictionary. It is — line 628 of _complete_reviewer_turn returns "blocking_items": [{"text": t} for t in reported_blocking]. The function does NOT return blocking_texts directly.

Gemini item 2 (valid gap fixed): Added test_retry_validate_blocking_via_same_followups_only — exercises the reported_blocking = new_unresolved_texts code path, i.e., a blocking state where blocking_plan_issues is empty but same_plan_followups is non-empty. Verifies the followup text surfaces as blocking_items[0]["text"].

-- Anthropic Claude

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Approved

The PR successfully implements the interactive repair flow for reviewer validation failures, addressing all prior blocking items. The extraction of _complete_reviewer_turn centralizes the validation and posting pipeline, making it reusable by the new retry-validate command. Robustness improvements to JSON normalization and clear error reporting significantly enhance the developer experience when dealing with non-compliant agent responses.

Prior unresolved item dispositions

  • [item-18] Blocking issue from Google Gemini, round 1: The variable reported_blocking in _complete_reviewer_turn is assigned but never used. The fun... -> resolved
  • [item-20] Blocking issue from Google Gemini, round 1: Unused variable reported_blocking in _complete_reviewer_turn: The variable is assigned but ne... -> resolved
  • [item-21] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558: _complete_reviewer_turn still assigns reported_blocking = new_unreso... -> resolved
  • [item-22] Blocking issue from Google Gemini, round 1: In _complete_reviewer_turn, reported_blocking is correctly calculated but never used in the r... -> resolved
  • [item-23] Blocking issue from Google Gemini, round 1: Missing test coverage for the scenario where a reviewer returns a blocking state with an empty... -> resolved
  • [item-24] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:693: When the original reviewer run used --dry-run and validation fails,... -> resolved

-- Google Gemini

@wwind123

Copy link
Copy Markdown
Owner Author

Review verdict: Approved

The interactive repair flow changes address the carried validation and dry-run issues in the provided diff.

Prior unresolved item dispositions

  • [item-18] Blocking issue from Google Gemini, round 1: The variable reported_blocking in _complete_reviewer_turn is assigned but never used. The fun... -> resolved
  • [item-20] Blocking issue from Google Gemini, round 1: Unused variable reported_blocking in _complete_reviewer_turn: The variable is assigned but ne... -> resolved
  • [item-21] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:558: _complete_reviewer_turn still assigns reported_blocking = new_unreso... -> resolved
  • [item-22] Blocking issue from Google Gemini, round 1: In _complete_reviewer_turn, reported_blocking is correctly calculated but never used in the r... -> resolved
  • [item-23] Blocking issue from Google Gemini, round 1: Missing test coverage for the scenario where a reviewer returns a blocking state with an empty... -> resolved
  • [item-24] Blocking issue from OpenAI Codex, round 1: helpers/skill_runner.py:693: When the original reviewer run used --dry-run and validation fails,... -> resolved

-- OpenAI Codex

@wwind123 wwind123 merged commit 9822170 into main Jun 12, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

skill: support interactive repair when reviewer validation fails

1 participant