Add wingman automated code review system #302

Draft
dipree wants to merge 35 commits into main from dipree/entire-wingman

Conversation

Contributor

@dipree commented Feb 12, 2026

Summary

Adds wingman, an automated code review system that reviews agent-produced code changes after each commit and delivers actionable suggestions back to the agent. The system prioritizes visible delivery — the user sees the agent reading and applying the review in their terminal.

How It Works

Review Generation

After each commit (or auto-commit checkpoint), wingman spawns a detached background process that:

  1. Waits 10 seconds for the agent turn to settle
  2. Computes a branch-level diff against the merge base with main/master (a holistic view, not just one commit)
  3. Loads session context (prompts, commit message, checkpoint metadata)
  4. Calls Claude (sonnet) with read-only tool access (Read, Glob, Grep) for an intent-aware review
  5. Writes suggestions to .entire/REVIEW.md

Review Delivery (Two Paths)

Primary — Visible injection (when a live session exists):

Commit → wingman review → REVIEW.md written → live session detected → defer

User sends next prompt → UserPromptSubmit hook
  → REVIEW.md exists → inject as additionalContext
  → Agent reads REVIEW.md, applies suggestions, deletes file
  → User sees everything in their terminal
  → Agent then proceeds with user's actual request

The additionalContext hook response field adds the instruction directly into Claude's context as a mandatory pre-step (stronger than systemMessage which the agent can deprioritize).

Fallback — Background auto-apply (when no sessions are alive):

Review finishes → no live sessions → claude --continue --print
  → Applies review silently in background

This only fires when all sessions are ENDED (user closed everything). Three trigger points cover all scenarios:

  • Review process itself (review finishes, no live sessions)
  • SessionEnd hook (last session closes with pending review)
  • Stop hook (turn ends with no live sessions — edge case)

Decision Flow

              REVIEW.md written
                    │
                    ▼
          ┌─────────────────┐
          │ Any live session │
          │   exists?        │
          └────────┬────────┘
              │          │
             Yes         No
              │          │
              ▼          ▼
        Defer to     Background
        next prompt  auto-apply
        (visible)    (invisible)

Key Behaviors

Stale Review Cleanup

  • REVIEW.md from a different session → deleted (stale)
  • REVIEW.md without state file → deleted (orphan)
  • REVIEW.md older than 1 hour → deleted (TTL expired)
  • REVIEW.md from current session + fresh → kept (skip new review)

Concurrency & Safety

  • Atomic lock file (O_CREATE|O_EXCL) prevents concurrent review spawns; stale locks (>30m) auto-cleaned
  • File hash dedup prevents re-reviewing identical change sets within the same session
  • ENTIRE_WINGMAN_APPLY=1 env var prevents post-commit hook from triggering another review during auto-apply (recursion prevention)
  • --setting-sources "" disables hooks on review/apply subprocesses
  • GIT_* env stripping prevents git index corruption in subprocesses
  • ApplyAttemptedAt field prevents infinite auto-apply retry loops

Session Detection

  • hasAnyLiveSession() scans .git/entire-sessions/ for any non-ENDED session
  • isSessionIdle() reads session state files directly using repoRoot (not git rev-parse, which fails in detached processes running from cwd=/)
  • Handles both normal repos and git worktrees (parses .git file for gitdir: pointer)

Strategy Integration

  • Manual-commit: Review triggered from git post-commit hook via triggerWingmanFromCommit()
  • Auto-commit: Review triggered from stop hook after SaveChanges() creates the commit

Files

| File | Purpose |
| --- | --- |
| wingman.go | State management, trigger logic, dedup, lock files, stale cleanup, CLI commands |
| wingman_review.go | Detached review subprocess, diff computation, Claude API call, session detection, auto-apply |
| wingman_prompt.go | Review prompt construction from diff + context |
| wingman_instruction.md | Embedded instruction injected into agent context |
| wingman_spawn_unix.go | Detached process spawning (Unix) |
| wingman_spawn_other.go | No-op stubs (non-Unix) |
| wingman_test.go | 25 tests covering state, dedup, stale cleanup, session detection |
| hooks_claudecode_handlers.go | Prompt-submit injection, stop hook trigger, session-end trigger |
| hooks_geminicli_handlers.go | Session-end trigger for Gemini CLI |
| hooks_git_cmd.go | Post-commit hook trigger for manual-commit strategy |
| hooks.go | additionalContext hook response field, outputHookResponseWithContext() |
| settings/settings.go | IsWingmanEnabled() setting |
| docs/architecture/wingman.md | Comprehensive architecture documentation |

Configuration

entire wingman enable   # Enable
entire wingman disable  # Disable + cleanup
entire wingman status   # Show status

Test plan

  • mise run fmt && mise run lint && mise run test:ci all pass
  • Enable wingman in a test repo, make agent produce changes, verify REVIEW.md is generated
  • Verify prompt-submit injection fires on next user prompt (visible in terminal)
  • Verify agent addresses REVIEW.md before user's request
  • Verify background auto-apply only fires when no live sessions exist
  • Start new session with stale REVIEW.md from old session — verify cleanup
  • Verify ApplyAttemptedAt prevents infinite retry loops
  • Verify concurrent review spawns are blocked by lock file

🤖 Generated with Claude Code

dipree and others added 15 commits February 11, 2026 16:59
Adds `entire wingman` command group (enable/disable/status) and a
background review loop that analyzes agent code changes via Claude,
writes suggestions to .entire/REVIEW.md, and auto-applies them when
the session is idle. Includes lock file to prevent concurrent spawns
and base commit capture for deterministic diffs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 71d4670c34ac
Adds visibility into the detached wingman subprocess which previously
discarded all output. Stderr is now redirected to .entire/logs/wingman.log
with timestamped step-by-step logging, and the parent trigger uses the
structured logging package for trigger-side events.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 778aa463e2ba
…prevention

- Move wingman trigger from stop hook to git post-commit hook for
  manual-commit strategy (auto-commit still triggers from stop hook)
- Add ENTIRE_WINGMAN_APPLY env var to prevent infinite hook recursion
- Rewrite review prompt to be intent-aware using checkpoint data
  (prompts, commit message, session context, checkpoint file paths)
- Give reviewer read-only repo access (--allowedTools Read,Glob,Grep)
- Give auto-apply agent edit permissions (--permission-mode acceptEdits)
- Add session context reading from .entire/metadata/<session>/context.md
- Add Agent field to settings parser for playground compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 3110b32e99ff
When a pending REVIEW.md exists and the user submits a prompt, inject
a systemMessage via the hook response so the agent automatically reads
and applies the review suggestions before proceeding with the user's
request. Previously this was only a stderr notification to the user.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 652d24e097d2
Both the prompt-submit hook injection and the auto-apply prompt now
use the same instruction from wingman_apply.md via go:embed, replacing
two divergent hardcoded strings with a single source of truth.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 01916e5b6e76
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: dabe38b90045
Diff against the merge base with main/master so the reviewer sees all
branch changes holistically, not just the latest commit in isolation.
Falls back to HEAD diff if no merge base is found (e.g., on main itself).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 7c8a0f30fbef
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 05a188c59540
The wingman review was never being applied because: (1) stale REVIEW.md
from previous sessions blocked new reviews indefinitely, and (2) the
systemMessage injection on prompt-submit was unreliable since the agent
deprioritized it vs the user's actual request.

This adds stop-hook-based auto-apply as the primary delivery mechanism.
When an agent turn ends (ACTIVE → IDLE), the stop hook checks for pending
REVIEW.md and spawns a detached `entire wingman __apply` process that
triggers `claude --continue`. Also adds session-aware stale review cleanup
(different session, orphaned state, >1hr TTL) and retry prevention via
ApplyAttemptedAt field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: d4cbcc6d0339
The auto-apply trigger was only in the code path where the turn had file
changes. When the agent turn ended without modifications (e.g. answering
a question), commitWithMetadata returned early and never reached the
auto-apply check. Extract triggerWingmanAutoApplyIfPending helper and
call it from both the no-changes early return and the main path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 2695fc81fb3d
Add logging to entire.log for key wingman events: review in progress
indicator on prompt-submit, auto-apply spawn, instruction injection, and
review-already-running detection. This gives better visibility into the
wingman review lifecycle beyond just wingman.log in the detached process.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 63095cc2fdae
Two bugs prevented auto-apply from triggering:

1. The detached review/apply processes run with cwd=/ so git commands
   in strategy.LoadSessionState (git rev-parse --git-common-dir) failed,
   causing isSessionIdle to always return false. Fix: read session state
   files directly using repoRoot to locate .git/entire-sessions/.

2. The session ID in the wingman payload could be from an ended session
   (user closed and reopened Claude). isSessionIdle only checked that
   specific session. Fix: fall back to checking ALL sessions, so if any
   session is idle the review gets applied.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 9f7e019204ae
…ressed first

Previously the review instruction was injected as systemMessage (a warning)
which the agent could deprioritize vs the user's direct request. Now uses
additionalContext which is added to Claude's context as mandatory
instructions.

Also strengthened the instruction text to explicitly require the review
be addressed BEFORE the user's request, and simplified the injection
logic — inject whenever REVIEW.md exists regardless of session ID or
apply-attempted state (the file's existence is the source of truth).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 45a53baac1f3
When a live session exists, defer wingman review delivery to the
prompt-submit injection (visible in terminal) instead of running a
background auto-apply that the user can't see. Background auto-apply
is now only used when no sessions are alive. Also triggers auto-apply
from the SessionEnd hook when the last session closes with a pending
review. Adds comprehensive wingman lifecycle documentation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: f25c3e798f10
Copilot AI review requested due to automatic review settings February 12, 2026 13:15
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 6e7f40877703
The review prompt is the core value proposition: it leverages Entire's
checkpoint data (prompts, session context, commit message) to enable
intent-aware review that catches misalignment between what was asked
and what was built — not just code bugs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 90fe9697fc50
Contributor

Copilot AI left a comment

Pull request overview

This PR adds wingman, an automated code review system that reviews agent-produced code changes after each commit and delivers actionable suggestions back to the agent. The system prioritizes visible delivery where users see the agent reading and applying reviews in their terminal.

Changes:

  • Adds wingman review system with detached background processes for reviewing code changes using Claude
  • Implements dual-delivery mechanism: visible injection via prompt hooks (primary) and background auto-apply (fallback)
  • Includes comprehensive state management with deduplication, stale review cleanup, and retry prevention
  • Adds CLI commands for enabling/disabling/status checking of wingman functionality
  • Integrates with existing hook system for Claude Code and Gemini CLI agents

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| docs/architecture/wingman.md | Comprehensive architecture documentation covering components, lifecycle, and configuration |
| cmd/entire/cli/wingman.go | Core state management, trigger logic, deduplication, lock files, and CLI commands |
| cmd/entire/cli/wingman_review.go | Detached review subprocess implementation with diff computation, Claude API calls, and session detection |
| cmd/entire/cli/wingman_prompt.go | Review prompt construction from diff and session context with truncation for large diffs |
| cmd/entire/cli/wingman_instruction.md | Embedded instruction injected into agent context for applying reviews |
| cmd/entire/cli/wingman_spawn_unix.go | Unix-specific detached subprocess spawning with process group detachment |
| cmd/entire/cli/wingman_spawn_other.go | No-op stubs for non-Unix platforms |
| cmd/entire/cli/wingman_test.go | Unit tests covering state management, deduplication, session detection, and stale cleanup |
| cmd/entire/cli/hooks_claudecode_handlers.go | Prompt-submit injection, stop hook trigger, and session-end trigger integration |
| cmd/entire/cli/hooks_geminicli_handlers.go | Session-end trigger for Gemini CLI agent |
| cmd/entire/cli/hooks_git_cmd.go | Post-commit hook trigger for manual-commit strategy |
| cmd/entire/cli/hooks.go | Hook response structure with additionalContext field |
| cmd/entire/cli/settings/settings.go | Wingman enable/disable settings and Agent field addition |
| cmd/entire/cli/root.go | Wingman command registration |

dipree and others added 10 commits February 12, 2026 14:36
Allows writing wingman settings to settings.local.json instead of
settings.json, matching the --local pattern used by entire enable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 303bd4b4630f
…Output

Claude Code expects additionalContext nested under hookSpecificOutput, not at
the top level. Also adds a user-visible systemMessage warning when a wingman
review is about to be injected on prompt submit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: fc64fce284b3
The format fix in 5fddc49 moved the "Powered by Entire" message from
systemMessage (user-visible) to additionalContext only (agent-only),
causing it to disappear from the terminal. Now sets both fields so the
message is visible to the user and injected into agent context.

Also returns immediately after successful wingman review injection to
prevent subsequent code from corrupting the stdout JSON response, and
removes unnecessary ireturn nolint directives that lint no longer requires.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 64267478c135
The auto-apply subprocess aborted when triggered from SessionEnd because
runWingmanApply checked isSessionIdle() which only returned true for
PhaseIdle. After SessionEnd marks the session as PhaseEnded, the check
failed and the subprocess exited with "session became active during
spawn." Replace with a direct Phase.IsActive() check that only blocks on
ACTIVE/ACTIVE_COMMITTED — both IDLE and ENDED are safe to auto-apply.

Also add a 2-hour staleness threshold to hasAnyLiveSession so orphaned
session state files cannot permanently block auto-apply, and improve
diagnostic logging throughout the auto-apply flow (skip reasons, spawned
PIDs, decision points).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: c4e9f1a814ea
Wingman status messages (review in progress, review pending) previously
only went to stderr, which is invisible in Claude Code's UI. Use the new
outputHookMessage helper to emit systemMessage-only JSON responses so
users can see what wingman is doing in their terminal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 3cbad94bdc0c
Code fixes from PR review:
- Fix hasAnyLiveSession to return true when maxCheck limit hit (safer
  to assume a live session exists than to incorrectly auto-apply)
- Return error when fallback git diff HEAD~1 fails instead of silently
  returning empty string
- Remove unused baseCommit parameter from computeDiff

Documentation updates:
- Add user-visible messages table documenting all systemMessage notifications
- Fix inaccurate claim that detached process strips GIT_* vars (it's the
  Claude CLI calls that strip them via wingmanStripGitEnv)
- Remove redundant Decision Flow diagram and Prompt Structure box
- Consolidate Intent-Aware Review into Review Prompt Construction intro

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 64df6ebeea39
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: bb5c3ec3fd2b
The "Reviewing your changes..." and "Review in progress..." messages
fired on every stop/prompt-submit hook if a lock file existed, even
if it was stale from a crashed review process. Now checks lock file
age against staleLockThreshold (30min) before showing the notification.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 6695a23da010
CI's golangci-lint v2.8.0 flags functions returning the Strategy
interface. Add it to the ireturn allow list alongside agent.Agent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 02f99bbe85ed
The 30-minute staleLockThreshold is for lock acquisition (overwriting
stale locks). For notifications, a much tighter 10-minute window is
appropriate since a real review takes at most ~6 minutes
(10s delay + 5min API timeout). This prevents showing "Reviewing your
changes..." for lock files that are clearly stale.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: f3e92b9f757a
dipree and others added 8 commits February 12, 2026 16:48
- Remove obsolete //nolint:ireturn on GetStrategy (strategy.Strategy is
  now in the ireturn allow list)
- Add comments explaining log file handle inheritance in spawn functions
- Update hasAnyLiveSession tests: rename for clarity, adjust thresholds
  to 5h (beyond staleActiveSessionThreshold of 4h), add test verifying
  stale IDLE sessions are still considered live

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 0fc364a22bea
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 0fc364a22bea
PostCommit processes all session state files on every commit, which
refreshed both file modtime and LastInteractionTime for sessions stuck
in ACTIVE_COMMITTED phase. This prevented hasAnyLiveSession from
detecting truly stale sessions, blocking background auto-apply.

Two fixes:
- Remove ActionUpdateLastInteraction from ACTIVE_COMMITTED + GitCommit
  self-loop (a commit is not proof the agent is alive)
- Use LastInteractionTime from JSON instead of file modtime for
  staleness checks in hasAnyLiveSession

Also adds readSessionContext warning log for non-ErrNotExist failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 0fc364a22bea
Rename 'context' parameter to 'additionalContext' in hook response
functions to avoid shadowing the imported context package. Close the
parent's copy of log file descriptors after spawning detached wingman
subprocesses — the child already has its own copy via dup from Start().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 6a54c404f24f
Claude Code sets CLAUDECODE in its environment, which causes nested
claude CLI invocations (wingman review, auto-apply, summarize) to fail
with "cannot be launched inside another Claude Code session". Strip
CLAUDECODE alongside GIT_* vars in wingmanStripGitEnv() and
summarize.stripGitEnv().

Also removes triggerWingmanAutoApplyOnSessionEnd — the session-end
auto-apply path never worked reliably and is unnecessary since pending
reviews are picked up by prompt-submit injection or go stale.

Updates wingman architecture docs with --local flag, status output,
hidden subcommands, missing constants, and Claude CLI invocation details.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 81de29e2290e
Add wingman.lock, wingman-state.json, wingman-payload.json, and
REVIEW.md to EnsureEntireGitignore() so they are automatically
excluded from version control for all users.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: ac0df591859d