feat: Closed PR Review Auto-Improver (automated feedback loop) by nick-inkeep · Pull Request #1755 · inkeep/agents

nick-inkeep · 2026-02-06T00:44:07Z

Summary

Introduces an automated system that learns from human reviewers to continuously improve our AI code review agents.

The Problem

When human reviewers catch issues that our pr-review-* agents miss, that knowledge currently dies with the PR. We manually noticed patterns like "Type Definition Discipline" (PR #1737) but there's no systematic way to:

- Detect when humans catch something bots missed
- Determine if it's a generalizable pattern (vs repo-specific)
- Propose improvements to the reviewer agents

The Solution

A GitHub Actions workflow that triggers after every PR merge:

PR Merged → Extract Human Comments → Analyze Gaps → Apply Generalizability Test → Create Draft PR (if HIGH)

Key innovation: Git time-travel. The agent reconstructs what the human reviewer saw when the comment was written (not the final merged state), since issues are often fixed before merge.
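The time-travel step can be sketched roughly as follows. This is a minimal illustration, not the agent's actual implementation: `RunGit`, `revListArgs`, `showFileArgs`, and `fileAtCommentTime` are hypothetical names, and the git runner is injected so the sketch makes no assumptions about how the workflow shells out. Only `git rev-list --before` and `git show` come from this PR.

```typescript
// Hypothetical runner: executes `git` with the given arguments and returns stdout.
type RunGit = (args: string[]) => string;

/** Arguments that select the newest commit reachable from `headRef` made before the comment. */
function revListArgs(headRef: string, commentCreatedAt: string): string[] {
  return ["rev-list", "-1", `--before=${commentCreatedAt}`, headRef];
}

/** Arguments that print `path` as it existed at `commitSha`. */
function showFileArgs(commitSha: string, path: string): string[] {
  return ["show", `${commitSha}:${path}`];
}

/**
 * Returns the file content the reviewer would have seen, or null when no
 * commit on the branch predates the comment.
 */
function fileAtCommentTime(
  run: RunGit,
  headRef: string,
  path: string,
  commentCreatedAt: string,
): string | null {
  const sha = run(revListArgs(headRef, commentCreatedAt)).trim();
  if (sha === "") return null;
  return run(showFileArgs(sha, path));
}
```

In Node, `run` could wrap `execFileSync("git", args, { encoding: "utf8" })` from `node:child_process`; the `createdAt` timestamp of the comment would be passed as `commentCreatedAt`.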

How It Works

1. Trigger: `pull_request_target: [closed]` + `merged == true`
2. Extract: GraphQL query fetches all comment types (discussion, inline, reviews)
3. Filter: Removes bot comments and trivial human comments ("LGTM", "thanks")
4. Analyze: Agent investigates each promising comment:
   - Uses `git rev-list --before` + `git show` to see code at comment time
   - Progressive context gathering (diffHunk → full file → PR diff → other files)
   - Explicit stop conditions: EXIT A (not generalizable) or EXIT B (pattern found)
5. Test: 4-criteria generalizability test (cross-codebase, universal principle, expressible, industry-recognized)
6. Output: If HIGH generalizability → creates draft PR with improvements to pr-review-*.md
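The Filter step above might look something like the sketch below. The names (`ReviewComment`, `BOT_LOGINS`, `TRIVIAL_BODY`, `promisingComments`) are hypothetical, and both the bot list and the trivial-comment pattern are assumptions: this PR only names claude-code and dependabot as bots, and "LGTM" and "thanks" as trivial comments.

```typescript
interface ReviewComment {
  author: string;
  body: string;
}

// Assumed bot logins; only claude-code and dependabot are named in this PR.
const BOT_LOGINS = new Set(["claude-code", "dependabot"]);

// Assumed definition of "trivial": the whole comment is a short acknowledgement.
const TRIVIAL_BODY = /^(lgtm|thanks|thank you)[\s.!]*$/i;

function isBot(login: string): boolean {
  // Assumption: app accounts may also appear with a "[bot]" suffix.
  return BOT_LOGINS.has(login) || login.endsWith("[bot]");
}

function isTrivial(body: string): boolean {
  const text = body.trim();
  return text === "" || TRIVIAL_BODY.test(text);
}

/** Keeps only human comments that carry enough substance to analyze. */
function promisingComments(comments: ReviewComment[]): ReviewComment[] {
  return comments.filter((c) => !isBot(c.author) && !isTrivial(c.body));
}
```

Anchoring the pattern to the whole comment body means a comment such as "LGTM, but this type duplicates the schema" is still kept for analysis.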

Generalizability Test (all must pass)

| Criterion | Question |
| --- | --- |
| Cross-codebase | Would this pattern appear in other TS/React/Node codebases? |
| Universal principle | Is it DRY, SOLID, separation of concerns, etc.? |
| Expressible | Can it be a checklist item, detection pattern, or failure mode? |
| Industry recognition | Would senior engineers elsewhere recognize this? |

Conservative by default: Better to miss a good pattern than pollute reviewers with repo-specific noise.
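As a rough sketch of the "all must pass" rule, the four criteria can be treated as a single gate. The type and function names here are hypothetical, and how the agent actually records its judgments is not specified in this PR.

```typescript
interface GeneralizabilityCriteria {
  crossCodebase: boolean;
  universalPrinciple: boolean;
  expressible: boolean;
  industryRecognized: boolean;
}

/** Lists the criteria that did not pass, in the order of the table above. */
function failedCriteria(c: GeneralizabilityCriteria): string[] {
  const checks: Array<[string, boolean]> = [
    ["cross-codebase", c.crossCodebase],
    ["universal principle", c.universalPrinciple],
    ["expressible", c.expressible],
    ["industry recognition", c.industryRecognized],
  ];
  return checks.filter(([, passed]) => !passed).map(([name]) => name);
}

/** HIGH generalizability requires every criterion to pass. */
function isHighGeneralizability(c: GeneralizabilityCriteria): boolean {
  return failedCriteria(c).length === 0;
}
```

Returning the failed criteria, rather than only a boolean, would let a run explain why a comment was rejected.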

Files

| File | Purpose |
| --- | --- |
| `.github/workflows/closed-pr-review-auto-improver.yml` | Workflow: trigger, comment extraction, context passing |
| `.claude/agents/closed-pr-review-auto-improver.md` | Agent: analysis framework, stop conditions, output contract |

Example Output

When the agent finds a HIGH-generalizability pattern, it creates a draft PR like:

pr-review: Learnings from PR #1737

Patterns extracted from human reviewer feedback:
- Type Definition Discipline: Check if new types should derive from existing schemas

Design Decisions

- Draft PRs (not auto-merge): Human review of proposed improvements
- HIGH only: MEDIUM patterns are noted but don't create PRs
- Opus model: Pattern recognition requires strong reasoning
- No nested subagents: Runs as workflow-triggered agent

Test Plan

- Verify workflow triggers on merged PRs (not closed-without-merge)
- Test comment extraction against a known PR with human comments
- Verify bot filtering excludes claude-code, dependabot, etc.
- Test git time-travel reconstructs correct code state
- Verify agent correctly classifies repo-specific vs generalizable patterns
- Confirm draft PR creation with proper formatting

This PR implements the feedback loop: human reviewers catch patterns → this agent extracts generalizable improvements → pr-review-* agents get better → fewer gaps for humans to catch.

Automated system that analyzes human reviewer feedback after PRs are merged to identify generalizable improvements for the pr-review-* subagent system.

- Workflow triggers on merged PRs, extracts human/bot comments
- Agent applies 4-criteria generalizability test
- Creates draft PRs with improvements to pr-review-*.md files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Include diffHunk in GraphQL query (shows the code each comment is on)
- Add Phase 2 "Deep-Dive on Promising Comments" with explicit guidance:
  - Read the full file to understand broader context
  - Grep for schemas/types/patterns mentioned in comments
  - Understand the anti-pattern before judging generalizability
- Update Tool Policy to emphasize context gathering
- Renumber phases (now 6 phases total)

The agent now actively investigates each comment rather than judging based on comment text alone.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Based on write-agent skill guidance:

1. Add near-miss example (questions/discussions ≠ reviewer feedback)
2. Strengthen Role & Mission: describe what "excellence looks like"
3. Failure modes now use contrastive examples (❌ vs ✅)
4. Phase 2 now checklist format with stop condition
5. Example shows completed checklist, not just steps

Key insight: "Stop here if you can't articulate a clear principle" prevents vague improvements from polluting reviewers.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Phase 2 now uses git rev-list + git show to see code at comment time
- Progressive gathering: diffHunk → full file → PR diff → other files
- GraphQL query now includes createdAt for all comment types
- Added git rev-list and git show to allowedTools

This ensures the agent sees what the human reviewer saw, not the final merged state, which may have fixes applied.

Co-Authored-By: Claude <noreply@anthropic.com>
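A query covering the three comment types with `createdAt`, plus `diffHunk` on inline comments, could look roughly like this. It is an illustrative sketch against GitHub's GraphQL schema, not the workflow's actual query: the constant and helper names are hypothetical, the page sizes are arbitrary, and pagination is omitted.

```typescript
// Illustrative only: field selection is an assumption about what the workflow needs.
const PR_COMMENTS_QUERY = `
  query PrComments($owner: String!, $repo: String!, $number: Int!) {
    repository(owner: $owner, name: $repo) {
      pullRequest(number: $number) {
        # Discussion comments on the PR conversation
        comments(first: 100) {
          nodes { author { login } body createdAt }
        }
        # Review summaries (approve / request changes / comment)
        reviews(first: 100) {
          nodes { author { login } body state createdAt }
        }
        # Inline comments, with the diff hunk each one is attached to
        reviewThreads(first: 100) {
          nodes {
            comments(first: 50) {
              nodes { author { login } body path diffHunk createdAt }
            }
          }
        }
      }
    }
  }
`;

/** Builds the JSON body for a POST to the GraphQL endpoint. */
function buildCommentsRequest(owner: string, repo: string, number: number) {
  return { query: PR_COMMENTS_QUERY, variables: { owner, repo, number } };
}
```

For this PR the call would be `buildCommentsRequest("inkeep", "agents", 1755)`; the `createdAt` values are what the time-travel step would use to pick a commit.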

Two exit paths at each level:

- EXIT A: Not generalizable (repo-specific, one-off bug, style preference)
- EXIT B: Pattern found (can articulate anti-pattern + universal principle)

Includes decision flow diagram and two contrasting examples showing early exit (repo-specific DateUtils) vs pattern discovery (type/schema DRY).

Co-Authored-By: Claude <noreply@anthropic.com>
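The control flow of progressive gathering with two exits can be sketched as a loop over context levels. Everything here is hypothetical naming; in particular, treating "ran out of context without a clear principle" as EXIT A is an assumption based on the conservative default described in this PR, not something the commit states.

```typescript
type ContextLevel = "diffHunk" | "fullFile" | "prDiff" | "otherFiles";

type Verdict =
  | { kind: "exitA"; reason: string }
  | { kind: "exitB"; antiPattern: string; principle: string }
  | { kind: "needMoreContext" };

type FinalVerdict = Exclude<Verdict, { kind: "needMoreContext" }>;

// Cheapest context first, widest last.
const LEVELS: readonly ContextLevel[] = ["diffHunk", "fullFile", "prDiff", "otherFiles"];

/** Hypothetical assessor: judges one comment given the context gathered so far. */
type Assess = (level: ContextLevel) => Verdict;

function investigate(assess: Assess): FinalVerdict {
  for (const level of LEVELS) {
    const verdict = assess(level);
    // Either exit stops the investigation; otherwise widen the context.
    if (verdict.kind !== "needMoreContext") return verdict;
  }
  // Assumption: no clear principle after all levels counts as EXIT A.
  return { kind: "exitA", reason: "no clear principle after all context levels" };
}
```

Stopping at the first decisive level keeps repo-specific comments cheap to dismiss, while promising ones earn the wider reads.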

- Role & Mission: Add "what the best human analyst would do" section
- Failure modes: Add "Asserting when uncertain" with contrastive example
- Generalizability: Add confidence calibration guidance
- Add explicit conservative default: "when torn, choose lower confidence"

Per write-agent skill review: personality should describe best human behavior, failure modes should include asserting when uncertain (relevant for classification tasks).

Co-Authored-By: Claude <noreply@anthropic.com>

vercel · 2026-02-06T00:44:12Z

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agents-api | Ready | Preview, Comment | Feb 6, 2026 4:54am |
| agents-docs | Ready | Preview, Comment | Feb 6, 2026 4:54am |
| agents-manage-ui | Ready | Preview, Comment | Feb 6, 2026 4:54am |

changeset-bot · 2026-02-06T00:44:16Z

⚠️ No Changeset found

Latest commit: c635ab5

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Pattern extracted from PR #1737 human reviewer feedback (amikofalvy):

- Types should derive from Zod schemas using z.infer<typeof schema>
- Use Pick/Omit/Partial instead of manually redefining type subsets
- Extract shared enum/union schemas instead of inline string literals

Changes:

- pr-review-types.md: New anti-pattern + analysis step 6 with detection patterns
- pr-review-consistency.md: Extended "Reuse" section to cover types

This demonstrates the closed-pr-review-auto-improver output; these are the exact changes the agent proposed when run against PR #1737.

Co-Authored-By: Claude <noreply@anthropic.com>

Extended "Schema-Type Derivation Discipline" to cover full spectrum:

- Zod/validation schemas (z.infer)
- Database schemas (Prisma, Drizzle generated types)
- Internal packages (@inkeep/*, shared types)
- External packages/SDKs (OpenAI, Vercel AI SDK)
- Function signatures (Parameters<>, ReturnType<>)
- Existing domain types (Pick, Omit, Partial)

Added table format for clarity and comprehensive detection patterns.

Co-Authored-By: Claude <noreply@anthropic.com>

Expanded type derivation guidance based on actual patterns found in agents repo:

- Awaited<ReturnType<>> for async function returns
- keyof typeof for constants-derived types
- interface extends and intersection (&) for composition
- Discriminated unions with type guards
- satisfies operator for type-safe constants
- Re-exports for API surface boundaries
- Type duplication detection signals

Patterns sourced from agents-api codebase analysis including:

- env.ts, middleware/*, types/app.ts, domains/run/*

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
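For readers unfamiliar with these derivation patterns, here is a small standalone illustration. The domain (`ROLES`, `User`, `loadUser`) is invented for the example and does not come from the agents repo; only the TypeScript features themselves are the ones the commit lists.

```typescript
// Hypothetical domain code, for illustration only.
const ROLES = { admin: "Admin", member: "Member" } as const;

// Derived from the constant, so adding a role updates the type automatically.
type Role = keyof typeof ROLES;

interface User {
  id: string;
  name: string;
  email: string;
  role: Role;
}

// Subsets derived from User instead of being redefined by hand.
type UserSummary = Pick<User, "id" | "name">;
type UserUpdate = Partial<Omit<User, "id">>;

async function loadUser(id: string): Promise<User> {
  return { id, name: "Ada", email: "ada@example.com", role: "admin" };
}

// Derived from the function signature rather than restated.
type LoadedUser = Awaited<ReturnType<typeof loadUser>>;
type LoadUserArgs = Parameters<typeof loadUser>;

// `satisfies` checks the constant against User without widening its literal types.
const DEFAULT_USER = {
  id: "0",
  name: "Guest",
  email: "guest@example.com",
  role: "member",
} satisfies User;
```

A reviewer applying this guidance would flag a hand-written `{ id: string; name: string }` next to `User` as a candidate for `Pick<User, "id" | "name">`.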

Added guidance for Zod schema extension/derivation patterns based on codebase research (packages/agents-core/src/validation/schemas.ts):

- .extend() for adding/overriding fields
- .pick()/.omit() for field subsetting
- .partial() for Insert → Update schema derivation
- .extend().refine() for cross-field validation
- Anti-patterns: parallel schemas, duplicated fields

Examples from codebase:

- SubAgentInsertSchema.extend({ id: ResourceIdSchema })
- SubAgentUpdateSchema = SubAgentInsertSchema.partial()
- StopWhenSchema.pick({ transferCountIs: true })

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Clear separation of concerns:

- pr-review-types: Illegal states, invariants, unsafe narrowing
- pr-review-consistency: DRY, schema reuse, convention conformance

Moved to consistency:

- Zod schema composition patterns (.extend, .pick, .partial)
- Type derivation detection signals
- satisfies operator, re-exports conventions

Kept in types (type safety focus):

- Discriminated unions vs optional fields (prevents illegal states)
- Type guards vs unsafe `as` assertions
- Detection of union types without discriminants

Added cross-reference note in types agent pointing to consistency for derivation/DRY concerns.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
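The two type-safety items kept in pr-review-types can be illustrated with a small standalone example. The `Result` type and helper names are invented for illustration and are not taken from the reviewer agents.

```typescript
// With optional fields, illegal states type-check, e.g.
//   { status: "error", data: "..." } or { status: "ok" } with no data:
//   interface LooseResult { status: "ok" | "error"; data?: string; message?: string }

// A discriminated union makes those states unrepresentable.
type Result =
  | { status: "ok"; data: string }
  | { status: "error"; message: string };

/** Type guard: validates an unknown value at runtime instead of asserting with `as`. */
function isResult(value: unknown): value is Result {
  if (typeof value !== "object" || value === null || !("status" in value)) {
    return false;
  }
  if (value.status === "ok") {
    return "data" in value && typeof value.data === "string";
  }
  if (value.status === "error") {
    return "message" in value && typeof value.message === "string";
  }
  return false;
}

/** The discriminant narrows each branch, so no optional chaining or casts are needed. */
function summarize(result: Result): string {
  switch (result.status) {
    case "ok":
      return `ok: ${result.data}`;
    case "error":
      return `error: ${result.message}`;
  }
}
```

Writing `JSON.parse(text) as Result` would compile just as well, but it would let a malformed payload reach `summarize` unchecked; the guard rejects it at the boundary.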

…and Phase 5.5

- Add skills: pr-review-subagents-available, pr-review-subagents-guidelines, find-similar-patterns
- Add proper exit states at Phase 1, 2, and 4 (embedded in workflow, not separate section)
- Add Phase 5 step 2: "Find examples of the pattern" with judgment guidance
- Add Phase 5.5: Full file review & integration planning (scope fit, duplication check)
- Update output contract with detailed JSON structure and exit examples
- Add reviewer tagging to close the feedback loop

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Agents should be self-contained without cross-references to other agents. This prevents coupling and ensures agents work correctly when read in isolation. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

These skills were created in the previous session but never committed. Recovered from conversation history.

- find-similar-patterns: Methodology for finding similar code patterns
- pr-review-subagents-available: Catalog of pr-review-* agents with scope boundaries
- pr-review-subagents-guidelines: Best practices for writing/improving reviewers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The pr-review-consistency.md and pr-review-types.md improvements belong in PR #1759, not this auto-improver feature branch. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Move agent and skills to inkeep/internal-cc-plugins for CI/CD-only loading:

- Removed: .claude/agents/closed-pr-review-auto-improver.md
- Removed: .agents/skills/{find-similar-patterns,pr-review-subagents-available,pr-review-subagents-guidelines}/

Updated workflow:

- Added step to clone inkeep/internal-cc-plugins
- Added --plugin-dir flag to load agent from plugin

Prerequisites before merging:

1. Create private repo: inkeep/internal-cc-plugins
2. Push plugin content to new repo
3. Add GH_PAT_PLUGINS secret to inkeep/agents

Co-Authored-By: Claude <noreply@anthropic.com>

GitHub Apps provide better security and maintainability:

- 8-hour token lifetime (vs days/infinite for PATs)
- No user account dependency (survives personnel changes)
- Zero manual rotation (tokens generated fresh each run)
- Scales to N plugins without additional credentials

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

nick-inkeep and others added 6 commits February 5, 2026 16:20

nick-inkeep and others added 2 commits February 5, 2026 17:10

closed-pr-review-auto-improver: Add "keep agents standalone" guidance

e069d56

Remove pr-review-* changes (moved to separate PR #1759)

fbf35a4

nick-inkeep wants to merge 17 commits into main from feature/closed-pr-review-auto-improver
