feat: Closed PR Review Auto-Improver (automated feedback loop) #1755
Draft
nick-inkeep wants to merge 17 commits into main
Automated system that analyzes human reviewer feedback after PRs are merged to identify generalizable improvements for the pr-review-* subagent system.
- Workflow triggers on merged PRs, extracts human/bot comments
- Agent applies 4-criteria generalizability test
- Creates draft PRs with improvements to pr-review-*.md files
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Include diffHunk in GraphQL query (shows code each comment is on)
- Add Phase 2 "Deep-Dive on Promising Comments" with explicit guidance:
  - Read the full file to understand broader context
  - Grep for schemas/types/patterns mentioned in comments
  - Understand the anti-pattern before judging generalizability
- Update Tool Policy to emphasize context gathering
- Renumber phases (now 6 phases total)
The agent now actively investigates each comment rather than judging based on comment text alone.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Based on write-agent skill guidance:
1. Add near-miss example (questions/discussions ≠ reviewer feedback)
2. Strengthen Role & Mission - describe what "excellence looks like"
3. Failure modes now use contrastive examples (❌ vs ✅)
4. Phase 2 now checklist format with stop condition
5. Example shows completed checklist, not just steps
Key insight: "Stop here if you can't articulate a clear principle" prevents vague improvements from polluting reviewers.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Phase 2 now uses git rev-list + git show to see code at comment time
- Progressive gathering: diffHunk → full file → PR diff → other files
- GraphQL query now includes createdAt for all comment types
- Added git rev-list and git show to allowedTools
This ensures the agent sees what the human reviewer saw, not the final merged state, which may have fixes applied.
Co-Authored-By: Claude <noreply@anthropic.com>
Two exit paths at each level:
- EXIT A: Not generalizable (repo-specific, one-off bug, style preference)
- EXIT B: Pattern found (can articulate anti-pattern + universal principle)
Includes decision flow diagram and two contrasting examples showing early exit (repo-specific DateUtils) vs pattern discovery (type/schema DRY).
Co-Authored-By: Claude <noreply@anthropic.com>
- Role & Mission: Add "what the best human analyst would do" section
- Failure modes: Add "Asserting when uncertain" with contrastive example
- Generalizability: Add confidence calibration guidance
- Add explicit conservative default: "when torn, choose lower confidence"
Per write-agent skill review: personality should describe best human behavior, failure modes should include asserting when uncertain (relevant for classification tasks).
Co-Authored-By: Claude <noreply@anthropic.com>
Pattern extracted from PR #1737 human reviewer feedback (amikofalvy):
- Types should derive from Zod schemas using z.infer<typeof schema>
- Use Pick/Omit/Partial instead of manually redefining type subsets
- Extract shared enum/union schemas instead of inline string literals
Changes:
- pr-review-types.md: New anti-pattern + analysis step 6 with detection patterns
- pr-review-consistency.md: Extended "Reuse" section to cover types
This demonstrates the closed-pr-review-auto-improver output: these are the exact changes the agent proposed when run against PR #1737.
Co-Authored-By: Claude <noreply@anthropic.com>
Extended "Schema-Type Derivation Discipline" to cover full spectrum:
- Zod/validation schemas (z.infer)
- Database schemas (Prisma, Drizzle generated types)
- Internal packages (@inkeep/*, shared types)
- External packages/SDKs (OpenAI, Vercel AI SDK)
- Function signatures (Parameters<>, ReturnType<>)
- Existing domain types (Pick, Omit, Partial)
Added table format for clarity and comprehensive detection patterns.
Co-Authored-By: Claude <noreply@anthropic.com>
Expanded type derivation guidance based on actual patterns found in agents repo:
- Awaited<ReturnType<>> for async function returns
- keyof typeof for constants-derived types
- interface extends and intersection (&) for composition
- Discriminated unions with type guards
- satisfies operator for type-safe constants
- Re-exports for API surface boundaries
- Type duplication detection signals
Patterns sourced from agents-api codebase analysis including:
- env.ts, middleware/*, types/app.ts, domains/run/*
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
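Several of the derivation patterns named in this commit can be illustrated with a minimal plain-TypeScript sketch. Every name below (`STATUS_LABELS`, `PullRequest`, `fetchPullRequest`, and so on) is hypothetical and chosen only for illustration; none of it is taken from the agents repo, and Zod is deliberately left out so the snippet has no dependencies.

```typescript
// Hypothetical constant; `satisfies` checks the shape without widening the keys.
const STATUS_LABELS = {
  draft: "Draft",
  merged: "Merged",
  closed: "Closed",
} as const satisfies Record<string, string>;

// keyof typeof: the union "draft" | "merged" | "closed" is derived, not retyped.
type Status = keyof typeof STATUS_LABELS;

// Hypothetical domain type used as the single source of truth.
interface PullRequest {
  id: number;
  title: string;
  status: Status;
  body: string;
}

// Pick / Omit / Partial: subsets derived from PullRequest instead of redefined.
type PullRequestSummary = Pick<PullRequest, "id" | "title" | "status">;
type PullRequestUpdate = Partial<Omit<PullRequest, "id">>;

// Hypothetical async function; its resolved type is derived below.
async function fetchPullRequest(id: number): Promise<PullRequest> {
  return { id, title: "example", status: "draft", body: "" };
}

// Awaited<ReturnType<>>: tracks the function's return type automatically.
type FetchedPullRequest = Awaited<ReturnType<typeof fetchPullRequest>>;

function summarize(pr: FetchedPullRequest): PullRequestSummary {
  return { id: pr.id, title: pr.title, status: pr.status };
}

function applyUpdate(pr: PullRequest, update: PullRequestUpdate): PullRequest {
  // Spread keeps `id` untouched because PullRequestUpdate cannot contain it.
  return { ...pr, ...update };
}

function statusLabel(status: Status): string {
  return STATUS_LABELS[status];
}
```

If a field is added to `PullRequest` or a key to `STATUS_LABELS`, the derived types follow without any parallel definition to keep in sync, which is the duplication this guidance is meant to catch.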
Added guidance for Zod schema extension/derivation patterns based on
codebase research (packages/agents-core/src/validation/schemas.ts):
- .extend() for adding/overriding fields
- .pick()/.omit() for field subsetting
- .partial() for Insert → Update schema derivation
- .extend().refine() for cross-field validation
- Anti-patterns: parallel schemas, duplicated fields
Examples from codebase:
- SubAgentInsertSchema.extend({ id: ResourceIdSchema })
- SubAgentUpdateSchema = SubAgentInsertSchema.partial()
- StopWhenSchema.pick({ transferCountIs: true })
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Clear separation of concerns:
- pr-review-types: Illegal states, invariants, unsafe narrowing
- pr-review-consistency: DRY, schema reuse, convention conformance
Moved to consistency:
- Zod schema composition patterns (.extend, .pick, .partial)
- Type derivation detection signals
- satisfies operator, re-exports conventions
Kept in types (type safety focus):
- Discriminated unions vs optional fields (prevents illegal states)
- Type guards vs unsafe `as` assertions
- Detection of union types without discriminants
Added cross-reference note in types agent pointing to consistency for derivation/DRY concerns.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…and Phase 5.5
- Add skills: pr-review-subagents-available, pr-review-subagents-guidelines, find-similar-patterns
- Add proper exit states at Phase 1, 2, and 4 (embedded in workflow, not separate section)
- Add Phase 5 step 2: "Find examples of the pattern" with judgment guidance
- Add Phase 5.5: Full file review & integration planning (scope fit, duplication check)
- Update output contract with detailed JSON structure and exit examples
- Add reviewer tagging to close the feedback loop
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Agents should be self-contained without cross-references to other agents. This prevents coupling and ensures agents work correctly when read in isolation.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
These skills were created in the previous session but never committed. Recovered from conversation history.
- find-similar-patterns: Methodology for finding similar code patterns
- pr-review-subagents-available: Catalog of pr-review-* agents with scope boundaries
- pr-review-subagents-guidelines: Best practices for writing/improving reviewers
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The pr-review-consistency.md and pr-review-types.md improvements belong in PR #1759, not this auto-improver feature branch.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move agent and skills to inkeep/internal-cc-plugins for CI/CD-only loading:
- Removed: .claude/agents/closed-pr-review-auto-improver.md
- Removed: .agents/skills/{find-similar-patterns,pr-review-subagents-available,pr-review-subagents-guidelines}/
Updated workflow:
- Added step to clone inkeep/internal-cc-plugins
- Added --plugin-dir flag to load agent from plugin
Prerequisites before merging:
1. Create private repo: inkeep/internal-cc-plugins
2. Push plugin content to new repo
3. Add GH_PAT_PLUGINS secret to inkeep/agents
Co-Authored-By: Claude <noreply@anthropic.com>
GitHub Apps provide better security and maintainability:
- 8-hour token lifetime (vs days/infinite for PATs)
- No user account dependency (survives personnel changes)
- Zero manual rotation (tokens generated fresh each run)
- Scales to N plugins without additional credentials
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Introduces an automated system that learns from human reviewers to continuously improve our AI code review agents.
The Problem
When human reviewers catch issues that our pr-review-* agents miss, that knowledge currently dies with the PR. We manually noticed patterns like "Type Definition Discipline" (PR #1737), but there's no systematic way to:
The Solution
A GitHub Actions workflow that triggers after every PR merge:
Key innovation: Git time-travel — The agent reconstructs what the human reviewer saw at comment time (not the final merged state), since issues are often fixed before merge.
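The time-travel step can be sketched as follows. This is a minimal illustration under stated assumptions, not the agent's actual implementation: the `CommitInfo` shape and all function names are hypothetical. The selection logic mirrors what `git rev-list -1 --before=<date> <ref>` does (newest commit at or before a date), and the helper functions only build the argument lists for the two git commands without running them.

```typescript
// Hypothetical shape for illustration; not taken from the PR.
interface CommitInfo {
  sha: string;
  committedAt: string; // ISO 8601 commit date
}

/** Return the newest commit made at or before the comment's createdAt. */
function commitAtCommentTime(
  commits: CommitInfo[],
  commentCreatedAt: string,
): CommitInfo | undefined {
  const cutoff = Date.parse(commentCreatedAt);
  let best: CommitInfo | undefined;
  for (const commit of commits) {
    const time = Date.parse(commit.committedAt);
    if (time > cutoff) continue; // pushed after the reviewer commented
    if (best === undefined || time > Date.parse(best.committedAt)) {
      best = commit;
    }
  }
  return best;
}

/** Arguments for finding that commit with git itself (not executed here). */
function revListArgs(headRef: string, commentCreatedAt: string): string[] {
  return ["git", "rev-list", "-1", `--before=${commentCreatedAt}`, headRef];
}

/** Arguments for printing a file as it existed at a given commit. */
function showArgs(sha: string, path: string): string[] {
  return ["git", "show", `${sha}:${path}`];
}
```

Given commits at 10:00, 12:00, and 14:00 and a comment created at 13:00, the 12:00 commit is selected, so the agent would read the file as the reviewer saw it rather than as it looked after a later fix.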
How It Works
- pull_request_target: [closed] + merged == true
- git rev-list --before + git show to see code at comment time
- pr-review-*.md
Generalizability Test (all must pass)
Conservative by default: Better to miss a good pattern than pollute reviewers with repo-specific noise.
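The "all must pass" gate and the conservative default can be sketched as below. This is an illustration only: the criterion keys in the usage note are placeholders (the real criteria are defined in the agent prompt), and the `LOW` and `MEDIUM` levels are assumptions added alongside the `HIGH` level mentioned in this description.

```typescript
// Assumed confidence scale; only HIGH is named in the PR description.
type Confidence = "LOW" | "MEDIUM" | "HIGH";
const CONFIDENCE_ORDER: readonly Confidence[] = ["LOW", "MEDIUM", "HIGH"];

/** A pattern is generalizable only if every criterion passed. */
function passesGeneralizabilityTest(results: Record<string, boolean>): boolean {
  const outcomes = Object.values(results);
  // An empty checklist counts as a failure: conservative by default.
  return outcomes.length > 0 && outcomes.every((passed) => passed);
}

/** When torn between two confidence levels, choose the lower one. */
function conservativeConfidence(a: Confidence, b: Confidence): Confidence {
  return CONFIDENCE_ORDER.indexOf(a) <= CONFIDENCE_ORDER.indexOf(b) ? a : b;
}
```

For example, `passesGeneralizabilityTest({ criterion1: true, criterion2: true, criterion3: true, criterion4: false })` is false, so a single failed criterion is enough to drop the candidate, and `conservativeConfidence("HIGH", "MEDIUM")` yields `"MEDIUM"`.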
Files
- .github/workflows/closed-pr-review-auto-improver.yml
- .claude/agents/closed-pr-review-auto-improver.md
Example Output
When the agent finds a HIGH-generalizability pattern, it creates a draft PR like:
Design Decisions
Test Plan
This PR implements the feedback loop: human reviewers catch patterns → this agent extracts generalizable improvements → pr-review-* agents get better → fewer gaps for humans to catch.