Add agent-bot: issue triage + wiwi dev agent + auto-merge #45
Companion to the vivi PR-reviewer. Two new workflows + one wired-in step on pr-review.yml.

- issue-triage.yml: fires on label `agent:assess`. Strict 5-gate verdict (clear acceptance criteria, <300 LOC, contained scope, no new deps/secrets, deterministic test). verdict=`do` auto-adds `agent:try`; else posts a comment and waits for a human.
- issue-implement.yml: fires on label `agent:try` (added by triage or manually). "wiwi" branches off main, implements, runs the build, opens a DRAFT PR labelled `auto-agent` with `Closes #N`. Uses AGENT_GH_TOKEN (PAT) so the PR's downstream ci/pr-review workflows actually fire.
- pr-review.yml: appended an "Auto-merge if eligible" step. After vivi posts her review: if the PR has `auto-agent` AND vivi's latest verdict is APPROVED AND the linked issue's author is on the TEAM list (vaderyang / william / timmy), promote draft → ready and admin-squash-merge. Non-team-authored PRs always wait for human merge.

Reuses existing LITELLM_BASE_URL / LITELLM_API_KEY / LITELLM_NO_PROXY secrets. New secret AGENT_GH_TOKEN required (PAT with repo + workflow scopes) so the spawned PR triggers downstream checks and admin-merge has bypass.
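The three-part auto-merge condition described for pr-review.yml can be sketched as a single predicate. This is an illustrative Python sketch, not the contents of `auto_merge.sh`: the `PullRequest` record and its field names are hypothetical, and only the team list and the three conditions come from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass

# Team list as stated in the PR description.
TEAM = {"vaderyang", "william", "timmy"}


@dataclass
class PullRequest:
    # Hypothetical record; field names are illustrative, not the GitHub API schema.
    labels: set[str]
    vivi_verdicts: list[str]  # vivi's review states, oldest first
    issue_author: str         # author of the issue referenced by "Closes #N"


def is_auto_merge_eligible(pr: PullRequest) -> bool:
    """Return True only when all three conditions from the description hold."""
    has_label = "auto-agent" in pr.labels
    # Only vivi's most recent verdict counts; an earlier APPROVED is not enough.
    latest_approved = bool(pr.vivi_verdicts) and pr.vivi_verdicts[-1] == "APPROVED"
    team_issue = pr.issue_author in TEAM
    return has_label and latest_approved and team_issue
```

Any PR that fails one of the three checks falls through to the human-merge path.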
Based on my analysis, I've found several issues. Let me complete the review.
Summary
This PR adds an agent automation pipeline: issue triage (issue-triage.yml), dev agent (issue-implement.yml), and auto-merge gating (auto_merge.sh appended to pr-review.yml). The design is clean and the prompts are well-structured. However, there's a blocking jq null-safety bug that will cause auto_merge.sh to crash on any PR with reviews lacking a body field. There's also a potential double-merge race condition with the existing auto-merge logic in post_review.py.
Recommendation: REQUEST_CHANGES — the jq bug must be fixed; the auto-merge interaction should be clarified.
Blocking
- `scripts/agent-bot/auto_merge.sh:20` (jq null-safety bug): `.body | contains("vivi")` errors when `.body` is `null`. GitHub review objects often lack a body field (e.g., quick APPROVE clicks), causing the script to fail with `jq: error: null and string cannot have their containment checked`. Fix: use `((.body // "") | contains("vivi"))` to handle nulls.
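The same null-safe check can be expressed in Python for comparison. This is a minimal sketch that mirrors the corrected jq filter; the function name `is_vivi_review` is hypothetical, and the review dictionaries are assumed to carry `author.login` and `body` keys as the jq expression implies.

```python
def is_vivi_review(review: dict) -> bool:
    """Mirror of: .author.login=="vivi" or ((.body // "") | contains("vivi"))."""
    # `or {}` / `or ""` play the role of jq's `//`: they replace a missing or
    # null value with a safe default before it is inspected.
    author = (review.get("author") or {}).get("login")
    body = review.get("body") or ""
    return author == "vivi" or "vivi" in body
```

Without the default, a review whose body is null raises an error instead of being skipped, which is the failure mode described above.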
Suggestions
- `scripts/agent-bot/auto_merge.sh` + `post_review.py`: Two auto-merge mechanisms now exist: (1) `post_review.py` auto-merges when PR author is in `AUTO_MERGE_AUTHORS`; (2) `auto_merge.sh` auto-merges when PR has `auto-agent` label AND linked issue author is in TEAM. If `AGENT_GH_TOKEN` PAT is owned by a team member (e.g., `vaderyang`), both could attempt to merge the same wiwi-spawned PR simultaneously, one via GITHUB_TOKEN and one via AGENT_GH_TOKEN. Consider either: (a) removing the `auto-agent` label from `post_review.py`'s auto-merge path, or (b) documenting that AGENT_GH_TOKEN should be owned by a non-team bot account.
- `scripts/agent-bot/auto_merge.sh:11` + `scripts/agent-bot/run_wiwi.sh:7`: TEAM list is duplicated (`'vaderyang william timmy'`). If team membership changes, both files need updates. Consider centralizing to a single source (e.g., a `TEAM` file in `scripts/agent-bot/` or a repo variable).
- `scripts/agent-bot/run_triage.sh:52`: JSON parsing uses `grep -E '^\{.*"verdict"' "$OUT" | tail -1`. If the agent emits multiple JSON objects (e.g., one in a code block and one at the end), this could match the wrong line. Consider using `jq -c 'select(.verdict)'` with `--raw-input` for stricter parsing.
- `scripts/agent-bot/run_wiwi.sh:14`: Branch naming `agent/wiwi/issue-${ISSUE_NUMBER}` could collide if wiwi is re-run on the same issue after a failed attempt (the old branch may still exist). Consider either: (a) adding a timestamp/suffix for uniqueness, or (b) documenting that re-triggering requires manual branch cleanup.
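The stricter verdict parsing suggested for `run_triage.sh` could look roughly like the following. This is a Python sketch of the idea rather than the script's actual code: `extract_verdict` is a hypothetical helper, and it assumes the agent prints its verdict as a single-line JSON object containing a `verdict` key.

```python
import json


def extract_verdict(output: str):
    """Return the last line that parses as a JSON object with a "verdict" key."""
    verdict = None
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            # Looks like JSON but is truncated or malformed: skip it instead
            # of accepting it on a regex match alone.
            continue
        if isinstance(obj, dict) and "verdict" in obj:
            verdict = obj
    return verdict
```

Unlike the `grep` pattern, a line is accepted only if it parses as a complete JSON object, so a truncated or quoted fragment cannot be mistaken for the verdict. The function returns `None` when no valid verdict line exists, which the caller can treat as "wait for a human".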
Questions
- AGENT_GH_TOKEN ownership: The PR body states this needs to be a PAT with `repo + workflow` scopes. Is the intended owner a bot account or a team member? If a team member, the double auto-merge race condition (`post_review.py` + `auto_merge.sh`) is real. If a bot account (recommended), `post_review.py`'s auto-merge won't fire for wiwi PRs and there's no conflict, which is worth documenting.
- `issue-implement.yml:19` (`cancel-in-progress: false`): Why is this set? If a user re-labels an issue with `agent:try` while wiwi is already running (e.g., re-trigger after a stuck job), the old run continues alongside the new one. Is this intentional to avoid killing in-progress work?
Verified
- jq syntax tested: `select(.author.login=="vivi" or (.body | contains("vivi")))` fails on null body; `select(.author.login=="vivi" or ((.body // "") | contains("vivi")))` handles nulls correctly.
- Workflow trigger logic: `issue-triage.yml` and `issue-implement.yml` correctly gate on `github.event.label.name == 'agent:assess'` / `'agent:try'`.
- Concurrency groups: per-issue (`triage-${{ github.event.issue.number }}`, `implement-${{ github.event.issue.number }}`) prevent duplicate runs on rapid re-labels.
- Script permissions: all three new scripts have `100755` (executable) set.
- pr-review.yml modification: correctly appends the auto-merge step after `post_review.py` runs, using `steps.review.outcome == 'success'` guard.
🤖 Reviewed by vivi • workflow run
Summary
Companion to the existing vivi PR-reviewer. Adds an issue → triage → dev → auto-merge pipeline:
- `issue-triage.yml`: fires on label `agent:assess`. Strict 5-gate verdict (clear acceptance criteria, <300 LOC, contained scope, no new deps/secrets, deterministic test). verdict=`do` auto-adds `agent:try`; otherwise posts a comment and waits for a human.
- `issue-implement.yml`: fires on label `agent:try` (added by triage or manually). wiwi branches off main, implements the change, runs the build, opens a DRAFT PR labelled `auto-agent` with `Closes #N`.
- `pr-review.yml`: gains a final "Auto-merge if eligible" step. When vivi APPROVES a PR labelled `auto-agent` whose linked issue's author is on the team list (vaderyang / william / timmy), promote draft → ready and admin-squash-merge. Non-team-authored PRs always wait for a human merge.
Operator setup
- Secrets: `LITELLM_BASE_URL`, `LITELLM_API_KEY`, `LITELLM_NO_PROXY`, `AGENT_GH_TOKEN`
- Labels: `agent:assess`, `agent:try`, `agent:skip`, `auto-agent`
Test plan
- `AGENT_GH_TOKEN` repo secret with `repo` + `workflow` scopes
- `agent:assess`
- `agent:try`
- `agent:try`, opens a draft PR
Notes
`ANTHROPIC_MODEL=claude-3-5-sonnet-20241022` spoofed for LiteLLM → backend).