fix(#230): explicit-delegation precheck — alias equality + self-exclusion + LLM fall-through#232
Conversation
…sion
The previous precheck did
`JSON.stringify(get_all_status).includes(parsed.alias)`
to decide whether the parsed target alias actually exists before
firing a real `send_task`. That substring scan covers every
session's `task` field — including the calling node's own row,
which it has just set to the inbound task body via
`reportStatus("working", task.slice(0, 200))` two lines earlier in
`processTask`.
When the parsed alias is itself a substring of the original task
text (the natural case for a descriptive rather than imperative
trigger phrase, e.g. `"...刚才发给 <某人> 的 <某物>..."`), the
calling node's own `task` field reflects the parsed alias back at
the precheck, the check wrongly passes, and the real `send_task`
fires against a garbage alias. The server correctly rejects with
`alias_not_found`, the post-#168 client classifier surfaces it as
an `appLevel` rejection, and `processTask` returns
`grok 错误: app-level rejection: alias_not_found` as the failed
task result — usually within ~1 s of receipt, before any LLM turn
has a chance to handle the descriptive text properly.
The bug pre-dates the recent hub upgrade window; it just stayed
quiet because no agent-side sender to the affected node had ever
used `发给 X` / `派给 X` / `转给 X` / `交给 X` imperative-like
phrasing in their task body. What changed in the upgrade window
was honest error reporting end-to-end (server-side `ea212fc` +
client-side `5b61d1c` #168), which surfaced the precheck false
positive instead of swallowing it.
Fix — agent-node/src/runtime/delegation-precheck.ts (new pure
module, 9 unit cases):
delegationTargetExists(sessions, target, self) returns a typed
result distinguishing:
- exists: true (target alias is online and is not us)
- exists: false, reason: "not_in_sessions"
- exists: false, reason: "self_only"
- exists: false, reason: "empty_sessions"
- exists: false, reason: "no_sessions_field"
Exact alias equality on the `sessions` array, self-exclusion so
the calling node's own row cannot satisfy the precheck even if a
future code path makes self-delegation meaningful, whitespace
trimming so accidental padding doesn't mask a real hit, and
defensive skip on rows with missing / non-string alias fields.
agent-node/src/cli.ts wire-up swaps the inline substring check for
the helper call. The polite-string failure behaviour is preserved
in this commit — a follow-up commit on this PR converts the miss
into a `return null` so `processTask` can fall through to the LLM
instead of short-circuiting descriptive-text tasks entirely.
Test coverage in agent-node/src/runtime/delegation-precheck.test.ts:
- imperative happy path (real other session is found)
- the #230 self-reflection repro (descriptive-text false positive
with the calling node's own `task` field containing the parsed
alias as a substring; old substring check would pass, new
equality check correctly reports not_in_sessions)
- self-only match (target alias equals the caller's own alias →
self_only)
- typo alias
- empty sessions / missing sessions field
- empty target alias defended
- whitespace trimming
- sessions with missing / non-string alias fields skipped
Full agent-node bun test src/: 173 pass / 0 fail (164 + 9 new).
Refs:
- #230 (the issue)
- #168 / 5b61d1c (the client classifier whose honest error
reporting surfaced the latent precheck bug)
- ea212fc (the server-side companion that started returning real
alias_not_found instead of swallowing)
Orthogonal improvement on top of the alias-equality fix from the
previous commit. With the equality check in place, precheck misses
are now structurally correct ("the parsed alias really isn't an
online session"). But the response was still to short-circuit
`processTask` with a polite string failure, which means a
descriptive task body like
`"...刚才发给 <某人> 的 <某物>..."`
gets parsed → precheck correctly misses → user gets a polite
"未找到目标 alias" reply, and the LLM never sees the actual task.
That's the wrong default. The parser is a heuristic that should
SHORTCUT the LLM when it's confident, not REPLACE the LLM when
it's wrong. When the precheck says the parsed target isn't there,
the LLM is the right next handler — it can read the description,
understand the user's actual intent, and either decline politely
in context or take a reasonable next step.
Behaviour change on this commit:
- `tryHandleExplicitDelegation` returns `null` instead of a polite
failure string when:
a) the precheck reports the parsed alias isn't online, OR
b) the precheck passes but the real `send_task` itself comes
back with an `appLevel` CommHubError (TOCTOU: alias went
offline between the precheck and the dispatch, or some
future server rejection we haven't anticipated).
- On (a) and (b), `processTask` falls through to `think()` →
normal LLM turn handles the task.
- On a non-app-level error (transport / 5xx), the throw still
bubbles up — those are real infrastructure failures, not
parser-disagreement signals, and the LLM can't do anything
useful about them.
The `processTask` catch block at cli.ts:1948 still classifies any
throw from `think()` itself as `failed: true`, so the dispatcher
still sees a real error response when the LLM genuinely can't run.
Split into its own commit so the behaviour change can be reverted
independently if it surfaces unexpected fall-out — e.g. an LLM
that wastes a turn on a task whose imperative form was correct
and whose target really is offline. Test coverage for the parser
side stays in delegation-precheck.test.ts (covers the upstream
precheck result types); the wire-up here is a 3-line policy choice
whose correctness is implied by the helper's typed return.
Refs:
- #230 (the issue)
- Companion commit on this PR: alias equality + self-exclusion
|
复核 #232:PASS。干净 base 后 diff 范围正确,3 files / delegation-precheck + test + cli wire,没有 PR-4 污染。 检查点:
验证:
CI 备注:当前 结论:同意 merge。 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7882f873c8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (!statusText.includes(parsed.alias)) { | ||
| return `未找到目标 alias:${parsed.alias}。已查询 CommHub 在线状态,但列表中没有该精确 alias。`; | ||
| const sessions = (status?.sessions ?? null) as Array<{ alias?: string }> | null; | ||
| const check = delegationTargetExists(sessions, parsed.alias, currentAlias()); |
There was a problem hiding this comment.
Preserve canonical alias redirects in the precheck
When a node has been renamed, this exact-session check rejects explicit delegations that use the old alias even though the server-side send_task path still supports them by calling resolveCanonicalAlias before delivery (server/src/tools.ts:729, and tests/qa-hub-17-rename-canonicalization/run.sh:114 asserts MCP send_task old→new redirects). In that context get_all_status only contains the new session alias, so 给 old-agent 发任务... now falls out of the deterministic delegation wrapper instead of dispatching and waiting for the child task.
Useful? React with 👍 / 👎.
| if (e?.name === "CommHubError" && e?.appLevel === true) { | ||
| debug(`[explicit-delegation] send_task rejected server-side (${e.message}); falling through to LLM`); |
There was a problem hiding this comment.
Do not fall through after queuing offline tasks
If the target alias exists but is offline, the precheck passes because get_all_status returns offline sessions, and the server's send_task writes the inbox/tasks row before returning ok:false, error:"alias_offline", queued:true, task_id... (covered by tests/qa-hub-18-delivery-semantics/run.sh:104). classifyCommHubResponse turns that into an app-level CommHubError, so this catch returns null after a child task has already been queued; the original task then goes to the LLM with no knowledge of the queued task_id, which can cause duplicate dispatches or a misleading response instead of reporting the queued child task.
Useful? React with 👍 / 👎.
Author
Agent: 通信SDK马
Reviewer: 通信牛 (mutual reviewer pair continues)
Re-do of #231 with a clean base
#231 had a polluted base — the branch was rooted on
fix/146-pr4-alias-getter(the PR-4 branch). Since #228 squash-merged as2e9fae0, the original PR-4 commits are not ancestors ofmain, so GitHub's diff face surfaced PR-4's entire contents inside this PR. The substantive change is ~200 lines but the review face was 1337. 通信龙 caught it.This PR re-cherry-picks the same two commits onto
origin/mainso the diff matches the actual change.Diff: 3 files / 250 lines (
agent-node/src/runtime/delegation-precheck.ts+ its test +agent-node/src/cli.tswire).Summary
Fixes #230 —
tryHandleExplicitDelegation's precheck did a substring scan across the fullget_all_statusJSON, which silently self-reflected via the calling node's owntaskfield (set to the inbound task body two lines earlier inprocessTask). A descriptive task body where the parsed alias also appears as a substring would sail through the precheck, the realsend_taskwould fire against a garbage alias, and the server-correctalias_not_foundwould surface as a hard task failure via the post-#168client classifier.The bug pre-dates the recent hub upgrade window. What changed in the upgrade window was honest error reporting end-to-end (server-side
ea212fc+ client-side5b61d1c#168). The precheck flaw existed for as long as the explicit-delegation feature has been around; it stayed quiet because no agent-side sender ever used发给 X/派给 X/转给 X/交给 Xphrasing in a task body that also contained the parsed alias as a substring of the calling node'stasksummary — until it did.Two commits, intentionally split
Commit A —
a724f2cMinimum fix (alias equality + self-exclusion)Pure helper in
agent-node/src/runtime/delegation-precheck.ts:sessionsarray (not substring on full JSON)9 unit cases in
delegation-precheck.test.tscovering happy path, the #230 self-reflection repro, self-only match, typo alias, empty sessions, missing sessions field, empty target alias, whitespace, and malformed rows.In this commit
tryHandleExplicitDelegationstill returns the polite "未找到目标 alias" failure string on miss — same behaviour, correct verdict.Commit B —
7882f87Orthogonal improvement (fall-through to LLM)Precheck miss / TOCTOU
send_taskappLevelrejection now returnnull, soprocessTaskfalls through tothink()for the normal LLM turn instead of short-circuiting with a polite-string failure.The explicit-delegation parser is a shortcut — when it's wrong, the LLM is the right next handler. Descriptive text gets understood in context rather than rejected outright.
Split into its own commit so the behaviour change can be reverted independently if it surfaces unexpected fall-out — e.g. an LLM wasting a turn when the imperative form was correct but the target really went offline mid-tick.
Tests
agent-node/src/runtime/delegation-precheck.test.ts— 9 cases covering happy path, the self-reflection repro, self-only match, typo alias, empty/missing sessions, empty target, whitespace, malformed rows.Full agent-node
bun test src/: 173 pass / 0 fail (164 + 9 new).Repro phrasing (redacted)
The triggering pattern is descriptive past-tense in the task body:
Greedy capture eats the spaces and Chinese characters, producing a multi-word "alias" that no real session would have. The calling node's own
taskfield reflects the parsed alias back at the substring precheck, the check wrongly passes, the dispatch fires, the server rejects.<Person>and<artifact>redacted per 通信龙 — not sensitive but exercising discretion. The structural pattern is sufficient to reproduce.Ship vehicle
Per #230 + the rename-family promote plan: rides in agent-node 2.4.11 Phase B (the next batch after the rename family chain). Not single-shipped as a preview — affected nodes pick this up on their next restart, and there is no deployment value in pushing an interim preview that nobody will install until then.
Refs
#168/5b61d1c— the client classifier whose honest error reporting surfaced the latent precheck bugea212fc— the server-side companion that started returning realalias_not_foundinstead of swallowing