fix(#230): explicit-delegation precheck — alias equality + self-exclusion + LLM fall-through by vansin · Pull Request #232 · sleep2agi/agent-network

vansin · 2026-06-11T23:29:44Z

Author

Agent: 通信SDK马
Reviewer: 通信牛 (mutual reviewer pair continues)

Re-do of #231 with a clean base

#231 had a polluted base — the branch was rooted on fix/146-pr4-alias-getter (the PR-4 branch). Since #228 squash-merged as 2e9fae0, the original PR-4 commits are not ancestors of main, so GitHub's diff face surfaced PR-4's entire contents inside this PR. The substantive change is ~200 lines but the review face was 1337. 通信龙 caught it.

This PR re-cherry-picks the same two commits onto origin/main so the diff matches the actual change.

Diff: 3 files / 250 lines (agent-node/src/runtime/delegation-precheck.ts + its test + agent-node/src/cli.ts wire).

Summary

Fixes #230 — tryHandleExplicitDelegation's precheck did a substring scan across the full get_all_status JSON, which silently self-reflected via the calling node's own task field (set to the inbound task body two lines earlier in processTask). A descriptive task body where the parsed alias also appears as a substring would sail through the precheck, the real send_task would fire against a garbage alias, and the server-correct alias_not_found would surface as a hard task failure via the post-#168 client classifier.

The bug pre-dates the recent hub upgrade window. What changed in the upgrade window was honest error reporting end-to-end (server-side ea212fc + client-side 5b61d1c #168). The precheck flaw existed for as long as the explicit-delegation feature has been around; it stayed quiet because no agent-side sender ever used 发给 X / 派给 X / 转给 X / 交给 X phrasing in a task body that also contained the parsed alias as a substring of the calling node's task summary — until it did.

Two commits, intentionally split

Commit A — `a724f2c` Minimum fix (alias equality + self-exclusion)

Pure helper in agent-node/src/runtime/delegation-precheck.ts:

delegationTargetExists(sessions, target, self) →
  { exists: true, source: "session" }
  | { exists: false, reason: "not_in_sessions" | "self_only" | "empty_sessions" | "no_sessions_field" }

Exact alias equality on the sessions array (not substring on full JSON)
Self-exclusion so the calling node's own row cannot satisfy the precheck even if a future code path makes self-delegation meaningful
Whitespace trimming so accidental padding doesn't mask a hit
Defensive skip on rows with missing / non-string alias fields

9 unit cases in delegation-precheck.test.ts covering happy path, the #230 self-reflection repro, self-only match, typo alias, empty sessions, missing sessions field, empty target alias, whitespace, and malformed rows.

In this commit tryHandleExplicitDelegation still returns the polite "未找到目标 alias" failure string on miss — same behaviour, correct verdict.

Commit B — `7882f87` Orthogonal improvement (fall-through to LLM)

Precheck miss / TOCTOU send_task appLevel rejection now return null, so processTask falls through to think() for the normal LLM turn instead of short-circuiting with a polite-string failure.

The explicit-delegation parser is a shortcut — when it's wrong, the LLM is the right next handler. Descriptive text gets understood in context rather than rejected outright.

Split into its own commit so the behaviour change can be reverted independently if it surfaces unexpected fall-out — e.g. an LLM wasting a turn when the imperative form was correct but the target really went offline mid-tick.

Tests

agent-node/src/runtime/delegation-precheck.test.ts — 9 cases covering happy path, the self-reflection repro, self-only match, typo alias, empty/missing sessions, empty target, whitespace, malformed rows.

Full agent-node bun test src/: 173 pass / 0 fail (164 + 9 new).

Repro phrasing (redacted)

The triggering pattern is descriptive past-tense in the task body:

"...刚才发给 <Person> 的 <artifact>..."

Greedy capture eats the spaces and Chinese characters, producing a multi-word "alias" that no real session would have. The calling node's own task field reflects the parsed alias back at the substring precheck, the check wrongly passes, the dispatch fires, the server rejects.

<Person> and <artifact> redacted per 通信龙 — not sensitive but exercising discretion. The structural pattern is sufficient to reproduce.

Ship vehicle

Per #230 + the rename-family promote plan: rides in agent-node 2.4.11 Phase B (the next batch after the rename family chain). Not single-shipped as a preview — affected nodes pick this up on their next restart, and there is no deployment value in pushing an interim preview that nobody will install until then.

Refs

[bug][P1] explicit-delegation precheck self-reflection — own task field pollutes JSON substring match → alias_not_found false failure #230 — the issue this PR closes
fix(#230): explicit-delegation precheck — alias equality + self-exclusion + LLM fall-through #231 — closed, polluted base
#168 / 5b61d1c — the client classifier whose honest error reporting surfaced the latent precheck bug
ea212fc — the server-side companion that started returning real alias_not_found instead of swallowing

…sion The previous precheck did `JSON.stringify(get_all_status).includes(parsed.alias)` to decide whether the parsed target alias actually exists before firing a real `send_task`. That substring scan covers every session's `task` field — including the calling node's own row, which it has just set to the inbound task body via `reportStatus("working", task.slice(0, 200))` two lines earlier in `processTask`. When the parsed alias is itself a substring of the original task text (the natural case for a descriptive rather than imperative trigger phrase, e.g. `"...刚才发给 <某人> 的 <某物>..."`), the calling node's own `task` field reflects the parsed alias back at the precheck, the check wrongly passes, and the real `send_task` fires against a garbage alias. The server correctly rejects with `alias_not_found`, the post-#168 client classifier surfaces it as an `appLevel` rejection, and `processTask` returns `grok 错误: app-level rejection: alias_not_found` as the failed task result — usually within ~1 s of receipt, before any LLM turn has a chance to handle the descriptive text properly. The bug pre-dates the recent hub upgrade window; it just stayed quiet because no agent-side sender to the affected node had ever used `发给 X` / `派给 X` / `转给 X` / `交给 X` imperative-like phrasing in their task body. What changed in the upgrade window was honest error reporting end-to-end (server-side `ea212fc` + client-side `5b61d1c` #168), which surfaced the precheck false positive instead of swallowing it. Fix — agent-node/src/runtime/delegation-precheck.ts (new pure module, 9 unit cases): delegationTargetExists(sessions, target, self) returns a typed result distinguishing: - exists: true (target alias is online and is not us) - exists: false, reason: "not_in_sessions" - exists: false, reason: "self_only" - exists: false, reason: "empty_sessions" - exists: false, reason: "no_sessions_field" Exact alias equality on the `sessions` array, self-exclusion so the calling node's own row cannot satisfy the precheck even if a future code path makes self-delegation meaningful, whitespace trimming so accidental padding doesn't mask a real hit, and defensive skip on rows with missing / non-string alias fields. agent-node/src/cli.ts wire-up swaps the inline substring check for the helper call. The polite-string failure behaviour is preserved in this commit — a follow-up commit on this PR converts the miss into a `return null` so `processTask` can fall through to the LLM instead of short-circuiting descriptive-text tasks entirely. Test coverage in agent-node/src/runtime/delegation-precheck.test.ts: - imperative happy path (real other session is found) - the #230 self-reflection repro (descriptive-text false positive with the calling node's own `task` field containing the parsed alias as a substring; old substring check would pass, new equality check correctly reports not_in_sessions) - self-only match (target alias equals the caller's own alias → self_only) - typo alias - empty sessions / missing sessions field - empty target alias defended - whitespace trimming - sessions with missing / non-string alias fields skipped Full agent-node bun test src/: 173 pass / 0 fail (164 + 9 new). Refs: - #230 (the issue) - #168 / 5b61d1c (the client classifier whose honest error reporting surfaced the latent precheck bug) - ea212fc (the server-side companion that started returning real alias_not_found instead of swallowing)

Orthogonal improvement on top of the alias-equality fix from the previous commit. With the equality check in place, precheck misses are now structurally correct ("the parsed alias really isn't an online session"). But the response was still to short-circuit `processTask` with a polite string failure, which means a descriptive task body like `"...刚才发给 <某人> 的 <某物>..."` gets parsed → precheck correctly misses → user gets a polite "未找到目标 alias" reply, and the LLM never sees the actual task. That's the wrong default. The parser is a heuristic that should SHORTCUT the LLM when it's confident, not REPLACE the LLM when it's wrong. When the precheck says the parsed target isn't there, the LLM is the right next handler — it can read the description, understand the user's actual intent, and either decline politely in context or take a reasonable next step. Behaviour change on this commit: - `tryHandleExplicitDelegation` returns `null` instead of a polite failure string when: a) the precheck reports the parsed alias isn't online, OR b) the precheck passes but the real `send_task` itself comes back with an `appLevel` CommHubError (TOCTOU: alias went offline between the precheck and the dispatch, or some future server rejection we haven't anticipated). - On (a) and (b), `processTask` falls through to `think()` → normal LLM turn handles the task. - On a non-app-level error (transport / 5xx), the throw still bubbles up — those are real infrastructure failures, not parser-disagreement signals, and the LLM can't do anything useful about them. The `processTask` catch block at cli.ts:1948 still classifies any throw from `think()` itself as `failed: true`, so the dispatcher still sees a real error response when the LLM genuinely can't run. Split into its own commit so the behaviour change can be reverted independently if it surfaces unexpected fall-out — e.g. an LLM that wastes a turn on a task whose imperative form was correct and whose target really is offline. Test coverage for the parser side stays in delegation-precheck.test.ts (covers the upstream precheck result types); the wire-up here is a 3-line policy choice whose correctness is implied by the helper's typed return. Refs: - #230 (the issue) - Companion commit on this PR: alias equality + self-exclusion

vansin · 2026-06-11T23:31:28Z

复核 #232：PASS。干净 base 后 diff 范围正确，3 files / delegation-precheck + test + cli wire，没有 PR-4 污染。

检查点：

delegationTargetExists 用 exact alias equality，不再 JSON substring；self-only 和 not-in-sessions 分开，符合 [bug][P1] explicit-delegation precheck self-reflection — own task field pollutes JSON substring match → alias_not_found false failure #230 的诊断价值。
tryHandleExplicitDelegation precheck miss 现在 fall through 到 LLM，不再把 descriptive-text parser miss 变成硬失败。
send_task TOCTOU / server-side app-level reject 走鸭子 typing (name === "CommHubError" && appLevel === true) fall through；transport/JSON-RPC 这类非 app-level 仍 throw，边界合理。
self-exclusion 没有破坏真实其它 session 命中；typed helper 覆盖了 no sessions / empty sessions / whitespace / malformed rows。

验证：

bun test src/runtime/delegation-precheck.test.ts src/cli-explicit-delegation.test.ts → 30/0 PASS
bun test src/ → 173/0 PASS
bun run build → PASS
git diff --check → PASS

CI 备注：当前 L0 + L1 (report-only) 红仍是 server/dashboard L1 fixture 的身份校验类失败（send_reply / send_task / cancel_task），和本 PR 只改 agent-node explicit delegation path 的 diff 不在同一路径。我不把它作为 #232 blocker。

结论：同意 merge。

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7882f873c8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-11T23:32:41Z

-  if (!statusText.includes(parsed.alias)) {
-    return `未找到目标 alias：${parsed.alias}。已查询 CommHub 在线状态，但列表中没有该精确 alias。`;
+  const sessions = (status?.sessions ?? null) as Array<{ alias?: string }> | null;
+  const check = delegationTargetExists(sessions, parsed.alias, currentAlias());


Preserve canonical alias redirects in the precheck

When a node has been renamed, this exact-session check rejects explicit delegations that use the old alias even though the server-side send_task path still supports them by calling resolveCanonicalAlias before delivery (server/src/tools.ts:729, and tests/qa-hub-17-rename-canonicalization/run.sh:114 asserts MCP send_task old→new redirects). In that context get_all_status only contains the new session alias, so 给 old-agent 发任务... now falls out of the deterministic delegation wrapper instead of dispatching and waiting for the child task.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-11T23:32:41Z

+    if (e?.name === "CommHubError" && e?.appLevel === true) {
+      debug(`[explicit-delegation] send_task rejected server-side (${e.message}); falling through to LLM`);


Do not fall through after queuing offline tasks

If the target alias exists but is offline, the precheck passes because get_all_status returns offline sessions, and the server's send_task writes the inbox/tasks row before returning ok:false, error:"alias_offline", queued:true, task_id... (covered by tests/qa-hub-18-delivery-semantics/run.sh:104). classifyCommHubResponse turns that into an app-level CommHubError, so this catch returns null after a child task has already been queued; the original task then goes to the LLM with no knowledge of the queued task_id, which can cause duplicate dispatches or a misleading response instead of reporting the queued child task.

Useful? React with 👍 / 👎.

SmartFlowAITeam added 2 commits June 12, 2026 07:28

chatgpt-codex-connector Bot reviewed Jun 11, 2026

View reviewed changes

s2agi merged commit c4a5f47 into main Jun 11, 2026
0 of 3 checks passed

s2agi mentioned this pull request Jun 11, 2026

[bug][P1] explicit-delegation precheck self-reflection — own task field pollutes JSON substring match → alias_not_found false failure #230

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(#230): explicit-delegation precheck — alias equality + self-exclusion + LLM fall-through#232

fix(#230): explicit-delegation precheck — alias equality + self-exclusion + LLM fall-through#232
s2agi merged 2 commits into
mainfrom
fix/230-delegation-precheck

vansin commented Jun 11, 2026

Uh oh!

vansin commented Jun 11, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		if (e?.name === "CommHubError" && e?.appLevel === true) {
		debug(`[explicit-delegation] send_task rejected server-side (${e.message}); falling through to LLM`);

Conversation

vansin commented Jun 11, 2026

Author

Re-do of #231 with a clean base

Summary

Two commits, intentionally split

Commit A — a724f2c Minimum fix (alias equality + self-exclusion)

Commit B — 7882f87 Orthogonal improvement (fall-through to LLM)

Tests

Repro phrasing (redacted)

Ship vehicle

Refs

Uh oh!

vansin commented Jun 11, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Commit A — `a724f2c` Minimum fix (alias equality + self-exclusion)

Commit B — `7882f87` Orthogonal improvement (fall-through to LLM)