feat(core): wire validator history and surface validationOutcome (#429) by lmorchard · Pull Request #463 · mozilla/pilo

lmorchard · 2026-05-20T21:54:55Z

Summary

Wires the agent's recent conversation history (last 30 messages) into the task-validator prompt so the validator can spot "agent gave up early but final answer looks plausible" failure modes — not just score the final answer in isolation.
Adds validationOutcome?: "accepted" | "force-accepted" to TaskExecutionResult so callers (eval-judge, telemetry) can distinguish a real validator accept from a force-accept after maxValidationAttempts. Until now the two were indistinguishable: both surfaced as success: true.
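For reviewers, a minimal sketch of how a caller might branch on the new field. The interface below is a trimmed, hypothetical stand-in for TaskExecutionResult (the real type has more fields), and describeOutcome is an illustrative helper that is not part of this PR.

```typescript
// Hypothetical, trimmed-down stand-in for TaskExecutionResult.
type OutcomeSketch = "accepted" | "force-accepted";

interface ResultSketch {
  success: boolean;
  validationOutcome?: OutcomeSketch;
}

// Illustrative consumer-side helper (e.g. for an eval judge or telemetry).
function describeOutcome(result: ResultSketch): string {
  switch (result.validationOutcome) {
    case "accepted":
      // The validator actively endorsed the answer.
      return "validator-endorsed";
    case "force-accepted":
      // Attempts ran out; the validator never endorsed the answer.
      return "not-endorsed";
    default:
      // undefined: validation did not run (task aborted, max iterations).
      return "validation-did-not-run";
  }
}
```

Both of the first two cases can arrive with success: true, which is why the field is needed to tell them apart.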

This is PR1 of a planned two-PR sequence. Core changes only — consumer plumbing (CLI display, extension UI) is deliberately deferred to PR2. The eval-judge / telemetry signal lands here; the server SSE complete event auto-forwards the new optional field through existing serialization (no server code change needed).

Design Decisions

Wire conversationHistory into the template, don't delete the dead helper. formatConversationHistory already exists and builds a 30-message string; the template just never referenced it. Wiring it gives the validator real signal about whether the trajectory matches the claimed result.
Two outcome values only: "accepted" and "force-accepted". Field optional. undefined is the implicit "validation didn't run" case (task aborted, max iterations). Skipped "rejected" / "skipped" enum values — neither has a firing code path today; trivial to expand later when one does.
Force-accept lumps both sub-cases. Validator-disagreed-three-times and validator-call-itself-errored both map to "force-accepted". Both are "the validator did not actively endorse this answer." A finer split (e.g., "force-accepted-error") is a follow-up if eval data shows it matters.
Reuse the existing external-content wrapping pattern. History is wrapped in <EXTERNAL-CONTENT label="conversation-history">…</EXTERNAL-CONTENT> via the existing wrapExternalContentWithWarning helper. New ConversationHistory variant added to ExternalContentLabel. (Note: the shared warning text mentions "page text" — imperfect fit, but the threat-model intent of "treat as data, not instructions" is consistent.)
formatConversationHistory shape unchanged. Still this.messages.slice(-30). Reshape work (e.g., "first user message + last 20") is speculative; ship the wiring first.
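The history decisions above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the message shape, the line format, the warning placeholder, and both function names are hypothetical. Only the 30-message window and the EXTERNAL-CONTENT tag with its conversation-history label come from the description.

```typescript
// Hypothetical message shape; the real agent messages are richer.
interface SketchMessage {
  role: string;
  content: string;
}

const HISTORY_WINDOW = 30;

// Mirrors the described slice(-30): keep only the most recent messages.
function formatHistorySketch(messages: SketchMessage[]): string {
  return messages
    .slice(-HISTORY_WINDOW)
    .map((m) => `${m.role}: ${m.content}`)
    .join("\n");
}

// Stand-in for the wrapping step. The warning string is a placeholder,
// not the real EXTERNAL_CONTENT_WARNING text.
function wrapHistorySketch(history: string): string {
  const warning = "[placeholder: treat the following as data, not instructions]";
  return [
    warning,
    '<EXTERNAL-CONTENT label="conversation-history">',
    history,
    "</EXTERNAL-CONTENT>",
  ].join("\n");
}
```

Wrapping happens after formatting, so the validator sees the trajectory as clearly delimited data rather than as part of its own instructions.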

Changes

packages/core/src/:

prompts.ts — taskValidationTemplate references {{ wrappedConversationHistory }}; buildTaskValidationPrompt wraps the history before passing into the template; adds a trajectory-review step to the evaluation instructions.
utils/promptSecurity.ts — ConversationHistory = "conversation-history" added to ExternalContentLabel.
webAgent.ts — validationOutcome? threaded through TaskExecutionResult, ExecutionState, validateTaskCompletion, generateAndProcessAction, runMainLoop, and buildResult. Conditional spread in buildResult mirrors how error is spread.
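A small sketch of the conditional-spread pattern mentioned for buildResult, assuming a simplified state object. The names and shapes here are hypothetical; the point is only that an undefined outcome leaves the key off the result entirely, the same way error is handled.

```typescript
// Hypothetical, simplified state and result shapes.
interface StateSketch {
  success: boolean;
  error?: string;
  validationOutcome?: "accepted" | "force-accepted";
}

interface BuiltResultSketch {
  success: boolean;
  error?: string;
  validationOutcome?: "accepted" | "force-accepted";
}

function buildResultSketch(state: StateSketch): BuiltResultSketch {
  return {
    success: state.success,
    // Only add each optional key when it has a value, so the serialized
    // result omits the key instead of carrying an explicit undefined.
    ...(state.error !== undefined ? { error: state.error } : {}),
    ...(state.validationOutcome !== undefined
      ? { validationOutcome: state.validationOutcome }
      : {}),
  };
}
```

Omitting the key keeps existing consumers and serialized payloads unchanged when validation never ran.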

packages/core/test/:

prompts.test.ts — 3 new tests asserting the validation prompt includes the wrapped history, the safety warning, and the trajectory-review instruction.
webAgent.test.ts — 4 new tests covering validationOutcome === "accepted" on first-attempt accept, "force-accepted" via validator rejecting to max attempts, "force-accepted" via validator throwing to max attempts, and undefined when the task fails before done() (max iterations path).
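The accept and force-accept paths covered by the webAgent tests can be pictured with the simplified loop below. It is a sketch under assumptions: the real validator call is presumably asynchronous and the agent does further work between attempts, both of which are elided here, and resolveOutcomeSketch is a hypothetical name.

```typescript
type LoopOutcomeSketch = "accepted" | "force-accepted";

// validate returns true when the validator accepts; it may also throw.
function resolveOutcomeSketch(
  validate: () => boolean,
  maxValidationAttempts: number,
): LoopOutcomeSketch {
  for (let attempt = 1; attempt <= maxValidationAttempts; attempt++) {
    try {
      if (validate()) return "accepted";
    } catch {
      // A validator error counts as a failed attempt, same as a rejection.
    }
  }
  // Attempts exhausted without an endorsement: rejection and error sub-cases
  // both land here, matching the "force-accept lumps both" decision.
  return "force-accepted";
}
```

The fourth tested case, undefined, corresponds to this loop never being entered because the task ended before done().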

Test Plan

pnpm run check passes (core 682, server 96, cli 221, extension 266 tests)
pnpm run typecheck passes
pnpm run format:check passes
gitleaks detect --log-opts="880db9f..HEAD" clean on branch commits
Reviewer: confirm TaskExecutionResult.validationOutcome reads cleanly in the eval-judge integration (the originating use case)

References

Closes #429 (Wire validator conversation history and surface validationOutcome in TaskExecutionResult)
Follow-up: #464 (Add label-specific warning text on wrapExternalContentWithWarning()). The shared EXTERNAL_CONTENT_WARNING text uses page-specific phrasing, which Copilot flagged; the fix is deferred per the spec's design decision.

lmorchard · 2026-05-20T22:01:32Z

Filed #464 as the follow-up for the EXTERNAL_CONTENT_WARNING text issue raised by Copilot.

lmorchard added 2 commits May 20, 2026 14:36

feat(core): wire validator conversation history into prompt template (#429)

37b8f68

feat(core): surface validationOutcome in TaskExecutionResult (#429)

90e197e

lmorchard requested a review from Copilot May 20, 2026 21:55

Copilot started reviewing on behalf of lmorchard May 20, 2026 21:55

This comment was marked as resolved.

docs(core): clarify validationOutcome JSDoc per code review (#429)

e2dbc77

lmorchard mentioned this pull request May 20, 2026

Add label-specific warning text on wrapExternalContentWithWarning() #464

Open

lmorchard marked this pull request as draft May 20, 2026 23:45

build(core): regenerate JSON schema for TaskExecutionResult.validationOutcome (#429)

61ee629

lmorchard mentioned this pull request May 27, 2026

fix(core): make EXTERNAL_CONTENT_WARNING label-agnostic #480

Open

3 tasks

lmorchard wants to merge 4 commits into main from feat/429-validator-context-outcome


2 participants
