feat(core): wire validator history and surface validationOutcome (#429)#463
Draft
lmorchard wants to merge 4 commits into
Draft
feat(core): wire validator history and surface validationOutcome (#429)#463lmorchard wants to merge 4 commits into
lmorchard wants to merge 4 commits into
Conversation
Collaborator
Author
|
Filed #464 as the follow-up for the |
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
validationOutcome?: "accepted" | "force-accepted"toTaskExecutionResultso callers (eval-judge, telemetry) can distinguish a real validator accept from a force-accept aftermaxValidationAttempts. Today these were indistinguishable: both surfaced assuccess: true.This is PR1 of a planned two-PR sequence. Core changes only — consumer plumbing (CLI display, extension UI) is deliberately deferred to PR2. The eval-judge / telemetry signal lands here; the server SSE
completeevent auto-forwards the new optional field through existing serialization (no server code change needed).Design Decisions
conversationHistoryinto the template, don't delete the dead helper.formatConversationHistoryalready exists and builds a 30-message string; the template just never referenced it. Wiring it gives the validator real signal about whether the trajectory matches the claimed result."accepted"and"force-accepted". Field optional.undefinedis the implicit "validation didn't run" case (task aborted, max iterations). Skipped"rejected"/"skipped"enum values — neither has a firing code path today; trivial to expand later when one does."force-accepted". Both are "the validator did not actively endorse this answer." A finer split (e.g.,"force-accepted-error") is a follow-up if eval data shows it matters.<EXTERNAL-CONTENT label="conversation-history">…</EXTERNAL-CONTENT>via the existingwrapExternalContentWithWarninghelper. NewConversationHistoryvariant added toExternalContentLabel. (Note: the shared warning text mentions "page text" — imperfect fit, but the threat-model intent of "treat as data, not instructions" is consistent.)formatConversationHistoryshape unchanged. Stillthis.messages.slice(-30). Reshape work (e.g., "first user message + last 20") is speculative; ship the wiring first.Changes
packages/core/src/:prompts.ts—taskValidationTemplatereferences{{ wrappedConversationHistory }};buildTaskValidationPromptwraps the history before passing into the template; adds a trajectory-review step to the evaluation instructions.utils/promptSecurity.ts—ConversationHistory = "conversation-history"added toExternalContentLabel.webAgent.ts—validationOutcome?threaded throughTaskExecutionResult,ExecutionState,validateTaskCompletion,generateAndProcessAction,runMainLoop, andbuildResult. Conditional spread inbuildResultmirrors howerroris spread.packages/core/test/:prompts.test.ts— 3 new tests asserting the validation prompt includes the wrapped history, the safety warning, and the trajectory-review instruction.webAgent.test.ts— 4 new tests coveringvalidationOutcome === "accepted"on first-attempt accept,"force-accepted"via validator rejecting to max attempts,"force-accepted"via validator throwing to max attempts, andundefinedwhen the task fails beforedone()(max iterations path).Test Plan
pnpm run checkpasses (core 682, server 96, cli 221, extension 266 tests)pnpm run typecheckpassespnpm run format:checkpassesgitleaks detect --log-opts="880db9f..HEAD"clean on branch commitsTaskExecutionResult.validationOutcomereads cleanly in the eval-judge integration (the originating use case)References
EXTERNAL_CONTENT_WARNINGtext — page-specific phrasing flagged by Copilot; deferred per spec design decision)