feat: add pre-approval tool input guardrails #3487
Open
seratch wants to merge 1 commit into
Nice boundary to make opt-in. One invariant that may be worth adding to the tests/docs here: a human approval should authorize the same validated tool request, not just the same tool name after time has passed. A small approval receipt shape would make that inspectable for both regular runner and realtime sessions: tool call id, tool name, input hash or canonicalized args, guardrail config/version, approval decision time, and post-approval revalidation result. Then the regression test can prove that if args or guardrail behavior change between pending approval and resume, execution fails closed or requests approval again instead of treating the older approval as current authority.
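To make the suggested receipt concrete, here is a minimal standalone sketch of what such a shape and its resume-time check could look like. None of these names (`ApprovalReceipt`, `hash_tool_input`, `approval_still_valid`) come from the SDK; they are hypothetical, and the hash is just SHA-256 over JSON with sorted keys as one possible canonicalization.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any, Optional


def hash_tool_input(args: dict[str, Any]) -> str:
    """Hash canonicalized args so key order does not affect the result."""
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ApprovalReceipt:
    """Hypothetical record of what a human approval actually covered."""

    tool_call_id: str
    tool_name: str
    input_hash: str
    guardrail_version: str
    approved_at: float  # approval decision time, seconds since the epoch
    revalidation_passed: Optional[bool] = None  # filled in on resume


def approval_still_valid(
    receipt: ApprovalReceipt,
    tool_call_id: str,
    tool_name: str,
    args: dict[str, Any],
    guardrail_version: str,
) -> bool:
    """Return True only if the resumed request matches the approved one.

    Any mismatch means the caller should fail closed or ask for
    approval again rather than reuse the older approval.
    """
    return (
        receipt.tool_call_id == tool_call_id
        and receipt.tool_name == tool_name
        and receipt.input_hash == hash_tool_input(args)
        and receipt.guardrail_version == guardrail_version
    )
```

A regression test along these lines would build a receipt at approval time, then call `approval_still_valid` on resume with changed args or a changed guardrail version and assert that it returns `False`.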
This pull request adds an opt-in `pre_approval_tool_input_guardrails` setting for local function tools in both regular runner execution and realtime sessions.

When enabled, function-tool input guardrails run before a pending human approval interruption is emitted. If the guardrail rejects the call, the SDK returns the guardrail message as tool output without surfacing an approval request or executing the tool. Calls that pass the pre-approval check still run the same input guardrails again immediately before execution after approval, so time-sensitive checks are revalidated on resume.
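The ordering described above can be illustrated with a small standalone sketch. This is not the SDK's implementation or API: `run_tool_call`, its parameters, and the "not approved" message are hypothetical, and the guardrail is modeled as a plain function that returns a rejection message or `None`.

```python
from typing import Any, Callable, Optional

# Hypothetical guardrail type: returns a rejection message, or None to allow.
Guardrail = Callable[[dict[str, Any]], Optional[str]]


def run_tool_call(
    args: dict[str, Any],
    guardrail: Guardrail,
    request_approval: Callable[[dict[str, Any]], bool],
    tool: Callable[[dict[str, Any]], str],
    pre_approval_guardrails: bool = False,
) -> str:
    """Model the guardrail/approval ordering for one function-tool call."""
    if pre_approval_guardrails:
        # Opt-in check before any approval request is surfaced.
        rejection = guardrail(args)
        if rejection is not None:
            # Rejected: no approval request, no tool execution.
            return rejection

    if not request_approval(args):
        # Assumed placeholder output for a declined approval.
        return "Tool call was not approved."

    # The same guardrail runs again right before execution, so
    # time-sensitive checks are revalidated after approval.
    rejection = guardrail(args)
    if rejection is not None:
        return rejection

    return tool(args)
```

With the flag off, a rejecting guardrail still blocks execution in this model, but only after the human has already been asked to approve. With the flag on, the human never sees a request that the guardrail would reject at that moment.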
see also: openai/openai-agents-js#1358