5 changes: 4 additions & 1 deletion python/packages/core/AGENTS.md
@@ -120,7 +120,10 @@ agent_framework/

- **`AgentLoopMiddleware`** - `AgentMiddleware` that re-runs an agent in a loop by calling `call_next()` repeatedly (the pipeline re-reads `context.messages` each time). One configurable class covers two patterns: a required user `should_continue` predicate (sync or async, the first positional/keyword arg), and a chat-client judge built via the `.with_judge(...)` factory (a second chat client decides whether the original request was answered; loops while it is *not*, using a `JudgeVerdict` structured-output response — internally just an async `should_continue` predicate). The constructor covers the predicate pattern directly; only the judge has a convenience classmethod factory (`.with_judge(judge_client, ...)`) that forwards to `__init__`. Supports both streaming and non-streaming runs. By default a non-streaming run returns an aggregated `AgentResponse` containing every iteration's messages plus the injected `next_message` "nudge" messages (as `user` messages); set `return_final_only=True` to return only the last iteration's response. Streaming runs always yield each iteration's updates and emit the injected nudge messages as `user` updates between iterations (the `return_final_only` flag has no effect on streaming, and the final response reflects the last iteration; `MiddlewareTermination` is handled cleanly). `should_continue` is required; other constructor args are optional: `max_iterations` (safety cap; defaults to `DEFAULT_MAX_ITERATIONS`=10, explicit `None`→unbounded, positive int caps; `.with_judge` uses `DEFAULT_JUDGE_MAX_ITERATIONS`=5 as its default), `next_message` (defaults to a short "continue" nudge), `return_final_only`, and `additional_instructions` (an extra `system` message injected ahead of the input before the agent runs — becomes part of the original messages so it survives `fresh_context` resets and persists via a session). 
The judge is configured only through `.with_judge` (`judge_client`/`instructions`/`criteria`), not the constructor, and its `reasoning` is fed back to the agent as the next iteration's input; the judge forwards the original request messages and the agent's latest response messages verbatim so multi-modal content is preserved. `criteria` (a `list[str]`) is both injected as the agent's `additional_instructions` and rendered into the judge instructions wherever the `{{criteria}}` placeholder (`CRITERIA_PLACEHOLDER`) appears (`DEFAULT_JUDGE_INSTRUCTIONS` ends with it; custom `instructions` may include it, and it is stripped when no criteria are given). The `should_continue`/`next_message` callables are invoked with keyword args (`iteration`, `last_result`, `messages`, `original_messages`, `session`, `agent`, `progress`, `feedback`) and may be sync or async; declare only what you need plus `**kwargs`. `should_continue` may return a plain `bool` or a `(bool, str | None)` tuple whose second item is feedback surfaced to `next_message`/`record_feedback` via the `feedback` kwarg (the judge uses this to relay its `reasoning`). Stop precedence per iteration is `max_iterations` → `should_continue`, evaluated before `record_feedback` so the feedback is available to it.
- **Feedback tracking** - `record_feedback` captures a per-iteration progress entry (called with the loop kwargs; if it returns a truthy string the entry is appended, otherwise the agent's response text is used as the fallback entry). The accumulated log is exposed to every callback via the `progress` keyword (a per-iteration copy of prior entries) and, when `inject_progress=True` (default), injected into the next iteration's input as a `user` message (the full log without a session, only the latest entry with a session to avoid duplicating history). `fresh_context=True` restarts each iteration from the original task plus the progress log; when a session is attached it is snapshotted (`to_dict()`) before the loop and restored (`from_dict` + field copy) between iterations so the local transcript and any service-side conversation id reset too (in-loop working-state is discarded, pre-loop state preserved, continuity carried only by the progress log).
- **`todos_remaining(provider)`** / **`background_tasks_running(provider)`** - Helper factories returning `should_continue` predicates that loop while a `TodoProvider` has open items, or while a `BackgroundAgentsProvider`'s persisted state shows running tasks.
- **`todos_remaining(*, modes=None)`** / **`todos_remaining_message`** - Helper factories for todo-driven loops (the Python counterpart of .NET's `TodoCompletionLoopEvaluator`), designed for `create_harness_agent` but usable with any agent that registers a `TodoProvider` via `context_providers`. They resolve the `TodoProvider`/`AgentModeProvider` from the *running agent* (`agent.context_providers`, via `_resolve_context_provider`) rather than taking the provider as an argument, so they can be wired directly into `loop_should_continue`/`loop_next_message`. `todos_remaining` returns a `should_continue` predicate that loops while any todo is open; pass `modes=[...]` to gate looping to specific operating modes (case-insensitive; honors the `AgentModeProvider`'s `source_id`/`available_modes`), `modes=None` (default) applies in every mode, and an empty sequence raises `ValueError`. `todos_remaining_message` is a `next_message` callable that lists the still-open todo titles and tells the agent to finish them, returning `None` when the session/agent/provider is unavailable or nothing is open (in which case the middleware's default `None` handling applies: reuse the previous iteration's messages verbatim under the default `fresh_context=False`, or `DEFAULT_NEXT_MESSAGE` only when `fresh_context=True`).
- **`background_tasks_running(provider)`** - Helper factory returning a `should_continue` predicate that loops while a `BackgroundAgentsProvider`'s persisted state shows running tasks (takes the provider explicitly, unlike `todos_remaining`).
- **Approval escape hatch** - `_has_pending_approval_request(result)` checks whether an iteration's response carries a pending tool-approval request (any content with `type == "function_approval_request"`). Both the streaming and non-streaming loops stop and return that response to the caller *before* evaluating `should_continue`/`max_iterations` or injecting `next_message`, so the loop is HITL-safe even when wrapped outermost around a `ToolApprovalMiddleware` (mirrors the C# `LoopAgent`'s `HasPendingApprovalRequests`).
- **Harness integration** - `create_harness_agent` enables the loop when a `loop_should_continue` callable is passed; it prepends `AgentLoopMiddleware(loop_should_continue, max_iterations=loop_max_iterations, next_message=loop_next_message)` ahead of `ToolApprovalMiddleware` so the loop is the outermost middleware (each iteration is a full agent run including tool approval, and the escape hatch hands pending approvals back to the caller). `loop_next_message` and `loop_max_iterations` only take effect together with `loop_should_continue` (with no `loop_should_continue` there is no loop, so they are ignored); `loop_max_iterations` defaults to the loop's default cap (`None` → unbounded).
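
The callable conventions described in the bullets above (keyword-only invocation, sync or async callables, a `bool` or `(bool, str | None)` return, and the cap being checked before the predicate) can be sketched without the framework installed. The driver below, `run_loop_sketch`, and the helper `_call` are hypothetical stand-ins written for illustration; they are not `AgentLoopMiddleware`, and the 1-based iteration numbering and string-based "responses" are assumptions of this sketch.

```python
from __future__ import annotations

import asyncio
import inspect
from typing import Any


def should_continue(*, iteration: int, last_result: str, **kwargs: Any) -> tuple[bool, str | None]:
    """Sync predicate: declare only the kwargs needed, absorb the rest."""
    if "DONE" in last_result:
        return False, None
    # The second tuple item is feedback that is handed to next_message.
    return True, f"iteration {iteration} is not finished yet"


async def next_message(*, feedback: str | None = None, **kwargs: Any) -> str:
    """Async callable: builds the nudge for the next iteration."""
    return f"Please continue. Reviewer note: {feedback}"


async def _call(fn: Any, **kwargs: Any) -> Any:
    """Hypothetical helper: call fn and await the result only if needed."""
    result = fn(**kwargs)
    if inspect.isawaitable(result):
        result = await result
    return result


async def run_loop_sketch(replies: list[str], max_iterations: int | None = 10) -> list[str]:
    """Hypothetical driver that replays canned replies instead of an agent."""
    transcript: list[str] = []
    for iteration, reply in enumerate(replies, start=1):
        transcript.append(f"assistant: {reply}")
        # Stop precedence as described above: cap first, then the predicate.
        if max_iterations is not None and iteration >= max_iterations:
            break
        verdict = await _call(
            should_continue, iteration=iteration, last_result=reply, messages=[], progress=[]
        )
        # Accept either a plain bool or a (bool, feedback) tuple.
        keep_going, feedback = verdict if isinstance(verdict, tuple) else (verdict, None)
        if not keep_going:
            break
        nudge = await _call(next_message, iteration=iteration, last_result=reply, feedback=feedback)
        transcript.append(f"user: {nudge}")
    return transcript


if __name__ == "__main__":
    for line in asyncio.run(run_loop_sketch(["still working", "DONE"])):
        print(line)
```

In real use the two callables would be passed to `create_harness_agent` as `loop_should_continue` and `loop_next_message`; the driver exists here only so the calling convention can be exercised in isolation.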

### Workflows (`_workflows/`)

2 changes: 2 additions & 0 deletions python/packages/core/agent_framework/__init__.py
@@ -107,6 +107,7 @@
    JudgeVerdict,
    background_tasks_running,
    todos_remaining,
    todos_remaining_message,
)
from ._harness._memory import (
    DEFAULT_MEMORY_SOURCE_ID,
@@ -598,6 +599,7 @@
    "set_agent_mode",
    "step",
    "todos_remaining",
    "todos_remaining_message",
    "tool",
    "tool_call_args_match",
    "tool_called_check",
31 changes: 31 additions & 0 deletions python/packages/core/agent_framework/_harness/_agent.py
@@ -21,6 +21,7 @@
from .._sessions import ContextProvider, HistoryProvider, InMemoryHistoryProvider
from .._skills import SkillsProvider
from ._background_agents import BackgroundAgentsProvider
from ._loop import DEFAULT_MAX_ITERATIONS, AgentLoopMiddleware
from ._memory import MemoryContextProvider, MemoryStore
from ._mode import AgentModeProvider
from ._todo import TodoProvider
@@ -35,6 +36,7 @@
    from .._compaction import CompactionStrategy, TokenizerProtocol
    from .._middleware import MiddlewareTypes
    from .._tools import ToolTypes
    from ._loop import NextMessageCallable, ShouldContinueCallable
    from ._tool_approval import ToolApprovalRuleCallback

logger = logging.getLogger(__name__)
@@ -254,6 +256,9 @@ def create_harness_agent(
    disable_web_search: bool = False,
    disable_tool_auto_approval: bool = False,
    auto_approval_rules: Sequence[ToolApprovalRuleCallback] | None = None,
    loop_should_continue: ShouldContinueCallable | None = None,
    loop_next_message: NextMessageCallable | None = None,
    loop_max_iterations: int | None = DEFAULT_MAX_ITERATIONS,
    otel_provider_name: str | None = None,
    context_providers: Sequence[ContextProvider] | None = None,
    middleware: Sequence[MiddlewareTypes] | None = None,
@@ -273,6 +278,7 @@ def create_harness_agent(
- **BackgroundAgentsProvider** — delegate work to background sub-agents
- **Tool approval** — "don't ask again" standing approval rules plus heuristic
auto-approval callbacks
- **Looping** — re-run the agent until a ``should_continue`` predicate is satisfied
- **OpenTelemetry** — observability via ``AgentTelemetryLayer``

Each feature can be disabled or customized via keyword arguments.
@@ -380,6 +386,19 @@ def create_harness_agent(
content and returns ``True`` to approve it. Rules are evaluated after standing rules
(derived from prior user approvals) but before prompting the user. Only used when
``disable_tool_auto_approval`` is False.
loop_should_continue: Optional predicate that enables the looping middleware. When provided, the
agent is re-run in a loop (via :class:`~agent_framework.AgentLoopMiddleware`, wired as
the outermost middleware so each iteration is a full agent run including tool approval)
for as long as the predicate returns ``True``, up to ``loop_max_iterations``. If an
iteration returns a pending tool-approval request, the loop stops and returns it so the
caller can approve before continuing. When None (default), no loop is added.
loop_next_message: Optional callable controlling the input for the next loop iteration.
Only takes effect when ``loop_should_continue`` is set (otherwise no loop is added and
this is ignored).
loop_max_iterations: Safety cap on the number of loop iterations. ``None`` means unbounded;
a positive integer caps the loop (defaults to the loop middleware's default cap). Only
takes effect when ``loop_should_continue`` is set (otherwise no loop is added and this
is ignored).
otel_provider_name: Custom OpenTelemetry provider/source name for telemetry.
context_providers: Additional context providers to include after the built-in ones.
middleware: Additional middleware to include.
@@ -475,9 +494,21 @@ def create_harness_agent(
    # placed first so it sits outermost: it intercepts inbound "always approve" responses and
    # outbound approval requests at the caller boundary, and its re-invocation loop re-runs any
    # user-supplied middleware. ToolApprovalMiddleware requires an AgentSession at run time.
    # When should_continue is supplied, the loop is prepended ahead of tool approval so it sits
    # outermost of all: each loop iteration is a full agent run (including tool approval), and the
    # loop's approval escape hatch returns any pending approval request to the caller.
    assembled_middleware: list[MiddlewareTypes] = []
    if not disable_tool_auto_approval:
        assembled_middleware.append(ToolApprovalMiddleware(auto_approval_rules=auto_approval_rules))
    if loop_should_continue is not None:
        assembled_middleware.insert(
            0,
            AgentLoopMiddleware(
                loop_should_continue,
                max_iterations=loop_max_iterations,
                next_message=loop_next_message,
            ),
        )
    if middleware:
        assembled_middleware.extend(middleware)
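
The resulting middleware order follows from the append / `insert(0, ...)` / extend sequence in the hunk above. The sketch below reproduces that sequence with string labels in place of the real middleware objects; `assemble_order` and `"LoggingMiddleware"` are hypothetical names used only for illustration.

```python
from __future__ import annotations

from typing import Sequence


def assemble_order(
    *,
    loop_enabled: bool,
    approval_enabled: bool = True,
    user_middleware: Sequence[str] = (),
) -> list[str]:
    """Mirror the append / insert(0) / extend sequence using string labels."""
    order: list[str] = []
    if approval_enabled:
        order.append("ToolApprovalMiddleware")
    if loop_enabled:
        # Prepending makes the loop the outermost middleware.
        order.insert(0, "AgentLoopMiddleware")
    # User-supplied middleware always comes after the built-in entries.
    order.extend(user_middleware)
    return order


if __name__ == "__main__":
    print(assemble_order(loop_enabled=True, user_middleware=["LoggingMiddleware"]))
    # prints ['AgentLoopMiddleware', 'ToolApprovalMiddleware', 'LoggingMiddleware']
```

Because the loop is inserted at index 0 after tool approval has been appended, it stays outermost even when tool auto-approval is disabled, in which case it is simply the first entry ahead of any user middleware.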
