Three flag-gated, independent inference-time reliability features for the agent loop, each default-off and pure/unit-tested:
- Constrained (grammar) decoding for tool calls (provider/constrained.ts): builds a JSON-Schema envelope describing a valid tool call, so a local model server (vLLM, LM Studio, llama.cpp) can be constrained at the token level to emit a parseable, schema-correct call. A deterministic fix for unparseable tool calls that does not depend on the base model.
- Tool retrieval (tool/retrieval.ts): with ~78 tools, sending the full set every turn floods the context and hurts tool selection. This picks a relevant per-turn subset (an always-on core plus the lexically ranked top-k) and never drops a tool already referenced mid-trajectory. v1 is purely lexical (dependency-free, deterministic).
- Pre-execution critic gate (tool/critic.ts): before a side-effecting tool runs, a pluggable Verifier checks the proposed args; on a hard failure the call is denied and the reason is fed back to the model for retry. The default verifier allows everything (ungated); a real verifier is injected by the caller.
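For reviewers, the critic gate described in the last bullet can be pictured roughly as follows. This is a minimal sketch, not the code in tool/critic.ts: the `ToolCall` and `Verdict` shapes, the `gate` function, the `sideEffecting` set, and the `workspaceOnly` verifier are all hypothetical names chosen for illustration, and the sketch assumes a synchronous verifier.

```typescript
// Hypothetical shapes for illustration; the real tool/critic.ts may differ.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

interface Verdict {
  ok: boolean;
  reason?: string; // set on denial, fed back to the model for retry
}

type Verifier = (call: ToolCall) => Verdict;

// Default verifier: allows everything, so the gate is a no-op until a
// caller injects a real verifier.
const allowAll: Verifier = () => ({ ok: true });

// Only side-effecting tools are checked; other tools bypass the gate.
function gate(
  call: ToolCall,
  sideEffecting: ReadonlySet<string>,
  verify: Verifier = allowAll,
): Verdict {
  if (!sideEffecting.has(call.tool)) return { ok: true };
  return verify(call);
}

// Example injected verifier (assumption): deny writes outside /workspace/.
const workspaceOnly: Verifier = (call) => {
  const path = call.args["path"];
  if (typeof path === "string" && !path.startsWith("/workspace/")) {
    return { ok: false, reason: "path " + path + " is outside /workspace/" };
  }
  return { ok: true };
};
```

With the default verifier every call passes, which matches the "ungated" default; passing `workspaceOnly` turns a write to `/etc/hosts` into a denial whose `reason` can be returned to the model.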
Wired (flag-gated) into session/llm.ts. All three are independently toggleable and off by default.
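One plausible shape for the constrained-decoding envelope is a JSON Schema with one `anyOf` branch per tool, each pinning the tool name with `const` and reusing that tool's parameter schema. The sketch below is an assumption about the approach, not the contents of provider/constrained.ts: `ToolSpec`, `buildToolCallEnvelope`, and the `name`/`arguments` field names are hypothetical, and how the schema is handed to a given server is server-specific and not shown.

```typescript
type JsonSchema = Record<string, unknown>;

// Hypothetical tool description: a name plus a JSON Schema for its arguments.
interface ToolSpec {
  name: string;
  parameters: JsonSchema;
}

// Builds a schema that accepts exactly one call to exactly one known tool.
function buildToolCallEnvelope(tools: ToolSpec[]): JsonSchema {
  if (tools.length === 0) {
    throw new Error("at least one tool is required");
  }
  return {
    anyOf: tools.map((t) => ({
      type: "object",
      properties: {
        name: { const: t.name }, // the tool name must match exactly
        arguments: t.parameters, // arguments must satisfy the tool's schema
      },
      required: ["name", "arguments"],
      additionalProperties: false, // no stray keys in the envelope
    })),
  };
}

// Usage with two illustrative tools.
const envelope = buildToolCallEnvelope([
  {
    name: "read_file",
    parameters: {
      type: "object",
      properties: { path: { type: "string" } },
      required: ["path"],
    },
  },
  {
    name: "bash",
    parameters: {
      type: "object",
      properties: { command: { type: "string" } },
      required: ["command"],
    },
  },
]);
```

Because each branch ties a specific name to that tool's own argument schema, a decoder constrained by this envelope cannot pair one tool's name with another tool's arguments.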
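The tool-retrieval selection (always-on core, plus tools already referenced in the trajectory, plus a lexically ranked top-k) might look roughly like the sketch below. It is an illustration under assumptions rather than the code in tool/retrieval.ts: `ToolInfo`, `selectTools`, the option names, and the scoring rule (count of shared lowercase tokens, no stopword filtering) are all hypothetical.

```typescript
interface ToolInfo {
  name: string;
  description: string;
}

// Lowercase and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function selectTools(
  tools: ToolInfo[],
  query: string,
  opts: { core: string[]; referenced: string[]; k: number },
): string[] {
  // Pinned tools are always kept: the core set and anything used mid-trajectory.
  const pinned = new Set<string>(opts.core.concat(opts.referenced));
  const queryTokens = Array.from(new Set<string>(tokenize(query)));

  const ranked = tools
    .filter((t) => !pinned.has(t.name))
    .map((t) => {
      const toolTokens = new Set<string>(tokenize(t.name + " " + t.description));
      const score = queryTokens.filter((tok) => toolTokens.has(tok)).length;
      return { name: t.name, score };
    })
    .filter((r) => r.score > 0)
    // Ties break by name so the result is deterministic.
    .sort((a, b) => b.score - a.score || a.name.localeCompare(b.name))
    .slice(0, opts.k);

  const keep = new Set<string>(Array.from(pinned).concat(ranked.map((r) => r.name)));
  // Return names in catalog order.
  return tools.filter((t) => keep.has(t.name)).map((t) => t.name);
}

const catalog: ToolInfo[] = [
  { name: "read_file", description: "Read a file from disk" },
  { name: "grep", description: "Search file contents with a regex" },
  { name: "http_get", description: "Fetch a URL over HTTP" },
  { name: "git_commit", description: "Create a git commit" },
  { name: "bash", description: "Run a shell command" },
];

const selected = selectTools(catalog, "search the file for a regex", {
  core: ["bash"],
  referenced: ["git_commit"],
  k: 1,
});
// selected is ["grep", "git_commit", "bash"]
```

Here `git_commit` shares no tokens with the query but survives because it was referenced earlier in the trajectory, which is the "never dropping" guarantee; `grep` wins the single ranked slot over `read_file`.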