Validate orchestration model IDs client-side before spawning agents #13269

Draft
dmichelin wants to merge 12 commits into master from dan/orch-model-validation

Conversation

@dmichelin

Contributor

Description

Since warp-server #11971, the server validates the Oz model_id at cloud-task creation time (before spawning a sandbox). As a result, orchestrating with an invalid, custom, or plan-unavailable model surfaces as opaque per-child failures (or a 30s spawn timeout) rather than as a clear, actionable message.

This PR validates the run-wide orchestration model_id client-side, before dispatching children, against the models Oz actually accepts for cloud tasks:

  • Cloud vs. local source of truth: cloud/remote Oz runs validate against the raw server agent-mode catalog (the same set AddTask and GET /api/v1/agent/models use), so models valid only for local runs (custom endpoints / local routers) are no longer mistaken for Oz-available. Local runs keep the broader local catalog. No new server endpoint was needed — the existing one already exposes the Oz set.
  • New setting OrchestrationInvalidModelBehavior (default Block): when Block, orchestration is blocked with a clear error (Accept disabled in the confirmation and plan cards; a retryable Failure in the autonomous path) that tells the user they can switch to auto-select. When AutoSelect, a valid model (the Oz default) is substituted and the run proceeds. Exposed in Settings → Agents and the Command Palette.
  • The model catalog is refreshed when the confirmation card becomes actionable, and validation fails open until the catalog has loaded (so early/autonomous runs are never falsely blocked).
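The decision flow in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `CloudCatalog`, `Preflight`, and `preflight` are hypothetical names, and the model IDs used with it are placeholders. It assumes that an empty or `auto` model always proceeds, that an unloaded catalog fails open, and that `AutoSelect` substitutes the catalog's default model.

```rust
use std::collections::HashSet;

/// The two behaviors named in the description (default: Block).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InvalidModelBehavior {
    Block,
    AutoSelect,
}

/// Hypothetical stand-in for the Oz cloud model catalog (not the real LLMPreferences API).
pub struct CloudCatalog {
    loaded: bool,
    model_ids: HashSet<String>,
    default_model_id: String,
}

impl CloudCatalog {
    pub fn new(loaded: bool, model_ids: &[&str], default_model_id: &str) -> Self {
        Self {
            loaded,
            model_ids: model_ids.iter().map(|id| id.to_string()).collect(),
            default_model_id: default_model_id.to_string(),
        }
    }
}

/// Outcome of the client-side check (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Preflight {
    Proceed { model_id: String },
    Blocked { reason: String },
}

pub fn preflight(requested: &str, catalog: &CloudCatalog, behavior: InvalidModelBehavior) -> Preflight {
    // An empty or "auto" model defers the choice, so there is nothing to validate.
    let is_auto = requested.is_empty() || requested == "auto";
    // Fail open: until the catalog has loaded, never block a run.
    if is_auto || !catalog.loaded || catalog.model_ids.contains(requested) {
        return Preflight::Proceed { model_id: requested.to_string() };
    }
    match behavior {
        InvalidModelBehavior::Block => Preflight::Blocked {
            reason: format!(
                "Model '{}' is not available for cloud agents; choose another model or switch to auto-select.",
                requested
            ),
        },
        // Substitute a model the catalog is known to accept.
        InvalidModelBehavior::AutoSelect => Preflight::Proceed {
            model_id: catalog.default_model_id.clone(),
        },
    }
}
```

With a loaded catalog containing `model-a` (the default) and `model-b`, `preflight("custom-x", &catalog, InvalidModelBehavior::Block)` yields `Blocked`, while the same call under `AutoSelect` proceeds with `model-a`.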

How

  • LLMPreferences gains raw Oz-cloud accessors + a server_models_loaded flag (app/src/ai/llms.rs).
  • A shared unavailable_model_reason(...) helper in orchestration_controls.rs, wired into the Accept gate (accept_disabled_reason_with_auth, shared by the RunAgents card and the plan card) and the executor pre-flight (RunAgentsExecutor::dispatch_prepared_run_agents).
  • New setting + Command Palette entries + a settings-page row.
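As a rough picture of what a shared reason helper might look like, the sketch below picks the catalog by run target and returns `None` when the model is acceptable. The signature of the real `unavailable_model_reason(...)` is not shown in this PR description, so `unavailable_model_reason_sketch`, `RunTarget`, and `ModelCatalogs` are hypothetical, as are the model IDs.

```rust
/// Where the run will execute (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunTarget {
    Cloud,
    Local,
}

/// Hypothetical view over the two catalogs the description distinguishes.
pub struct ModelCatalogs<'a> {
    /// Raw server agent-mode catalog: what Oz accepts for cloud tasks.
    pub cloud_models: &'a [&'a str],
    /// Broader local catalog, including custom endpoints and local routers.
    pub local_models: &'a [&'a str],
    /// Whether the server catalog has been fetched yet.
    pub cloud_loaded: bool,
}

/// Returns a human-readable reason when `model_id` cannot be used, else `None`.
pub fn unavailable_model_reason_sketch(
    model_id: &str,
    target: RunTarget,
    catalogs: &ModelCatalogs<'_>,
) -> Option<String> {
    // Empty or "auto" defers the choice, so it is always allowed.
    if model_id.is_empty() || model_id == "auto" {
        return None;
    }
    let (known, label) = match target {
        RunTarget::Cloud => {
            // Fail open until the server catalog has loaded.
            if !catalogs.cloud_loaded {
                return None;
            }
            (catalogs.cloud_models, "cloud")
        }
        RunTarget::Local => (catalogs.local_models, "local"),
    };
    if known.iter().any(|m| *m == model_id) {
        None
    } else {
        Some(format!("'{}' is not available for {} runs", model_id, label))
    }
}
```

Returning an `Option<String>` lets one helper serve both call sites: a card can show the reason next to a disabled Accept button, and an executor can turn the same reason into a failure.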

Linked Issue

  • The linked issue is labeled ready-to-spec or ready-to-implement.
  • Where appropriate, screenshots or a short video of the implementation are included below (especially for user-visible or UI changes).

Testing

Added unit tests:

  • Executor gate (run_agents_tests.rs): invalid cloud model + Block → RunAgentsResult::Failure, zero children dispatched; + AutoSelect → proceeds with a substituted model; empty / auto model → proceeds.
  • Accept gate / auto-select (orchestration_controls_tests.rs): Block → Accept disabled with the reason; AutoSelect → not disabled and model substituted; valid/empty/auto → allowed; a local-only custom model is allowed for local runs but flagged for cloud.

Validation on the integrated branch: cargo fmt clean, cargo clippy -p warp --all-targets -- -D warnings clean, cargo nextest run -p warp run_agents orchestration_controls llms → 85 passed. Full presubmit / integration suite runs in CI.

  • I have manually tested my changes locally with ./script/run

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

Conversation: https://staging.warp.dev/conversation/90ec1637-f933-4b56-9a18-d4a01252d209
Plan: https://staging.warp.dev/drive/notebook/BvsDuGFDR7GeWcgewlOY8x

CHANGELOG-IMPROVEMENT: Orchestration now checks your selected model against the models available for cloud agents before launching, and either blocks with a clear message or auto-selects a valid model (configurable in Settings → Agents).

Co-Authored-By: Oz <oz-agent@warp.dev>

dmichelin and others added 7 commits June 30, 2026 17:22
Add the shared primitives for validating orchestration model IDs client-side:
- LLMPreferences accessors for the raw Oz cloud agent-mode catalog
  (is_oz_cloud_agent_model_available, oz_cloud_default_agent_model_id,
  oz_cloud_agent_model_suggestions, oz_cloud_agent_model_catalog_loaded)
  plus a server_models_loaded flag for fail-open behavior.
- OrchestrationInvalidModelBehavior setting (Block default / AutoSelect).
- unavailable_model_reason() helper in orchestration_controls, validating
  cloud/remote Oz runs against the raw Oz set and local/non-Oz runs against
  the existing harness-filtered choices.

Call sites (executor gate, card gate/auto-select, settings UI) land in
follow-up commits.

Co-Authored-By: Oz <oz-agent@warp.dev>

The foundation added server_models_loaded to LLMPreferences but missed the
direct struct literal in llms_tests.rs, breaking the lib-test build (E0063).

Co-Authored-By: Oz <oz-agent@warp.dev>

Pre-flight model check in RunAgentsExecutor::dispatch_prepared_run_agents
(after validate_request): Block => RunAgentsResult::Failure with zero children
dispatched; AutoSelect => substitute the Oz cloud default model and proceed.
Adds executor tests.

Co-Authored-By: Oz <oz-agent@warp.dev>

Accept gate blocks unavailable models under Block and auto-selects a valid
model under AutoSelect (shared by the RunAgents card and plan card); refresh
the model catalog when the card becomes actionable; add the
OrchestrationInvalidModelBehavior settings row + command-palette entries and
context flags. Adds gate tests.

Co-Authored-By: Oz <oz-agent@warp.dev>

@cla-bot cla-bot Bot added the cla-signed label Jul 1, 2026

dmichelin and others added 5 commits July 1, 2026 14:16
…llback model

The invalid-model behavior now reads as "Block the run" / "Use a fallback
model". When using a fallback, a new "Fallback model" picker (shown only in
that mode) lets the user choose which model to substitute, backed by the
agents.warp_agent.other.orchestration_fallback_model_id setting. The executor
and card auto-select paths resolve the configured fallback (validated against
the Oz cloud catalog) and fall back to the Oz default when unset or invalid.

Co-Authored-By: Oz <oz-agent@warp.dev>
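The fallback resolution this commit describes might look something like the sketch below: use the configured fallback only if the Oz cloud catalog accepts it, otherwise use the Oz default. The function name `resolve_fallback_model` and the model IDs are hypothetical; the real code reads the value from the settings store rather than taking it as an argument.

```rust
/// Picks the model to substitute when the requested one is invalid.
/// `configured` is the user's fallback setting, if any (hypothetical parameter).
pub fn resolve_fallback_model<'a>(
    configured: Option<&'a str>,
    oz_cloud_catalog: &[&str],
    oz_default: &'a str,
) -> &'a str {
    match configured {
        // Honor the configured fallback only when the cloud catalog accepts it.
        Some(id) if oz_cloud_catalog.iter().any(|m| *m == id) => id,
        // Unset, empty, or no longer valid: fall back to the Oz default.
        _ => oz_default,
    }
}
```

Validating the fallback itself matters because a stored setting can outlive the model it names; without the check, a stale fallback would reintroduce the invalid-model failure the setting exists to avoid.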
The card's pre-existing picker-reset paths (apply_execution_mode_change,
repopulate_all_pickers) reset an out-of-catalog model to a valid default
unconditionally, which rewrote an invalid run-wide model to 'auto' before the
accept gate could block. Gate those resets on AutoSelect so that under Block the
invalid model is left in place and accept_disabled_reason_with_auth surfaces the
error (Accept disabled). AutoSelect behavior is unchanged.

Co-Authored-By: Oz <oz-agent@warp.dev>
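To illustrate the fix in this commit, here is a small sketch of a picker reset that is gated on the auto-select behavior. `model_after_picker_reset` and its parameters are hypothetical and do not mirror the real apply_execution_mode_change or repopulate_all_pickers code; the sketch only shows why the gate is needed.

```rust
/// Returns the model the picker should hold after a reset pass.
/// `auto_select_enabled` stands in for "the behavior setting is AutoSelect".
pub fn model_after_picker_reset(
    current: &str,
    catalog: &[&str],
    valid_default: &str,
    auto_select_enabled: bool,
) -> String {
    let in_catalog = catalog.iter().any(|m| *m == current);
    if in_catalog || !auto_select_enabled {
        // Under Block, leave an invalid model in place so the accept gate
        // can see it and surface the error.
        current.to_string()
    } else {
        // Under AutoSelect, rewriting to a valid default is the intended behavior.
        valid_default.to_string()
    }
}
```

Before the fix, the equivalent of the `else` branch ran unconditionally, so the invalid model had already been replaced by the time the gate inspected it.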
dispatch_prepared_run_agents' AutoSelect branch substituted the Oz cloud
default unconditionally, which could set an Oz model id on a non-Oz remote
harness (claude-code/codex) or a local codex child. Validate the resolved
fallback against the target harness (mirroring maybe_auto_select_valid_model)
and inherit the default (empty) when it doesn't apply.

Co-Authored-By: Oz <oz-agent@warp.dev>
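A simplified sketch of the harness check described here follows. It assumes, for illustration only, that an Oz model ID applies solely to the Oz harness and that every other harness inherits its own default, represented by an empty string. `Harness` and `model_for_child` are hypothetical names, and the real validation (mirroring maybe_auto_select_valid_model) may consult per-harness model lists instead.

```rust
/// Harnesses mentioned in the commit message (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Harness {
    Oz,
    ClaudeCode,
    Codex,
}

/// Returns the model ID to set on a child under AutoSelect.
/// An empty string means "inherit the harness default".
pub fn model_for_child(harness: Harness, resolved_fallback: &str, oz_cloud_catalog: &[&str]) -> String {
    let applies = harness == Harness::Oz
        && oz_cloud_catalog.iter().any(|m| *m == resolved_fallback);
    if applies {
        resolved_fallback.to_string()
    } else {
        // Never push an Oz model ID onto a harness that manages its own models.
        String::new()
    }
}
```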
The invalid-model gate previously covered only multi-agent orchestration
(run_agents). Single cloud-agent launches (local->cloud '&' handoff, run in
cloud) spawn via AmbientAgentViewModel::start_spawn_stream, which bypassed it and
silently fell back to auto. Add the same pre-spawn gate there for the Oz base
model: Block -> surface a failure and don't spawn; AutoSelect -> substitute the
configured fallback (or inherit). Guard spawn_internal from overwriting the
Failed state. Non-Oz harnesses keep managing their own models.

Co-Authored-By: Oz <oz-agent@warp.dev>

Generalize the setting now that it governs all cloud-agent launches (multi-agent
orchestration + single-agent handoff/run-in-cloud), not just orchestration:
- OrchestrationInvalidModelBehavior -> CloudAgentInvalidModelBehavior
- OrchestrationFallbackModelId -> CloudAgentFallbackModelId
- toml keys agents.warp_agent.other.cloud_agent_invalid_model_behavior /
  cloud_agent_fallback_model_id
- context flags CLOUD_AGENT_INVALID_MODEL_{BLOCK,AUTO_SELECT}
- settings row + command-palette wording ("cloud model" / "cloud agent model")
Pure rename; behavior unchanged.

Co-Authored-By: Oz <oz-agent@warp.dev>