6 changes: 3 additions & 3 deletions src/content/docs/agent-platform/cloud-agents/faqs.mdx
Original file line number Diff line number Diff line change
@@ -60,7 +60,7 @@ The cloud agents platform supports self-hosting the **agent sandbox** (the execu
Self-hosted execution is available on **Enterprise** plans. See [Self-hosting](/agent-platform/cloud-agents/self-hosting/) and [Deployment patterns](/agent-platform/cloud-agents/deployment-patterns/) for details.

:::note
[Bring Your Own Key (BYOK)](/agent-platform/inference/bring-your-own-api-key/) does not apply to cloud agents. BYOK keys are stored locally on your device and cannot be passed to cloud-hosted or self-hosted agent runs. All cloud agent runs consume [Warp credits](/support-and-community/plans-and-billing/credits/).
[Bring Your Own API Key (BYOK)](/agent-platform/inference/bring-your-own-api-key/) does not apply to cloud agents. BYOK keys are stored locally on your device and cannot be passed to cloud-hosted or self-hosted agent runs. All cloud agent runs consume [Warp credits](/support-and-community/plans-and-billing/credits/).
:::

## Models
@@ -84,9 +84,9 @@ We're strong proponents of this, but it ultimately depends on model provider pol

### Do you support local or private LLMs for compliance or air-gapped environments?

Enterprise plans will support managed integrations like AWS Bedrock and Google Vertex.
Enterprise plans can route inference through your own cloud-provider account via [Bring Your Own LLM (BYOLLM)](/enterprise/enterprise-features/bring-your-own-llm/), so prompts stay within your cloud environment.

⚠️ [IMPORTANT] [SECURITY] This says cloud agents can route through BYOLLM and that prompts stay in the customer's cloud, but the linked BYOLLM page still states that cloud agents do not yet support BYOLLM routing. Either update the BYOLLM/security docs in this PR or keep this FAQ in the coming-soon state so that compliance guidance is not contradictory.


Fully local, offline LLM execution is difficult given the current cloud agents orchestration and runtime architecture, but private-model support via enterprise cloud providers is on the roadmap.
Fully local, offline LLM execution is difficult given the current cloud agents orchestration and runtime architecture, but private-model support via enterprise cloud providers is available through BYOLLM.

### Will cloud agents support Agent-to-Agent Protocols (A2A)?

3 changes: 1 addition & 2 deletions src/content/docs/agent-platform/cloud-agents/overview.mdx
Original file line number Diff line number Diff line change
@@ -110,7 +110,7 @@ If your team also uses Warp's terminal, you get an additional workflow: tasks la
Cloud agents and [integrations](/agent-platform/cloud-agents/integrations/) run on the [Oz Platform](/agent-platform/cloud-agents/platform/) control plane, and usage is billed using credits.

:::note
[Bring Your Own Key (BYOK)](/agent-platform/inference/bring-your-own-api-key/) is not supported for cloud agent runs. BYOK keys are stored locally on your device and are not accessible to cloud-hosted agents. All cloud agent runs consume Warp credits.
[Bring Your Own API Key (BYOK)](/agent-platform/inference/bring-your-own-api-key/) is not supported for cloud agent runs. BYOK keys are stored locally on your device and are not accessible to cloud-hosted agents. All cloud agent runs consume Warp credits.
:::

#### For cloud agents via CLI/API
@@ -127,7 +127,6 @@ Integrations require you to be part of a [Warp team](/knowledge-and-collaboratio

* **Plan requirements**
* **Supported plans**: Build, Max, Business
* Not supported: Pro, Turbo, Lightspeed, legacy Business
* Your plan must support add-on credits.
* **Credit requirements**
* Your team must have at least 20 credits available to run cloud agents and integrations.
Original file line number Diff line number Diff line change
@@ -81,8 +81,8 @@ Your team must meet the following requirements to run integrations:

When a user triggers an agent through an integration (like Slack or Linear), the run draws from credits based on who the run is billed to:

* **User-triggered runs on Build, Max, or Business** - Warp draws from any [cloud agent credits](/support-and-community/plans-and-billing/credits/#compute-credits) the user has, then the user's plan-included credits, then the user's Add-on credits. Add-on credits are scoped to the individual user and are not shared across the team.
* **Team API key or scheduled cloud agent runs on Build, Max, or Business** - Warp bills the team owner. The waterfall is: the owner's plan-included credits, then the owner's Add-on credits. With auto-reload off, the request is blocked when both pools are depleted. With auto-reload on, usage can trigger a reload on the owner's pool subject to the team-wide monthly spend cap.
* **User-triggered runs on Build, Max, or Business** - Warp draws from any [cloud agent credits](/support-and-community/plans-and-billing/credits/#compute-credits) the user has, then the user's plan-included credits, then the user's add-on credits. Add-on credits are scoped to the individual user and are not shared across the team.
* **Team API key or scheduled cloud agent runs on Build, Max, or Business** - Warp bills the team owner. The waterfall is: the owner's plan-included credits, then the owner's add-on credits. With auto-reload off, the request is blocked when both pools are depleted. With auto-reload on, usage can trigger a reload on the owner's pool subject to the team-wide monthly spend cap.
* **Enterprise plans** - Runs draw from the team-scoped credit pool, per your Enterprise contract terms.

If all applicable credit sources are exhausted and no auto-reload is configured, integrations and cloud agents will not run until credits are added. See [add-on credits](/support-and-community/plans-and-billing/add-on-credits/) for the full self-serve waterfall and [platform credits](/support-and-community/plans-and-billing/platform-credits/) for the third bucket that applies to every cloud agent run.
@@ -216,13 +216,13 @@ How credits are consumed depends on how the agent run is triggered and authentic
**User-triggered runs** (CLI with personal API key, Slack, Linear, or the Warp app):

* Runs are tied to the triggering user's identity.
* On Build, Max, and Business plans, credits are consumed starting with any [cloud agent credits](/support-and-community/plans-and-billing/credits/#compute-credits) allocated to the user, then the user's plan-included credits, then the user's Add-on credits. Add-on credits are scoped to the individual user.
* On Build, Max, and Business plans, credits are consumed starting with any [cloud agent credits](/support-and-community/plans-and-billing/credits/#compute-credits) allocated to the user, then the user's plan-included credits, then the user's add-on credits. Add-on credits are scoped to the individual user.
* On Enterprise plans, runs draw from the team-scoped credit pool, per your Enterprise contract terms.

**Team API key and scheduled cloud agent runs** (fully automated or headless workflows):

* Runs are not tied to any individual user.
* On Build, Max, and Business plans, Warp bills the team owner: the owner's plan-included credits, then the owner's Add-on credits. With auto-reload off, the request is blocked when both pools are depleted. With auto-reload on, usage can trigger a reload on the owner's Add-on credit pool subject to the team-wide monthly spend cap.
* On Build, Max, and Business plans, Warp bills the team owner: the owner's plan-included credits, then the owner's add-on credits. With auto-reload off, the request is blocked when both pools are depleted. With auto-reload on, usage can trigger a reload on the owner's add-on credit pool subject to the team-wide monthly spend cap.
* On Enterprise plans, these runs draw from the team-scoped credit pool, per your Enterprise contract terms.
* Ideal for CI/CD pipelines, scheduled tasks, and other automated workflows.
* For workflows that require code changes (opening pull requests, pushing branches, or writing to a repository), configure [team GitHub authorization](#team-github-authorization) so the agent can authenticate with the Oz by Warp GitHub App. Alternatively, use a [personal API key](/reference/cli/api-keys/) to authenticate as an individual user.
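The draw order described above can be sketched as a small waterfall. This is only an illustration under stated assumptions: the `CreditPools` class, its field names, the `draw` helper, and the idea that a single run's cost can be split across pools are all hypothetical and are not part of Warp's API. Auto-reload and the team-wide spend cap are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CreditPools:
    """Hypothetical credit balances for one billed account (user or team owner)."""
    cloud_agent: int = 0
    plan_included: int = 0
    add_on: int = 0


# Draw orders taken from the text above; the pool names themselves are illustrative.
USER_TRIGGERED_ORDER = ["cloud_agent", "plan_included", "add_on"]
TEAM_OWNER_ORDER = ["plan_included", "add_on"]


def draw(pools: CreditPools, cost: int, order: list[str]) -> Optional[dict[str, int]]:
    """Deduct `cost` credits from `pools` in `order`.

    Returns how much was taken from each pool, or None when the listed pools
    cannot cover the cost (the "request is blocked" case with auto-reload off).
    """
    if cost > sum(getattr(pools, name) for name in order):
        return None  # blocked: nothing is deducted
    drawn: dict[str, int] = {}
    remaining = cost
    for name in order:
        take = min(getattr(pools, name), remaining)
        if take:
            setattr(pools, name, getattr(pools, name) - take)
            drawn[name] = take
            remaining -= take
    return drawn
```

For a user-triggered run, `draw(user_pools, cost, USER_TRIGGERED_ORDER)` exhausts cloud agent credits before touching plan-included or add-on credits. A team API key run would pass the owner's pools with `TEAM_OWNER_ORDER` instead.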
@@ -244,7 +244,7 @@ All triggers and instructions used by cloud agents are defined and controlled by

Because triggers and instructions are configured by your team, the credits consumed when an agent runs are billed according to the model above:

* **Build, Max, Business** - User-triggered runs draw from the triggering user's pools (plan-included credits, then their Add-on credits). Team API key and scheduled cloud agent runs are billed to the team owner (the owner's plan-included credits, then the owner's Add-on credits, subject to the team-wide spend cap when auto-reload is on).
* **Build, Max, Business** - User-triggered runs draw from the triggering user's pools (plan-included credits, then their add-on credits). Team API key and scheduled cloud agent runs are billed to the team owner (the owner's plan-included credits, then the owner's add-on credits, subject to the team-wide spend cap when auto-reload is on).
* **Enterprise** - All runs draw from the team-scoped credit pool, per your Enterprise contract terms.

It's the team's responsibility to manage triggers, confirm they behave as intended, and monitor usage. Reviewing triggers, prompts, and agent behavior periodically helps ensure that credit usage aligns with expectations.
Original file line number Diff line number Diff line change
@@ -23,7 +23,7 @@ They are compatible with any Linux x86-64 image that includes a `bash` shell and

The resources available to Warp-hosted agents depend on your [plan](https://www.warp.dev/pricing) - see the latest details there.

On [enterprise](/enterprise) plans, resources are configurable up to 32 vCPUs and 64 GiB of memory. If additional resources are required, reach out to Warp support about custom provisioning.
On [Enterprise](/enterprise) plans, resources are configurable up to 32 vCPUs and 64 GiB of memory. If additional resources are required, reach out to Warp support about custom provisioning.

### Concurrency

Original file line number Diff line number Diff line change
@@ -21,9 +21,9 @@ Warp offers three ways to bring your own AI infrastructure. Use this table to pi

| Name | Meaning | Plans |
| --- | --- | --- |
| **Bring your own API key** (BYOK) | Use your own API key for OpenAI, Anthropic, or Google models. Keys are stored locally on your device. | Free and all eligible paid plans |
| **Bring Your Own API Key** (BYOK) | Use your own API key for OpenAI, Anthropic, or Google models. Keys are stored locally on your device. | Free and all eligible paid plans |
| **[Custom inference endpoint](/agent-platform/inference/custom-inference-endpoint/)** | Connect Warp to an OpenAI-compatible endpoint such as OpenRouter, LiteLLM, z.ai, or an internal gateway. | Free and all eligible paid plans |
| **[Bring your own LLM](/enterprise/enterprise-features/bring-your-own-llm/)** (BYOLLM) | Enterprise-managed inference through your cloud provider (AWS Bedrock today; Azure Foundry and Google Vertex coming soon), with Warp handling routing, orchestration, governance, and observability. | Enterprise only |
| **[Bring Your Own LLM](/enterprise/enterprise-features/bring-your-own-llm/)** (BYOLLM) | Enterprise-managed inference through your cloud provider (AWS Bedrock today; Azure Foundry and Google Vertex coming soon), with Warp handling routing, orchestration, governance, and observability. | Enterprise only |

See [warp.dev/pricing](https://www.warp.dev/pricing) for current plan availability.

@@ -125,19 +125,19 @@ However, when you use your own API key:

Warp itself never stores your LLM API keys.

### BYOK on Enterprise and Business plans
### BYOK on Business and Enterprise plans

BYOK is configured at the **user level** on every plan, including Enterprise and Business:
BYOK is configured at the **user level** on every plan, including Business and Enterprise:

* Each team member adds and manages their own API keys locally on their device.
* Centrally configured, admin-managed BYOK is not yet available — admins cannot enforce or share API keys across team members from a single place.
* There is no organization-level Admin Panel for BYOK management today.

If your organization needs centrally managed model routing today, see [Bring your own LLM](/enterprise/enterprise-features/bring-your-own-llm/) for the Enterprise-managed option, or [contact sales](https://www.warp.dev/contact-sales).
If your organization needs centrally managed model routing today, see [Bring Your Own LLM](/enterprise/enterprise-features/bring-your-own-llm/) for the Enterprise-managed option, or [contact sales](https://www.warp.dev/contact-sales).

## Related resources

* [Custom inference endpoint](/agent-platform/inference/custom-inference-endpoint/) — Route Warp through any OpenAI-compatible endpoint, such as OpenRouter, LiteLLM, z.ai, or an internal gateway.
* [Bring your own LLM](/enterprise/enterprise-features/bring-your-own-llm/) — Enterprise-managed inference through your cloud provider or approved infrastructure.
* [Bring Your Own LLM](/enterprise/enterprise-features/bring-your-own-llm/) — Enterprise-managed inference through your cloud provider or approved infrastructure.
* [Model Choice](/agent-platform/inference/model-choice/) — Full list of supported models and `model_id` values.
* [Credits](/support-and-community/plans-and-billing/credits/) — How Warp credits work and when they're consumed.
Original file line number Diff line number Diff line change
@@ -52,7 +52,7 @@ To enable and configure a custom inference endpoint:

When you explicitly select an endpoint-routed model from the model picker, Warp routes the request through your endpoint instead of consuming Warp's AI credits.

The configuration flow mirrors the [Bring your own API key](/agent-platform/inference/bring-your-own-api-key/) setup, so the steps will feel familiar if you've already configured BYOK.
The configuration flow mirrors the [Bring Your Own API Key](/agent-platform/inference/bring-your-own-api-key/) setup, so the steps will feel familiar if you've already configured BYOK.

## Billing behavior

@@ -90,23 +90,23 @@ Review your endpoint provider's data handling and retention policies before rout

Custom inference endpoints are configured at the **user level** on every plan. Each user adds their own endpoint locally; centrally configured, admin-managed endpoints for teams are not yet available.

Enterprise teams that need centrally managed model routing today should see [Bring your own LLM](/enterprise/enterprise-features/bring-your-own-llm/).
Enterprise teams that need centrally managed model routing today should see [Bring Your Own LLM](/enterprise/enterprise-features/bring-your-own-llm/).

## How custom inference endpoints differ from BYOK and BYOLLM

Warp offers three ways to bring your own AI infrastructure. Use this table to pick the right one, and follow the links for full details.

| Name | Meaning | Plans |
| --- | --- | --- |
| **[Bring your own API key](/agent-platform/inference/bring-your-own-api-key/)** (BYOK) | Use your own API key for OpenAI, Anthropic, or Google models. Keys are stored locally on your device. | Free and all eligible paid plans |
| **[Bring Your Own API Key](/agent-platform/inference/bring-your-own-api-key/)** (BYOK) | Use your own API key for OpenAI, Anthropic, or Google models. Keys are stored locally on your device. | Free and all eligible paid plans |
| **Custom inference endpoint** | Connect Warp to an OpenAI-compatible endpoint such as OpenRouter, LiteLLM, z.ai, or an internal gateway. | Free and all eligible paid plans |
| **[Bring your own LLM](/enterprise/enterprise-features/bring-your-own-llm/)** (BYOLLM) | Enterprise-managed inference through your cloud provider (AWS Bedrock today; Azure Foundry and Google Vertex coming soon), with Warp handling routing, orchestration, governance, and observability. | Enterprise only |
| **[Bring Your Own LLM](/enterprise/enterprise-features/bring-your-own-llm/)** (BYOLLM) | Enterprise-managed inference through your cloud provider (AWS Bedrock today; Azure Foundry and Google Vertex coming soon), with Warp handling routing, orchestration, governance, and observability. | Enterprise only |

Platform credits may apply for local agent runs on Business and Enterprise when using BYOK, a custom inference endpoint, or BYOLLM. See [platform credits](/support-and-community/plans-and-billing/platform-credits/).

## Related resources

* [Bring your own API key](/agent-platform/inference/bring-your-own-api-key/) — Use your own OpenAI, Anthropic, or Google API keys.
* [Bring your own LLM](/enterprise/enterprise-features/bring-your-own-llm/) — Enterprise-managed inference through your cloud provider or approved infrastructure.
* [Bring Your Own API Key](/agent-platform/inference/bring-your-own-api-key/) — Use your own OpenAI, Anthropic, or Google API keys.
* [Bring Your Own LLM](/enterprise/enterprise-features/bring-your-own-llm/) — Enterprise-managed inference through your cloud provider or approved infrastructure.
* [Model Choice](/agent-platform/inference/model-choice/) — Full list of supported models and `model_id` values.
* [Credits](/support-and-community/plans-and-billing/credits/) — How Warp credits work and when they're consumed.
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
title: Bring your own LLM
title: Bring Your Own LLM
description: >-
Route Warp's agents through your AWS Bedrock models for billing control and
infrastructure flexibility.
@@ -33,7 +33,7 @@ When BYOLLM is enabled, Warp redirects inference calls to your AWS Bedrock envir

Here's the high-level flow:

1. **Admin configures routing** - Your team admin sets routing policies in Warp's admin settings (e.g., "Route Claude Sonnet 4.5 through AWS Bedrock; disable direct Anthropic API").
1. **Admin configures routing** - Your team admin sets routing policies in Warp's admin settings (e.g., "Route Claude Opus 4.7 through AWS Bedrock; disable direct Anthropic API").
2. **Team members authenticate** - Each team member authenticates to AWS locally using the AWS CLI (`aws login`).
3. **Warp routes requests** - When a team member uses an interactive agent in the terminal, Warp uses their short-lived session credentials to authenticate requests to your configured AWS Bedrock API endpoint.
4. **Inference executes in your cloud** - The model runs in your AWS account. Responses return to the Warp client.
@@ -74,7 +74,7 @@ Before configuring BYOLLM, confirm the following:
In the [Admin Panel](/enterprise/team-management/admin-panel/), configure which models should route through AWS Bedrock:

1. From the [Admin Panel](/enterprise/team-management/admin-panel/), navigate to the BYOLLM or model routing settings.
2. Select which models should use your cloud provider (e.g., "Claude Sonnet 4.5 via AWS Bedrock").
2. Select which models should use your cloud provider (e.g., "Claude Opus 4.7 via AWS Bedrock").
3. Optionally, disable direct API access to enforce provider-only routing.

### Step 2: Provision IAM roles (cloud admin)
@@ -142,7 +142,7 @@ Warp's agents automatically select the best model for your task while respecting

If a BYOLLM request fails (e.g., due to expired credentials, insufficient permissions, or provider quota limits), Warp attempts to fall back to the next available model your admin has enabled.

For example, if Claude Sonnet 4.5 on Bedrock fails but your admin also enabled it via direct API, Warp falls back to the direct API to avoid disruption. If a fallback uses a direct API model, that request consumes Warp credits.
For example, if Claude Opus 4.7 on Bedrock fails but your admin also enabled it via direct API, Warp falls back to the direct API to avoid disruption. If a fallback uses a direct API model, that request consumes Warp credits.

If no fallback is available (e.g., the admin disabled all non-Bedrock models), Warp displays a clear error message.
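The fallback behavior described above can be sketched as an ordered list of routes tried in turn. This is a minimal illustration, not Warp's implementation: `Route`, `InferenceError`, and `run_with_fallback` are hypothetical names, and each route's `call` stands in for whatever actually sends the request.

```python
from dataclasses import dataclass
from typing import Callable


class InferenceError(Exception):
    """Hypothetical error covering expired credentials, missing permissions, or quota limits."""


@dataclass
class Route:
    """One admin-enabled way to reach a model (illustrative only)."""
    name: str
    call: Callable[[str], str]      # hypothetical: takes a prompt, returns a response
    consumes_warp_credits: bool     # True for direct API routes, per the text above


def run_with_fallback(routes: list[Route], prompt: str) -> tuple[Route, str]:
    """Try each enabled route in order and return the first one that succeeds."""
    failures = []
    for route in routes:
        try:
            return route, route.call(prompt)
        except InferenceError as exc:
            failures.append(f"{route.name}: {exc}")
    # No route succeeded, e.g. the admin disabled all non-Bedrock models.
    raise InferenceError("no fallback available (" + "; ".join(failures) + ")")
```

Returning the route alongside the response lets the caller see whether the request ended up on a route that consumes Warp credits.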

@@ -189,7 +189,7 @@ However, when using BYOLLM:

### How is BYOLLM different from BYOK?

**BYOK (Bring Your Own Key)** lets individual users add their own API keys for direct model provider access (e.g., Anthropic, OpenAI, Google). Warp stores keys locally on the user's device.
**BYOK (Bring Your Own API Key)** lets individual users add their own API keys for direct model provider access (e.g., Anthropic, OpenAI, Google). Warp stores keys locally on the user's device.

**BYOLLM (Bring Your Own LLM)** routes inference through your organization's cloud infrastructure (AWS Bedrock) using cloud-native IAM. Admins configure it at the admin level and it applies to the entire team.
