Python-native coding agent — one run_python boundary. Every external action is
auditable, replayable, and interruptible.
uv-agent channels all model capabilities through a single, well-defined exit: the
model can only touch the outside world via run_python. Each call is a complete
Python script executed in an isolated environment managed by uv run, using
uv_agent_runtime helpers for file editing, command execution, code search, MCP,
sub-agents, images, and more. With only one exit, you can replay any run and see
exactly what happened and why.
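As a rough illustration of why a single exit makes runs auditable and replayable, the sketch below records every executed script with a content hash and replays the log by re-running the recorded scripts in order. It is not uv-agent's implementation: it uses the current interpreter instead of a uv-managed environment, and `run_and_record`, the record fields, and the timeout value are all hypothetical.

```python
import hashlib
import subprocess
import sys


def run_and_record(script: str, log: list, timeout_s: float = 30.0) -> dict:
    """Execute one complete Python script and append an audit record to log."""
    proc = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        timeout=timeout_s,  # a hard limit keeps every call interruptible
    )
    record = {
        "sha256": hashlib.sha256(script.encode("utf-8")).hexdigest(),
        "script": script,
        "returncode": proc.returncode,
        "stdout": proc.stdout,
    }
    log.append(record)
    return record


audit_log = []
run_and_record("print(2 + 3)", audit_log)

# Replaying a run means re-executing the recorded scripts in order.
replayed = [run_and_record(r["script"], []) for r in audit_log]
print(replayed[0]["stdout"].strip())
```

Because each record carries the full script text, the log alone is enough to see what was attempted, independent of the chat transcript.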
The project is still experimental. Public APIs, config fields, and runtime behavior may change.
- Single tool boundary: no shell, filesystem, browser, or MCP model tools. The model writes Python; the managed runtime executes it. Every external action is an auditable script.
- Cache-aware NetGain compaction: long conversations no longer trigger blind compression. A lightweight pre-turn judge round lets the model estimate remaining calls and history dependency; an economic formula then computes the net gain of compaction. Compression fires only when cache savings outweigh information loss. Recent context is retained verbatim (K tokens) to avoid losing key details.
- Python managed runtime: scripts run in a project-shared uv environment. uv_agent_runtime provides helpers for read/write/edit, ripgrep search, subprocesses, dependency installation, sub-agents, MCP clients, and more. Scripts serve as documentation, with no opaque shell commands.
- Plugin system: plain Python packages discovered via the uv_agent.plugins entry point. Register runtime helpers, subscribe to events, submit turns from external systems. Run uvx --with your-plugin uv-agent and you're set.
- Self-bootstrapping: uv-agent is developed using uv-agent. Reading, editing, testing, and iterating on the project are done with uv-agent itself.
- Progressive context disclosure: skills, MCP servers, and workspace rules are not dumped into the prompt all at once. The model receives an index first; full content is disclosed only when needed. Removed capabilities are explicitly marked to prevent stale-context errors.
- Goal mode durable memory: /goal creates a per-thread checklist/notes layer independent of the chat transcript. After compaction or resume, the model consults Goal files rather than relying solely on summarized history.
- Agent View parallel workspaces: dispatch bug investigations, implementation experiments, and test fixes to isolated Git worktree background sessions. Track them all from a single dashboard.
- Prompt-cache-friendly design: the system prompt prefix is guaranteed byte-identical within an epoch. Compaction requests share the same prefix structure as normal calls, maximizing provider-side cache hits. Cache reads are billed at a small fraction of the normal input price.
The cache-aware compaction introduced in v0.16.0 is uv-agent's core optimization for long-running sessions. Unlike the traditional "compress when context hits N%" rule, uv-agent makes an economic decision before every turn:
- The model estimates how many more conversation rounds are needed (remaining_calls_bucket) and how strongly the task depends on history (history_dependency).
- It enumerates K retention candidates and evaluates the NetGain for each: future cache savings plus context quality improvement gain, minus compaction call cost, cache invalidation loss, and information distortion penalty.
- Compaction fires only when the best net gain exceeds a margin-scaled threshold; otherwise it skips, avoiding wasted compression for short tasks.
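The decision steps above can be sketched roughly as follows. This is a minimal illustration, not uv-agent's actual implementation: the `Candidate` fields, the way the margin scales the threshold, and every number are hypothetical. Only the sign of each term follows the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """One retention candidate (hypothetical structure, arbitrary cost units)."""
    retained_tokens: int
    future_cache_savings: float
    compaction_call_cost: float
    cache_invalidation_loss: float
    distortion_penalty: float
    quality_gain: float


def net_gain(c: Candidate) -> float:
    # Savings and quality gain count in favor; the three costs count against.
    return (
        c.future_cache_savings
        + c.quality_gain
        - c.compaction_call_cost
        - c.cache_invalidation_loss
        - c.distortion_penalty
    )


def choose_compaction(candidates, base_threshold: float, margin: float) -> Optional[Candidate]:
    """Return the best candidate if it clears the threshold, else None (skip)."""
    if not candidates:
        return None
    best = max(candidates, key=net_gain)
    threshold = base_threshold * margin  # assumed form of "margin-scaled"
    return best if net_gain(best) > threshold else None


short_task = [Candidate(4000, 0.5, 0.4, 0.3, 0.2, 0.1)]
long_task = [
    Candidate(4000, 9.0, 0.4, 0.3, 1.5, 0.5),
    Candidate(16000, 7.0, 0.4, 0.3, 0.4, 0.5),
]
print(choose_compaction(short_task, base_threshold=1.0, margin=1.5))
print(choose_compaction(long_task, base_threshold=1.0, margin=1.5))
```

With these made-up numbers the short task has a negative net gain and is skipped, while the long task compacts using the candidate with the highest net gain.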
Compaction requests share the exact same prefix structure as normal calls (system prompt → tools → messages), ensuring provider-side prompt prefix caches stay warm. Over 90% of input tokens in a typical compaction call are billed at cached rates (typically 1%–10% of the normal input price).
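To make the billing effect concrete, here is a small calculation using the per-million-token prices listed for deepseek-v4-flash in this README's configuration example (input 1, cached input 0.02). The 100,000-token request size and the 90% cached share are illustrative assumptions, and `input_cost` is a hypothetical helper.

```python
def input_cost(total_tokens: int, cached_fraction: float,
               input_price: float, cached_price: float) -> float:
    """Input-side cost of one call; prices are per 1M tokens."""
    cached = total_tokens * cached_fraction
    uncached = total_tokens - cached
    return (uncached * input_price + cached * cached_price) / 1_000_000


cold = input_cost(100_000, 0.0, input_price=1.0, cached_price=0.02)
warm = input_cost(100_000, 0.9, input_price=1.0, cached_price=0.02)
print(f"cold cache: {cold:.4f}  warm cache: {warm:.4f}  ratio: {warm / cold:.3f}")
```

Under these assumptions the warm-cache call costs roughly 12% of the cold one on the input side, and nearly all of the remaining cost comes from the uncached 10% of tokens.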
This design draws on the DP compaction algorithm from bash-agent, with thanks.
Prerequisites:
- uv — https://docs.astral.sh/uv/getting-started/installation/
- ripgrep — https://github.com/BurntSushi/ripgrep#installation
- Git — needed for normal coding workflows and Worktree mode.
# Run the latest published release
uvx uv-agent@latest
# Run from a local checkout
uv run uv-agent
# Single-turn question (no TUI)
uvx uv-agent@latest ask "Summarize the project structure"
# Resume an existing thread
uvx uv-agent@latest ask --thread thr_xxx "Continue where we left off"

uv-agent ships with no real provider configuration. Configure at least one provider,
model, and level in ~/.uv-agent/config.json (or project-level
.uv-agent/config.json). Keep API keys in environment variables or git-ignored
local config.
Supported API formats:
| api value | Format |
|---|---|
| "responses" | OpenAI Responses API |
| "chat_completions" | OpenAI Chat Completions API |
| "anthropic_messages" | Anthropic Messages API |
Full configuration example
{
"providers": {
"deepseek": {
"base_url": "https://api.deepseek.com",
"api_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"timeout_s": 7200,
"chat_completions": {
"path": "/chat/completions"
},
"message_passthrough": {
"assistant": [
"reasoning_content"
]
},
"reasoning_display": {
"assistant_message_fields": [
"reasoning_content"
],
"stream_delta_fields": [
"reasoning_content"
]
}
},
"minimax": {
"base_url": "https://api.minimaxi.com",
"api_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"timeout_s": 7200,
"chat_completions": {
"path": "/v1/chat/completions"
},
"anthropic_messages": {
"path": "/anthropic/v1/messages"
}
}
},
"models": {
"deepseek-v4-flash": {
"provider": "deepseek",
"model": "deepseek-v4-flash",
"api": "chat_completions",
"supports_images": false,
"context_window_tokens": 1000000,
"params": {
"reasoning_effort": "high"
}
},
"deepseek-v4-pro": {
"provider": "deepseek",
"model": "deepseek-v4-pro",
"api": "chat_completions",
"supports_images": false,
"context_window_tokens": 1000000,
"params": {
"reasoning_effort": "max"
}
},
"MiniMax-M2.7": {
"provider": "minimax",
"model": "MiniMax-M2.7-highspeed",
"api": "anthropic_messages",
"supports_images": false,
"context_window_tokens": 204800
}
},
"levels": {
"deepseek-flash": {
"model": "deepseek-v4-flash"
},
"deepseek-pro": {
"model": "deepseek-v4-pro"
},
"MiniMax-M2.7": {
"model": "MiniMax-M2.7"
}
},
"runtime": {
"default_level": "deepseek-flash",
"ask_default_level": "deepseek-flash",
"store_provider_response": false,
"max_agent_rounds": 1000,
"compression": {
"enabled": true,
"model_level": "deepseek-flash",
"trigger_ratio": 0.9
},
"title_generation": {
"enabled": true,
"model_level": "deepseek-flash"
},
"branch_name_generation": {
"enabled": true,
"model_level": "deepseek-flash",
"timeout_s": 15.0
}
},
"runner": {
"default_timeout_s": 7200,
"max_output_bytes": 1000000,
"scriptenv_index_url": null
},
"pricing": {
"currency": "RMB",
"unit": "1M_tokens",
"models": {
"deepseek-v4-flash": {
"input": 1,
"output": 2,
"cached_input": 0.02
},
"deepseek-v4-pro": {
"input": 3,
"output": 6,
"cached_input": 0.025
}
}
},
"ui": {
"completion_notification": {
"enabled": true
}
},
"plugins": {
"disabled": [],
"config": {}
}
}
Use /config in the TUI to switch default level, language, and compression
settings. See configuration for every option and
config.example.json for a standalone example.
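The indirection between levels, models, and providers in the example above can be illustrated with a small resolver. This sketch only mirrors the JSON layout shown in this README; `resolve_level` and the `DEEPSEEK_API_KEY` environment variable name are hypothetical and not part of uv-agent.

```python
import json
import os

CONFIG_TEXT = """
{
  "providers": {
    "deepseek": {"base_url": "https://api.deepseek.com",
                 "chat_completions": {"path": "/chat/completions"}}
  },
  "models": {
    "deepseek-v4-flash": {"provider": "deepseek", "model": "deepseek-v4-flash",
                          "api": "chat_completions"}
  },
  "levels": {"deepseek-flash": {"model": "deepseek-v4-flash"}},
  "runtime": {"default_level": "deepseek-flash"}
}
"""


def resolve_level(config: dict, level=None) -> dict:
    """Follow level -> model -> provider and return what a request would need."""
    level = level or config["runtime"]["default_level"]
    model_key = config["levels"][level]["model"]
    model = config["models"][model_key]
    provider = config["providers"][model["provider"]]
    api = model["api"]
    return {
        "url": provider["base_url"] + provider[api]["path"],
        "model": model["model"],
        "api": api,
        # Hypothetical variable name: the key is read from the environment
        # instead of being stored in the config file.
        "api_key": os.environ.get("DEEPSEEK_API_KEY", ""),
    }


resolved = resolve_level(json.loads(CONFIG_TEXT))
print(resolved["url"], resolved["model"], resolved["api"])
```

Reading the key from the environment keeps the committed config free of secrets, in line with the advice to keep API keys out of version control.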
- Type and press Enter to send. Use Ctrl+Enter / Ctrl+J for newlines.
- Type / from an empty composer to open the command palette; type to filter. @ for file mentions, @@ for thread mentions.
- /level <name> to switch models; /status to inspect runtime state including cache compaction judge details.
- /goal enable [objective] for durable task checklists across long sessions.
- Agent View dispatches background tasks to isolated Git worktrees.
- Use uv-agent tui for the legacy Textual panels (/config, /models, etc.).
See TUI and slash commands for the full list.
- tui2 (default, uv-agent or uv-agent tui2): lightweight ANSI TUI rendered directly in the terminal. Compact status rows, streaming events, Goal/Worktree mode, and image attachments.
- Textual TUI (uv-agent tui): deprecated, kept for compatibility. The older Textual widget-based interface. Screenshot: docs/t1.png.
Agent View is a dashboard for managing multiple background agent sessions. Tasks run in isolated Git worktrees on auto-generated branches, keeping edits away from your current checkout. Track status, skim output, continue or discard tasks — all from one panel.
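Creating an isolated worktree on a fresh branch corresponds to the standard `git worktree add -b <branch> <path>` command. The sketch below only shows how such a command could be assembled and run. The branch naming scheme, the directory layout, and both function names are hypothetical and may differ from what Agent View actually does.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, task_id: str) -> list:
    """Build the git command for an isolated worktree on a new branch."""
    branch = f"agent/{task_id}"                    # hypothetical naming scheme
    path = repo.parent / f"{repo.name}-{task_id}"  # hypothetical location
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


def create_worktree(repo: Path, task_id: str) -> None:
    """Run the command; raises CalledProcessError if git reports a failure."""
    subprocess.run(worktree_command(repo, task_id), check=True)


cmd = worktree_command(Path("/tmp/demo-repo"), "fix-login")
print(" ".join(cmd))
```

Because the worktree lives in its own directory on its own branch, discarding a task amounts to removing that worktree and branch without touching the current checkout.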
Plugins are Python packages discovered via the uv_agent.plugins entry point. They
run in the uv-agent host process and can register runtime helpers, subscribe to
events, and submit turns from external systems.
uvx --with your-uv-agent-plugin uv-agent@latest

See Plugin system for details.
Every model turn = stable system prompt + on-demand structured context.
- run_python is the only external action surface. Scripts execute in a project-shared uv environment and import uv_agent_runtime helpers. The uv environment and working directory are separate; the cwd can change via enter_dir or Worktree mode.
- Runtime context (helper lists, skills, MCP servers, etc.) uses fingerprinted incremental updates: only changed parts are injected, and removed capabilities are explicitly marked.
- Workspace rules are disclosed progressively: index first, full AGENTS.md only when entering the relevant directory.
- Goal mode provides a durable checklist/notes layer independent of the chat transcript, preserving task progress across compaction and resume.
- Checkpoint compaction summarizes the conversation while excluding reloadable runtime context. New epochs replay structured context before retained history.
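The fingerprinted incremental updates mentioned in the list above can be sketched as a hash comparison between two snapshots of the runtime context. This is an illustration under assumptions: the section names, `fingerprint`, `diff_context`, and the shape of the result are hypothetical, not uv-agent's internal format.

```python
import hashlib
import json


def fingerprint(value) -> str:
    """Stable hash of a JSON-serializable context section."""
    payload = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_context(previous: dict, current: dict) -> dict:
    """Return only added or changed sections, plus an explicit list of removed ones."""
    prev_fp = {name: fingerprint(value) for name, value in previous.items()}
    changed = {
        name: value
        for name, value in current.items()
        if prev_fp.get(name) != fingerprint(value)
    }
    removed = sorted(set(previous) - set(current))
    return {"changed": changed, "removed": removed}


before = {"helpers": ["read", "write"], "skills": ["pdf"], "mcp_servers": ["docs"]}
after = {"helpers": ["read", "write"], "skills": ["pdf", "sql"]}
update = diff_context(before, after)
print(update)
```

Here the unchanged helpers section is not re-sent, the skills section is re-injected because its fingerprint changed, and mcp_servers is reported as removed so the model does not rely on stale context.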
uv-agent is self-bootstrapping — it is developed using uv-agent itself for reading, editing, testing, and iterating.
uv run pytest

Local debug state, screenshots, config, and run data belong in .uv-agent/ and
should stay out of git.
The cache-aware compaction design draws on the DP compaction algorithm and cache alignment approach from bash-agent, with thanks.
MIT. See LICENSE.
