diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index ab251478..df0e3b9e 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -112,11 +112,12 @@ graph TB
 TS_Memes["memes.*
• search_memes
• send_meme_by_uid"]
 end
- subgraph IntelligentAgents["智能体 Agents (skills/agents/, 6个)"]
+ subgraph IntelligentAgents["智能体 Agents (skills/agents/, 7个)"]
 A_Info["info_agent
信息查询助手
(18个工具)
• weather_query
• *hot 热搜
• bilibili_*
• arxiv_search
• whois"]
 A_Web["web_agent
网络搜索助手
(3个工具 + MCP)
• web_search
• crawl_webpage
• Playwright MCP"]
 A_File["file_analysis_agent
文件分析助手
(14个工具)
• extract_* (PDF/Word/Excel/PPT)
• analyze_code
• analyze_multimodal"]
 A_Naga["naga_code_analysis_agent
NagaAgent 代码分析
(7个工具)
• read_file / glob
• search_file_content"]
+ A_Self["undefined_self_code_agent
Undefined 自身代码查阅
(4个工具)
• read_file / list_directory
• glob / search_file_content"]
 A_Entertainment["entertainment_agent
娱乐助手
(9个工具)
• ai_draw_one
• horoscope
• video_random_recommend"]
 A_Code["code_delivery_agent
代码交付助手
(13个工具)
• Docker 容器隔离
• Git 仓库克隆
• 代码编写验证
• 打包上传"]
 end
@@ -331,7 +332,7 @@ graph TB
 class Dir_History,Dir_FAQ,Dir_TokenUsage,Dir_Cognitive,File_Memory,File_EndSummary,File_ScheduledTasks,Dir_Logs,File_Config persistence
 class Prompts,Intros resource
 class QueueManager,ModelQueues,DispatcherLoop queue
- class A_Info,A_Web,A_File,A_Naga,A_Entertainment,A_Code agent
+ class A_Info,A_Web,A_File,A_Naga,A_Self,A_Entertainment,A_Code agent
 ```
 
 ## 二、数据流向图
@@ -493,6 +494,7 @@ graph TB
 WebAgent["web_agent
网络搜索
• MCP Playwright"]
 FileAgent["file_analysis_agent
文件分析"]
 NagaAgent["naga_code_analysis_agent
代码分析"]
+ SelfCodeAgent["undefined_self_code_agent
自身代码查阅"]
 EntAgent["entertainment_agent
娱乐"]
 CodeAgent["code_delivery_agent
代码交付"] end @@ -532,6 +534,7 @@ graph TB AgentToolReg --> WebAgent AgentToolReg --> FileAgent AgentToolReg --> NagaAgent + AgentToolReg --> SelfCodeAgent AgentToolReg --> EntAgent WebAgent --> MCPAgent @@ -854,7 +857,7 @@ description: 从 PDF 文件中提取文本和表格,填写表单。当用户 自动提取由 `PipelineRegistry` 并行检测、并行处理全部命中的管线;发送结果写入历史后继续进入 AI 自动回复。 4. **AI 核心能力层**:AIClient (ai/client/ + client.py shim)、PromptBuilder (ai/prompts/ + prompts.py shim)、ModelRequester (ai/llm/ + llm.py shim)、ToolManager (tooling.py)、MultimodalAnalyzer (ai/multimodal/ + multimodal.py shim)、SummaryService (summaries.py)、TokenCounter (tokens.py) 5. **存储与上下文层**:MessageHistoryManager (utils/history.py, 10000条限制)、MemoryStorage (memory.py, 置顶备忘录, 500条上限)、EndSummaryStorage、CognitiveService + JobQueue + HistorianWorker + VectorStore + ProfileStorage、MemeService + MemeWorker + MemeStore + MemeVectorStore (表情包库)、FAQStorage、ScheduledTaskStorage、TokenUsageStorage (自动归档) -6. **技能系统层**:ToolRegistry (registry.py)、AgentRegistry、6个 Agents、11类 Toolsets +6. **技能系统层**:ToolRegistry (registry.py)、AgentRegistry、7个 Agents、11类 Toolsets 7. **异步 IO 层**:统一 IO 工具 (utils/io.py),包含 write_json、read_json、append_line、跨平台文件锁 (flock/msvcrt) 8. 
**数据持久化层**:历史数据目录、FAQ 目录、Token 归档目录、记忆文件、总结文件、定时任务文件 @@ -868,7 +871,7 @@ description: 从 PDF 文件中提取文本和表格,填写表单。当用户 * **优先级管理**:支持四级优先级(超级管理员 > 私聊 > 群聊@ > 群聊普通),确保重要消息优先响应。 * **关停收敛**:`MessageHandler.close()` 会先 flush `MessageBatcher`,再调用 `QueueManager.drain()` 等待已入队请求和在途请求自然完成,最后才停止队列处理器,避免缓冲消息只入队未执行。 -### 6个智能体 Agent +### 7个智能体 Agent | Agent | 功能定位 | 工具数量 | 核心能力 | |-------|---------|---------|---------| @@ -876,6 +879,7 @@ description: 从 PDF 文件中提取文本和表格,填写表单。当用户 | **web_agent** | 网络搜索助手 | 3个 + MCP | 网页搜索、爬虫、Playwright MCP | | **file_analysis_agent** | 文件分析助手 | 14个 | PDF/Word/Excel/PPT解析、代码分析、多模态分析 | | **naga_code_analysis_agent** | NagaAgent 代码分析 | 7个 | 代码库浏览、文件搜索、目录遍历 | +| **undefined_self_code_agent** | Undefined 自身代码查阅 | 4个 | 受限读取源码、测试、文档、资源、脚本与 App | | **entertainment_agent** | 娱乐助手 | 9个 | AI 绘图、星座运势、小说搜索、随机视频推荐等 | | **code_delivery_agent** | 代码交付助手 | 13个 | Docker 隔离、仓库克隆、代码验证、打包上传 | diff --git a/CHANGELOG.md b/CHANGELOG.md index f63f2b15..a5aaec18 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,3 +1,9 @@ +## Unreleased + +- 新增 `undefined_self_code_agent`,允许 AI 只读查阅 Undefined 自身源码、测试、文档、资源、脚本、配置示例和 App 实现;访问范围限制为指定目录和根目录公开文档,并补充主提示词路由、文档与测试。 + +--- + ## v3.5.1 安全回复、群聊边界与配置表单优化 本版本聚焦三个实际使用中的细节问题:一是群聊里只出现「你/我/他」等人称时,Undefined 更容易误判成在和自己说话;二是面对 prompt 注入或强行改人设的消息时,防御性回复有时过于模板化、攻击性过强,甚至在生成失败时仍会发送兜底脏话;三是 WebUI 配置页对枚举型字段的输入约束不足,容易让用户手填出不合法或不直观的配置值。v3.5.1 因此收紧对话归属判断,强化人设自洽与防注入边界,并把更多配置项改为下拉选择,降低误配置概率。 diff --git a/README.md b/README.md index 632816d0..a722db1b 100644 --- a/README.md +++ b/README.md @@ -160,7 +160,7 @@ set_config(cfg) # opt-in 注入全局单例;CLI 启动链不会调用 # 自动扫描 skills/:tools + toolsets(end / group.* / cognitive.* …) tools = ToolRegistry() -# 自动扫描 skills/agents/:web_agent、code_delivery_agent … +# 自动扫描 skills/agents/:web_agent、undefined_self_code_agent、code_delivery_agent … agents = AgentRegistry() async def main() -> None: diff --git a/code/NagaAgent b/code/NagaAgent index 5b1ca050..eb71318f 160000 --- a/code/NagaAgent +++ b/code/NagaAgent @@ -1 +1 @@ 
-Subproject commit 5b1ca050c877e4aed9bdaf0777418a377f631176 +Subproject commit eb71318f76f2195bbca3a583458c058cc80e27f8 diff --git a/docs/usage.md b/docs/usage.md index 9c81a63b..6ab8b90b 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -118,9 +118,17 @@ Undefined 搭载了基于 ChromaDB 向量数据库的后台认知系统,无需 ### `naga_code_analysis_agent` — NagaAgent 代码分析助手 -专门用于深度分析 NagaAgent 框架及本项目的源代码结构。 +专门用于深度分析 NagaAgent 框架的源代码结构。 -**子工具**:`read_file`、`search_code`、`analyze_structure` +**子工具**:`read_file`、`list_directory`、`glob`、`search_file_content`、`read_naga_intro` + +--- + +### `undefined_self_code_agent` — Undefined 自身代码查阅助手 + +只读查阅 Undefined 当前仓库的源码、测试、文档、资源、脚本、配置示例和 App 实现。访问范围限制为 `src/`、`scripts/`、`tests/`、`res/`、`docs/`、`apps/` 以及根目录 `README.md`、`CHANGELOG.md`、`ARCHITECTURE.md`、`config.toml.example`。 + +**子工具**:`read_file`、`list_directory`、`glob`、`search_file_content` --- diff --git a/res/prompts/undefined.xml b/res/prompts/undefined.xml index 80a375b7..93986e1f 100644 --- a/res/prompts/undefined.xml +++ b/res/prompts/undefined.xml @@ -819,6 +819,7 @@ 善用工具 + 需要查阅 Undefined 自身源码、测试、文档、资源、脚本、配置示例或 App 实现时,调用 undefined_self_code_agent 需要了解图片内容时,调用 file_analysis_agent 需要记住长期稳定的重要信息时,调用 memory.add(或 tools 列表中的对应名称) **不要主动调用无关工具**(天气、金价、新闻等),除非被明确要求 @@ -834,7 +835,7 @@ **先识别,再搜索,最后综合**:遇到图片/文件+问题的组合时,第一步只做内容识别,拿到识别结果后再决定是否需要搜索。 **prompt 只描述 Agent 能力范围内的任务**:调用 file_analysis_agent 时 prompt 应该是"识别图中的游戏和角色名",而不是"分析这个角色怎么养成"。 - **不要指望 Agent 做它不擅长的事**:file_analysis_agent 没有搜索能力,不要让它回答需要外部知识的问题;web_agent 看不到图片,不要让它分析文件。 + **不要指望 Agent 做它不擅长的事**:file_analysis_agent 没有搜索能力,不要让它回答需要外部知识的问题;web_agent 看不到图片,不要让它分析文件;undefined_self_code_agent 仅可只读查阅 Undefined 自身代码,不能写代码或执行命令。 **你是指挥官,Agent 是专家**:你负责拆解任务、分配工作、综合结果。每个 Agent 只提供它专业领域的原子输出。 **能并行就并行**:多个 Agent 调用之间如果没有数据依赖,应在同一轮响应中并行调用以减少延迟。但如果后一个 Agent 的 prompt 依赖前一个 Agent 的结果,则必须等前一个返回后再调用。 **Agent 间互调**:有些 Agent 内部可以调用其他 Agent,以提高效率。这是正常的系统行为,不需要你手动干预。 diff --git a/res/prompts/undefined_nagaagent.xml b/res/prompts/undefined_nagaagent.xml index 
bc7141de..fa4fd188 100644 --- a/res/prompts/undefined_nagaagent.xml +++ b/res/prompts/undefined_nagaagent.xml @@ -870,6 +870,7 @@ 善用工具 遇到 NagaAgent 问题,直接调用 naga_code_analysis_agent + 需要查阅 Undefined 自身源码、测试、文档、资源、脚本、配置示例或 App 实现时,调用 undefined_self_code_agent 需要了解图片内容时,调用 file_analysis_agent 需要记住长期稳定的重要信息时,调用 memory.add(或 tools 列表中的对应名称) **不要主动调用无关工具**(天气、金价、新闻等),除非被明确要求 @@ -885,7 +886,7 @@ **先识别,再搜索,最后综合**:遇到图片/文件+问题的组合时,第一步只做内容识别,拿到识别结果后再决定是否需要搜索。 **prompt 只描述 Agent 能力范围内的任务**:调用 file_analysis_agent 时 prompt 应该是"识别图中的游戏和角色名",而不是"分析这个角色怎么养成"。 - **不要指望 Agent 做它不擅长的事**:file_analysis_agent 没有搜索能力,不要让它回答需要外部知识的问题;web_agent 看不到图片,不要让它分析文件。 + **不要指望 Agent 做它不擅长的事**:file_analysis_agent 没有搜索能力,不要让它回答需要外部知识的问题;web_agent 看不到图片,不要让它分析文件;undefined_self_code_agent 仅可只读查阅 Undefined 自身代码,不能写代码或执行命令。 **你是指挥官,Agent 是专家**:你负责拆解任务、分配工作、综合结果。每个 Agent 只提供它专业领域的原子输出。 **能并行就并行**:多个 Agent 调用之间如果没有数据依赖,应在同一轮响应中并行调用以减少延迟。但如果后一个 Agent 的 prompt 依赖前一个 Agent 的结果,则必须等前一个返回后再调用。 **Agent 间互调**:有些 Agent 内部可以调用其他 Agent,以提高效率。这是正常的系统行为,不需要你手动干预。 diff --git a/src/Undefined/skills/agents/README.md b/src/Undefined/skills/agents/README.md index f1acc322..f9acf715 100644 --- a/src/Undefined/skills/agents/README.md +++ b/src/Undefined/skills/agents/README.md @@ -271,9 +271,15 @@ mv skills/tools/my_tool skills/agents/my_agent/tools/ - **子工具**:`read_file`, `analyze_code`, `analyze_pdf`, `analyze_docx`, `analyze_xlsx` ### naga_code_analysis_agent(NagaAgent 代码分析助手) -- **功能**:专门用于分析 NagaAgent 框架及当前项目的源码 -- **适用场景**:深入分析 NagaAgent 架构、项目代码审查 -- **子工具**:`read_file`, `search_code`, `analyze_structure` +- **功能**:专门用于分析 NagaAgent 框架源码 +- **适用场景**:深入分析 NagaAgent 架构、模块实现、代码线索 +- **子工具**:`read_file`, `list_directory`, `glob`, `search_file_content`, `read_naga_intro` + +### undefined_self_code_agent(Undefined 自身代码查阅助手) +- **功能**:只读查阅 Undefined 当前仓库的源码、测试、文档、资源、脚本、配置示例和 App 实现 +- **适用场景**:解释 Undefined 自身实现、定位模块、核对配置示例、查看测试覆盖 +- **访问范围**:`src/`, `scripts/`, `tests/`, `res/`, `docs/`, `apps/`, `README.md`, `CHANGELOG.md`, 
`ARCHITECTURE.md`, `config.toml.example` +- **子工具**:`read_file`, `list_directory`, `glob`, `search_file_content` ### info_agent(信息查询助手) - **功能**:查询天气、热搜、历史、WHOIS、B 站信息、arXiv 检索等 diff --git a/src/Undefined/skills/agents/naga_code_analysis_agent/tools/read_naga_intro/handler.py b/src/Undefined/skills/agents/naga_code_analysis_agent/tools/read_naga_intro/handler.py index 54cfa4b6..736b3518 100644 --- a/src/Undefined/skills/agents/naga_code_analysis_agent/tools/read_naga_intro/handler.py +++ b/src/Undefined/skills/agents/naga_code_analysis_agent/tools/read_naga_intro/handler.py @@ -3,29 +3,40 @@ # NagaAgent 项目介绍内容(直接嵌入以保证稳定性) NAGA_INTRO_CONTENT = """ ## 项目概览 -- 后端:Python 3.11 + FastAPI + LiteLLM + Pydantic。 -- 前端:Vue 3 + TypeScript + Vite + Electron + PrimeVue + UnoCSS。 -- 统一入口:`main.py`(并行拉起 API/Agent/TTS 等服务)。 -- 配置中心:`system/config.py` + `config.json`(支持 JSON5 注释解析)。 -- 默认端口:API 8000、Agent 8001、MCP 8003(保留)、TTS 5048。 +- README 标识版本 5.1.0。 +- 后端:Python 3.11 + FastAPI + OpenAI/Anthropic/LiteLLM 兼容调用 + Pydantic。 +- 前端:Vue 3 + TypeScript + Vite + Electron,入口在 `frontend/`。 +- 统一入口:`main.py`,负责启动后台任务、API Server、MCP Server、Agent Server、TTS 等服务。 +- 配置中心:`system/config.py` + `config.json`/`config.json.example`,支持运行时配置同步与热更新。 +- 默认端口:API 8000、Agent 8001、MCP 8003、TTS 5048、ASR 5060。 +- 核心能力:流式工具调用、GRAG 知识图谱记忆、MCP 服务、Anthropic-style skills、OpenClaw 电脑操作、DogTag 心跳/屏幕主动感知、旅行探索、游戏攻略。 ## 快速定位 - 服务并行启动逻辑:`main.py` -- API 路由(如 `/chat`、`/chat/stream`):`apiserver/api_server.py` +- API 应用入口与共享状态:`apiserver/api_server.py` +- API 路由(如 `/chat`、`/chat/stream`、配置、会话、工具、论坛、扩展):`apiserver/routes/` - 模型调用/参数拼装:`apiserver/llm_service.py` -- 流式工具调用提取:`apiserver/streaming_tool_extractor.py` +- Agentic 工具调用循环:`apiserver/agentic_tool_loop.py` +- 流式文本处理与 TTS 推送:`apiserver/streaming_tool_extractor.py` - 会话与消息管理:`apiserver/message_manager.py` -- Agent 调度/OpenClaw 执行:`agentserver/agent_server.py` -- 任务调度与任务记忆:`agentserver/task_scheduler.py` +- 上下文压缩:`apiserver/context_compressor.py` +- NagaCAS 
登录与认证:`apiserver/naga_auth.py`、`apiserver/routes/auth.py` +- 运行时控制(如语音暂停):`apiserver/naga_control.py` +- Agent 调度服务:`agentserver/agent_server.py` +- DogTag 心跳/屏幕主动感知:`agentserver/dogtag/` - OpenClaw 连接与运行时:`agentserver/openclaw/` +- MCP 管理与服务注册:`mcpserver/mcp_manager.py`、`mcpserver/mcp_registry.py` +- 内置 MCP agents:`mcpserver/agent_*` - 全局配置结构与端口:`system/config.py` - 配置热更新接口:`system/config_manager.py` -- 系统提示词:`system/prompts/conversation_style_prompt.txt`、`system/prompts/conversation_analyzer_prompt.txt` +- 角色包与提示词:`system/character_bundle.py`、`system/prompts/` - 语音输出服务:`voice/output/start_voice_service.py`、`voice/output/server.py` - 实时语音输入链路:`voice/input/voice_realtime/` - 前端页面路由与主界面:`frontend/src/views/`、`frontend/src/App.vue` - 前端 API 封装:`frontend/src/api/` - Electron 主进程与后端拉起:`frontend/electron/main.ts`、`frontend/electron/modules/backend.ts` +- 技能定义:`skills/*/SKILL.md` +- 游戏攻略/画面理解:`guide_engine/` - 长期记忆(GRAG/图谱):`summer_memory/` ## 目录与文件说明 @@ -33,28 +44,31 @@ | 路径 | 作用 | 常改文件 | |---|---|---| | `main.py` | 项目总入口,负责并行启动服务、端口检查、代理初始化 | `main.py` | -| `apiserver/` | 对话 API 核心(路由、LLM 调用、流式输出、工具调用循环) | `api_server.py`、`llm_service.py`、`streaming_tool_extractor.py`、`message_manager.py` | -| `agentserver/` | 任务调度与电脑控制执行服务(OpenClaw) | `agent_server.py`、`task_scheduler.py`、`openclaw/*.py` | +| `apiserver/` | 对话 API 核心(路由、LLM 调用、流式输出、工具调用循环、认证、论坛代理) | `api_server.py`、`routes/*.py`、`llm_service.py`、`agentic_tool_loop.py` | +| `agentserver/` | Agent 调度、DogTag 心跳/屏幕感知、OpenClaw 集成 | `agent_server.py`、`dogtag/*.py`、`openclaw/*.py` | +| `mcpserver/` | MCP 服务管理、内置 MCP agents、统一工具调用路由 | `mcp_manager.py`、`mcp_registry.py`、`agent_*` | | `system/` | 配置系统、提示词、环境检测、日志初始化 | `config.py`、`config_manager.py`、`system_checker.py`、`prompts/*.txt` | | `voice/` | 语音输入输出能力(TTS/Realtime) | `output/start_voice_service.py`、`output/server.py`、`input/unified_voice_manager.py` | | `summer_memory/` | 记忆系统与图谱检索(五元组、RAG、任务记忆) | `memory_manager.py`、`quintuple_extractor.py`、`quintuple_rag_query.py` | | 
`frontend/` | Vue3 + Electron 前端 | `src/views/*.vue`、`src/api/*.ts`、`electron/main.ts` | +| `guide_engine/` | 游戏攻略、截图识别、RAG/图谱查询与提示词管理 | `guide_service.py`、`query_router.py`、`screenshot_provider.py` | | `skills/` | 内置技能定义(SKILL.md) | `*/SKILL.md` | | `scripts/` | 构建/自动化脚本 | `build-win.py` | | `logs/` | 日志与运行期输出目录 | `logs/*.log` | ## 根目录关键文件(排查优先看) - `config.json`:运行配置(若不存在会尝试由 `config.json.example` 生成)。 -- `pyproject.toml`:Python 依赖与版本约束(`>=3.11,<3.12`)。 +- `pyproject.toml`:项目版本、Python 依赖与版本约束(`>=3.11,<3.12`)。 - `uv.lock`:`uv` 锁定依赖版本。 - `requirements.txt`:传统 pip 安装依赖清单。 - `build.md`:完整打包说明。 -- `naga-backend.spec`:PyInstaller 打包配置。 +- `build.py` / `naga-backend.spec`:跨平台构建与 PyInstaller 打包配置。 - `start.bat`、`setup_venv.bat`:Windows 启动/环境脚本。 +- `proactive_vision_config.json`:屏幕主动感知默认/运行配置。 ## 当前目录状态提示 -- `game/`、`mcpserver/`、`mqtt_tool/`、`nagaagent_core/`、`models/`、`ui/` 目前主要是历史目录/缓存占位(源码文件基本不在这些目录中)。 -- 现阶段开发优先从 `main.py`、`apiserver/`、`agentserver/`、`system/`、`voice/`、`frontend/`、`summer_memory/` 入手。 +- 现阶段开发优先从 `main.py`、`apiserver/`、`agentserver/`、`mcpserver/`、`system/`、`voice/`、`frontend/`、`guide_engine/`、`summer_memory/`、`skills/` 入手。 +- `characters/` 存放角色资源,`vendor/openclaw/` 是 OpenClaw vendor 源码/运行时相关内容。 ## 环境准备 ```bash @@ -67,18 +81,18 @@ pip install -r requirements.txt ``` - Python 版本必须满足:`>=3.11,<3.12`。 +- `api.api_format` 支持 `openai` 与 `anthropic`,默认示例为 DeepSeek OpenAI-compatible API。 - 默认优先使用 `uv run ...` 运行命令。 -## 启动命令 -```bash -cd frontend/ -npm run dev -# 前端 Electron 主进程会调用 backend 模块拉起根目录 main.py -``` +## 启动相关 +- 服务统一入口在 `main.py`。 +- 前端 Electron 主进程会通过 `frontend/electron/modules/backend.ts` 拉起后端。 +- API 与 Agent Server 可从 `apiserver/`、`agentserver/` 下的入口文件继续追踪。 ## 打包相关 -- Windows 一键构建:`python scripts/build-win.py`。 -- 详细流程见:`build.md`。 +- 跨平台构建入口文件:`build.py`。 +- Windows 构建脚本位于 `scripts/`。 +- 详细流程见:`build.md`、`docs/build-windows.md`。 """ diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/README.md 
b/src/Undefined/skills/agents/undefined_self_code_agent/README.md new file mode 100644 index 00000000..a9dad1ee --- /dev/null +++ b/src/Undefined/skills/agents/undefined_self_code_agent/README.md @@ -0,0 +1,18 @@ +# undefined_self_code_agent 智能体 + +面向 Undefined 当前仓库的只读代码查阅助手,提供受限文件读取、目录浏览、glob 匹配和内容检索能力。 + +目录结构: +- `config.json`:智能体定义 +- `intro.md`:给主 AI 看的能力说明 +- `prompt.md`:智能体系统提示词 +- `tools/`:只读代码查阅工具集合 + +访问范围: +- 目录:`src/`、`scripts/`、`tests/`、`res/`、`docs/`、`apps/` +- 根文件:`README.md`、`CHANGELOG.md`、`ARCHITECTURE.md`、`config.toml.example` + +运行机制: +- 由 `AgentRegistry` 自动发现并注册 +- 子工具统一复用 `tools/_shared.py` 的路径白名单与文本读取逻辑 +- 不提供写入、命令执行或联网能力 diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/__init__.py b/src/Undefined/skills/agents/undefined_self_code_agent/__init__.py new file mode 100644 index 00000000..d5f7abfb --- /dev/null +++ b/src/Undefined/skills/agents/undefined_self_code_agent/__init__.py @@ -0,0 +1 @@ +"""Undefined self code inspection agent.""" diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/config.json b/src/Undefined/skills/agents/undefined_self_code_agent/config.json new file mode 100644 index 00000000..87b240ac --- /dev/null +++ b/src/Undefined/skills/agents/undefined_self_code_agent/config.json @@ -0,0 +1,17 @@ +{ + "type": "function", + "function": { + "name": "undefined_self_code_agent", + "description": "Undefined 自身代码查阅助手,用于只读查询当前 Undefined 仓库的源码、测试、文档、资源、脚本与 App 实现细节。", + "parameters": { + "type": "object", + "properties": { + "prompt": { + "type": "string", + "description": "需要查阅 Undefined 自身代码或文档的具体问题" + } + }, + "required": ["prompt"] + } + } +} diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/handler.py b/src/Undefined/skills/agents/undefined_self_code_agent/handler.py new file mode 100644 index 00000000..d5577855 --- /dev/null +++ b/src/Undefined/skills/agents/undefined_self_code_agent/handler.py @@ -0,0 +1,29 @@ +from __future__ import annotations + +import logging +from pathlib 
import Path +from typing import Any + +from Undefined.skills.agents.runner import ( + DEFAULT_AGENT_MAX_ITERATIONS, + run_agent_with_tools, +) + +logger = logging.getLogger(__name__) + + +async def execute(args: dict[str, Any], context: dict[str, Any]) -> str: + """执行 undefined_self_code_agent。""" + + user_prompt = str(args.get("prompt", "")).strip() + return await run_agent_with_tools( + agent_name="undefined_self_code_agent", + user_content=user_prompt, + empty_user_content_message="请提供要查阅的 Undefined 代码问题", + default_prompt="你是 Undefined 项目的只读代码查阅助手。", + context=context, + agent_dir=Path(__file__).parent, + logger=logger, + max_iterations=DEFAULT_AGENT_MAX_ITERATIONS, + tool_error_prefix="错误", + ) diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/intro.md b/src/Undefined/skills/agents/undefined_self_code_agent/intro.md new file mode 100644 index 00000000..1c51e112 --- /dev/null +++ b/src/Undefined/skills/agents/undefined_self_code_agent/intro.md @@ -0,0 +1,21 @@ +# Undefined 自身代码查阅助手 + +## 定位 +只用于回答 **Undefined 项目自身** 的源码、测试、文档、资源、脚本、配置示例和 App 实现细节问题。 + +## 擅长 +- 查阅 `src/`、`scripts/`、`tests/`、`res/`、`docs/`、`apps/` 下的文件 +- 查阅根目录 `README.md`、`CHANGELOG.md`、`ARCHITECTURE.md`、`config.toml.example` +- 浏览目录、按 glob 查找文件、按关键词或正则搜索代码内容 +- 基于实时读取到的文件内容解释当前实现 + +## 边界 +- 只读查阅,不修改文件、不运行命令、不联网搜索 +- 不读取未列入白名单的路径,例如 `.env`、`data/`、`logs/`、`code/`、`pyproject.toml` +- NagaAgent 相关技术问题仍交给 `naga_code_analysis_agent` +- 用户上传文件或外部文件解析仍交给 `file_analysis_agent` +- 代码编写、修改和交付仍交给 `code_delivery_agent` + +## 输入偏好 +- 明确的模块、文件、报错、配置项、测试名或功能点 +- 若问题较宽泛,会先通过目录、glob 或内容搜索缩小范围 diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/prompt.md b/src/Undefined/skills/agents/undefined_self_code_agent/prompt.md new file mode 100644 index 00000000..8e0a904c --- /dev/null +++ b/src/Undefined/skills/agents/undefined_self_code_agent/prompt.md @@ -0,0 +1,20 @@ +你是 Undefined 项目的只读代码查阅助手,目标是帮助用户理解当前 Undefined 仓库内部实现。 + +工作原则: +- 先判断问题是否与 Undefined 自身源码、测试、文档、资源、脚本、配置示例或 App 
实现有关。 +- 如果是宽泛问题,先用 `list_directory`、`glob` 或 `search_file_content` 定位相关文件,再深入具体内容。 +- 用工具获取证据后再下结论,避免凭记忆或猜测回答。 +- 路径只能使用仓库相对路径;不要要求读取绝对路径。 +- 只允许查阅 `src/`、`scripts/`、`tests/`、`res/`、`docs/`、`apps/`,以及根目录 `README.md`、`CHANGELOG.md`、`ARCHITECTURE.md`、`config.toml.example`。 +- 禁止尝试读取 `.env`、`data/`、`logs/`、`.git/`、`code/`、根目录其它文件或任何越界路径。 +- 你只能查阅和解释,不修改代码、不运行命令、不联网搜索。 +- NagaAgent 相关技术问题不由你处理,应建议使用 `naga_code_analysis_agent`。 +- 用户上传文件或外部文件解析不由你处理,应建议使用 `file_analysis_agent`。 +- 代码编写、修改、验证和打包不由你处理,应建议使用 `code_delivery_agent`。 + +表达风格: +- 简洁、结构化,先给结论再给依据。 +- 引用文件路径时使用仓库相对路径。 +- 如果依据不足,说明还需要查阅哪个文件或让用户缩小范围。 + +如果问题涉及“当前时间/今日”等,且工具可用,先调用 `get_current_time` 校准时间。 diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/__init__.py b/src/Undefined/skills/agents/undefined_self_code_agent/tools/__init__.py new file mode 100644 index 00000000..3acb0e63 --- /dev/null +++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/__init__.py @@ -0,0 +1 @@ +"""Tools for Undefined self code inspection agent.""" diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/_shared.py b/src/Undefined/skills/agents/undefined_self_code_agent/tools/_shared.py new file mode 100644 index 00000000..1f71af39 --- /dev/null +++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/_shared.py @@ -0,0 +1,416 @@ +from __future__ import annotations + +import fnmatch +import re +from collections.abc import AsyncIterator +from dataclasses import dataclass +from pathlib import Path, PurePosixPath, PureWindowsPath +from typing import Any + +from Undefined.utils import io as async_io + + +ALLOWED_DIRECTORIES: tuple[str, ...] = ( + "src", + "scripts", + "tests", + "res", + "docs", + "apps", +) +ALLOWED_ROOT_FILES: tuple[str, ...] = ( + "README.md", + "CHANGELOG.md", + "ARCHITECTURE.md", + "config.toml.example", +) +PROJECT_MARKERS: tuple[str, ...] 
= ( + "pyproject.toml", + "src/Undefined", + "config.toml.example", +) +EXCLUDED_DIR_NAMES: frozenset[str] = frozenset( + { + ".git", + ".hg", + ".svn", + ".mypy_cache", + ".pytest_cache", + ".ruff_cache", + ".cache", + ".venv", + "__pycache__", + "node_modules", + "target", + "dist", + "build", + ".vite", + "coverage", + } +) +MAX_TEXT_BYTES = 1_500_000 +DEFAULT_MAX_CHARS = 60_000 +DEFAULT_LINE_LIMIT = 200 +DEFAULT_MAX_RESULTS = 100 +DEFAULT_MAX_MATCHES = 100 +MAX_LINE_LEN = 500 + + +@dataclass(frozen=True) +class ResolvedPath: + repo_root: Path + path: Path + rel_path: str + + +def allowed_roots_text() -> str: + """返回允许访问范围说明。""" + + dirs = ", ".join(f"{name}/" for name in ALLOWED_DIRECTORIES) + files = ", ".join(ALLOWED_ROOT_FILES) + return f"允许目录: {dirs}; 允许根文件: {files}" + + +def find_repo_root(context: dict[str, Any]) -> Path: + """解析 Undefined 仓库根目录。""" + + raw_root = context.get("repo_root") or context.get("project_root") + candidates: list[Path] = [] + if raw_root: + candidates.append(Path(raw_root)) + candidates.append(Path.cwd()) + candidates.extend(Path.cwd().parents) + current = Path(__file__).resolve() + candidates.extend(current.parents) + + seen: set[Path] = set() + for candidate in candidates: + root = candidate.resolve() + if root in seen: + continue + seen.add(root) + if all((root / marker).exists() for marker in PROJECT_MARKERS): + return root + + raise ValueError("无法定位 Undefined 仓库根目录") + + +def _normalize_rel_path(value: str | None) -> str: + rel = str(value or "").strip().replace("\\", "/") + while rel.startswith("./"): + rel = rel[2:] + return rel.rstrip("/") + + +def _is_excluded_by_parts(path: Path, repo_root: Path) -> bool: + try: + parts = path.relative_to(repo_root).parts + except ValueError: + return True + return any(part in EXCLUDED_DIR_NAMES or part.startswith(".") for part in parts) + + +def _is_allowed_relative(rel_path: str, *, allow_root: bool) -> bool: + if rel_path in {"", "."}: + return allow_root + if rel_path in 
ALLOWED_ROOT_FILES: + return True + first = rel_path.split("/", 1)[0] + return first in ALLOWED_DIRECTORIES + + +def is_allowed_path(path: Path, repo_root: Path, *, allow_root: bool = False) -> bool: + """判断路径是否位于允许访问范围内。""" + + try: + rel = path.resolve().relative_to(repo_root.resolve()).as_posix() + except ValueError: + return False + if _is_excluded_by_parts(path.resolve(), repo_root.resolve()): + return False + return _is_allowed_relative(rel, allow_root=allow_root) + + +def resolve_allowed_path( + path_value: str | None, + context: dict[str, Any], + *, + allow_root: bool = False, +) -> ResolvedPath: + """解析并校验仓库相对路径。""" + + repo_root = find_repo_root(context) + rel = _normalize_rel_path(path_value) + target = (repo_root / rel).resolve() if rel else repo_root.resolve() + + try: + rel_path = target.relative_to(repo_root).as_posix() + except ValueError as exc: + raise PermissionError(f"路径越界: {path_value}") from exc + + if rel_path == ".": + rel_path = "" + if not is_allowed_path(target, repo_root, allow_root=allow_root): + raise PermissionError(f"路径不在允许范围内: {path_value or '.'}") + + return ResolvedPath(repo_root=repo_root, path=target, rel_path=rel_path) + + +def resolve_search_root( + path_value: str | None, + context: dict[str, Any], +) -> ResolvedPath: + """解析搜索根路径,空路径表示整个允许范围。""" + + return resolve_allowed_path(path_value, context, allow_root=True) + + +def _iter_allowed_roots(repo_root: Path, root: Path | None = None) -> list[Path]: + """返回允许范围内的扫描根。""" + + base = (root or repo_root).resolve() + if base == repo_root.resolve(): + roots = [repo_root / name for name in ALLOWED_DIRECTORIES] + roots.extend(repo_root / name for name in ALLOWED_ROOT_FILES) + return roots + return [base] + + +async def path_exists(path: Path) -> bool: + """异步检查路径是否存在。""" + + return await async_io.exists(path) + + +async def path_is_file(path: Path) -> bool: + """异步检查路径是否为普通文件。""" + + return await async_io.is_file(path) + + +async def path_is_dir(path: Path) -> bool: + 
"""异步检查路径是否为目录。""" + + return await async_io.is_dir(path) + + +async def iter_allowed_files( + repo_root: Path, + root: Path | None = None, +) -> AsyncIterator[Path]: + """异步遍历允许范围内的文件。""" + + for item in _iter_allowed_roots(repo_root, root): + if not await path_exists(item): + continue + if await path_is_file(item): + if is_allowed_path(item, repo_root): + yield item + continue + if not await path_is_dir(item) or not is_allowed_path(item, repo_root): + continue + async for path in async_io.iter_rglob_files(item): + if is_allowed_path(path, repo_root): + yield path + + +def _normalize_glob_pattern(pattern: str) -> str: + normalized = pattern.strip().replace("\\", "/") + if not normalized: + raise ValueError("glob 模式不能为空") + if ( + PurePosixPath(normalized).is_absolute() + or PureWindowsPath(pattern).is_absolute() + ): + raise ValueError("glob 模式不能是绝对路径") + if ".." in normalized.split("/"): + raise ValueError("glob 模式不能包含 ..") + return normalized + + +def _has_glob_magic(value: str) -> bool: + return any(char in value for char in "*?[") + + +def _root_file_matches(pattern: str, rel_path: str) -> bool: + if pattern.startswith("**/"): + return _root_file_matches(pattern[3:], rel_path) + return PurePosixPath(rel_path).match(pattern) or fnmatch.fnmatchcase( + rel_path, + pattern, + ) + + +def _iter_constrained_glob_roots( + repo_root: Path, + search_root: Path, + pattern: str, +) -> list[tuple[Path, str]]: + if search_root.resolve() != repo_root.resolve(): + return [(search_root.resolve(), pattern)] + + first, separator, remainder = pattern.partition("/") + if not separator: + return [] + if first == "**": + return [(repo_root / name, pattern) for name in ALLOWED_DIRECTORIES] + if first in ALLOWED_DIRECTORIES: + return [(repo_root / first, remainder or "*")] + if _has_glob_magic(first): + return [ + (repo_root / name, remainder or "*") + for name in ALLOWED_DIRECTORIES + if fnmatch.fnmatchcase(name, first) + ] + return [] + + +def collect_allowed_glob_matches( + 
repo_root: Path, + search_root: Path, + pattern: str, + max_results: int, +) -> list[str]: + """按 glob 收集允许范围内的文件路径。调用方需用 asyncio.to_thread 包装。""" + + pattern = _normalize_glob_pattern(pattern) + + matches: list[str] = [] + seen: set[str] = set() + + def add_match(candidate: Path) -> bool: + rel = format_relative(candidate, repo_root) + if rel in seen: + return False + seen.add(rel) + matches.append(rel) + return len(matches) >= max_results + + if search_root.resolve() == repo_root.resolve(): + for name in ALLOWED_ROOT_FILES: + candidate = repo_root / name + if not candidate.is_file() or not _root_file_matches(pattern, name): + continue + if add_match(candidate): + return sorted(matches) + elif search_root.is_file(): + rel = format_relative(search_root, repo_root) + if is_allowed_path(search_root, repo_root) and _root_file_matches(pattern, rel): + add_match(search_root) + return sorted(matches) + + for root, root_pattern in _iter_constrained_glob_roots( + repo_root, + search_root, + pattern, + ): + if not root.exists() or not root.is_dir(): + continue + for candidate in root.glob(root_pattern): + if not candidate.is_file(): + continue + if not is_allowed_path(candidate, repo_root): + continue + if add_match(candidate): + return sorted(matches) + return sorted(matches) + + +async def list_allowed_directory_entries( + repo_root: Path, + directory: Path, +) -> list[tuple[str, bool]]: + """异步列出允许目录项,返回 (relative_path, is_dir)。""" + + entries = sorted( + await async_io.list_directory_entries(directory), + key=lambda item: (not item[1], item[0].name.lower()), + ) + result: list[tuple[str, bool]] = [] + for entry, is_dir in entries: + if not is_allowed_path(entry, repo_root): + continue + result.append((format_relative(entry, repo_root), is_dir)) + return result + + +def is_probably_text(raw: bytes) -> bool: + """粗略判断字节内容是否适合作为文本读取。""" + + if b"\x00" in raw: + return False + if not raw: + return True + control = sum(1 for value in raw if value < 32 and value not in {9, 
10, 12, 13}) + return control <= max(12, len(raw) // 20) + + +def decode_text(raw: bytes) -> str: + """按常见源码/文档编码解码文本。""" + + for encoding in ("utf-8-sig", "utf-8", "gb18030", "latin-1"): + try: + return raw.decode(encoding) + except UnicodeDecodeError: + continue + return raw.decode("utf-8", errors="replace") + + +def read_text_file(path: Path) -> tuple[str, bool, int]: + """读取文本文件,返回内容、是否因大小截断、原始字节数。""" + + size = path.stat().st_size + with open(path, "rb") as file: + raw = file.read(MAX_TEXT_BYTES + 1) + truncated_bytes = len(raw) > MAX_TEXT_BYTES + if truncated_bytes: + raw = raw[:MAX_TEXT_BYTES] + if not is_probably_text(raw): + raise UnicodeError("文件看起来是二进制文件") + return decode_text(raw), truncated_bytes, size + + +def clamp_int(value: Any, default: int, minimum: int, maximum: int) -> int: + """将任意输入规范为整数范围。""" + + try: + number = int(value) + except (TypeError, ValueError): + number = default + return max(minimum, min(maximum, number)) + + +def format_relative(path: Path, repo_root: Path) -> str: + """格式化为仓库相对路径。""" + + return path.resolve().relative_to(repo_root.resolve()).as_posix() + + +def path_matches_include(path: Path, repo_root: Path, include: str) -> bool: + """判断文件是否匹配 include glob。""" + + if not include: + return True + rel = format_relative(path, repo_root) + return fnmatch.fnmatch(rel, include) or fnmatch.fnmatch(path.name, include) + + +def compile_pattern( + pattern: str, + *, + is_regex: bool, + case_sensitive: bool, +) -> re.Pattern[str]: + """编译搜索模式。""" + + flags = 0 if case_sensitive else re.IGNORECASE + source = pattern if is_regex else re.escape(pattern) + return re.compile(source, flags) + + +def trim_line(line: str, max_len: int = MAX_LINE_LEN) -> str: + """截断过长的单行输出。""" + + if len(line) <= max_len: + return line + return line[:max_len] + "..." 
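Review note on `tools/_shared.py`: the access check that this file spreads across `resolve_allowed_path`, `is_allowed_path`, and `_is_allowed_relative` can be condensed into one function for illustration. What follows is a minimal sketch under stated assumptions, not the code added by this diff. `check_access` and the shortened `EXCLUDED` set are hypothetical names introduced here, and the sketch takes the repository root as an argument instead of calling `find_repo_root`.

```python
from pathlib import Path

# Whitelist values copied from _shared.py in this diff.
ALLOWED_DIRECTORIES = ("src", "scripts", "tests", "res", "docs", "apps")
ALLOWED_ROOT_FILES = (
    "README.md",
    "CHANGELOG.md",
    "ARCHITECTURE.md",
    "config.toml.example",
)
# Hypothetical, shortened stand-in for EXCLUDED_DIR_NAMES.
EXCLUDED = frozenset({"__pycache__", "node_modules", "dist", "build"})


def check_access(repo_root: Path, rel: str) -> bool:
    """Return True if `rel` stays inside the whitelist (illustrative only)."""
    root = repo_root.resolve()
    # Resolve first so that ".." segments and symlinks are collapsed
    # before any whitelist comparison happens.
    target = (root / rel).resolve()
    try:
        parts = target.relative_to(root).parts
    except ValueError:
        return False  # resolved outside the repository
    if not parts:
        return False  # the bare root is not a readable file
    # Hidden entries (.env, .git, ...) and cache/build dirs are rejected.
    if any(p.startswith(".") or p in EXCLUDED for p in parts):
        return False
    rel_posix = "/".join(parts)
    return rel_posix in ALLOWED_ROOT_FILES or parts[0] in ALLOWED_DIRECTORIES
```

Under these assumptions, a request such as `src/../pyproject.toml` is rejected even though it starts with an allowed directory, because the path is resolved before it is compared against the whitelist. That is the same order of operations `resolve_allowed_path` uses.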
diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/glob/config.json b/src/Undefined/skills/agents/undefined_self_code_agent/tools/glob/config.json
new file mode 100644
index 00000000..617f2caa
--- /dev/null
+++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/glob/config.json
@@ -0,0 +1,25 @@
+{
+    "type": "function",
+    "function": {
+        "name": "glob",
+        "description": "在 Undefined 仓库允许范围内按 glob 查找文件。",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "pattern": {
+                    "type": "string",
+                    "description": "glob 模式,例如 **/*.py、src/Undefined/**/*.py、docs/*.md"
+                },
+                "base_path": {
+                    "type": "string",
+                    "description": "可选搜索根目录,仓库相对路径"
+                },
+                "max_results": {
+                    "type": "integer",
+                    "description": "最大返回结果数,默认 100,范围 1-500"
+                }
+            },
+            "required": ["pattern"]
+        }
+    }
+}
diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/glob/handler.py b/src/Undefined/skills/agents/undefined_self_code_agent/tools/glob/handler.py
new file mode 100644
index 00000000..33cae56d
--- /dev/null
+++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/glob/handler.py
@@ -0,0 +1,51 @@
+from __future__ import annotations
+
+import asyncio
+from typing import Any
+
+from Undefined.skills.agents.undefined_self_code_agent.tools._shared import (
+    DEFAULT_MAX_RESULTS,
+    allowed_roots_text,
+    clamp_int,
+    collect_allowed_glob_matches,
+    path_exists,
+    resolve_search_root,
+)
+
+
+async def execute(args: dict[str, Any], context: dict[str, Any]) -> str:
+    """Find files within the allowed scope by glob pattern."""
+
+    pattern = str(args.get("pattern") or "").strip()
+    if not pattern:
+        return "错误:pattern 不能为空"
+    base_path = str(args.get("base_path") or "").strip()
+    max_results = clamp_int(args.get("max_results"), DEFAULT_MAX_RESULTS, 1, 500)
+
+    try:
+        resolved = resolve_search_root(base_path, context)
+    except PermissionError as exc:
+        return f"权限不足:{exc}。{allowed_roots_text()}"
+    except ValueError as exc:
+        return f"错误:{exc}"
+
+    if not await path_exists(resolved.path):
+        return f"路径不存在: {base_path or '.'}"
+
+    try:
+        matches = await asyncio.to_thread(
+            collect_allowed_glob_matches,
+            resolved.repo_root,
+            resolved.path,
+            pattern,
+            max_results,
+        )
+    except ValueError as exc:
+        return f"glob 模式无效: {exc}"
+
+    if not matches:
+        return f"未找到匹配的文件: {pattern}"
+    result = "\n".join(matches)
+    if len(matches) >= max_results:
+        result += f"\n\n... (结果已截断,共显示 {max_results} 条)"
+    return result
diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/list_directory/config.json b/src/Undefined/skills/agents/undefined_self_code_agent/tools/list_directory/config.json
new file mode 100644
index 00000000..5864e96b
--- /dev/null
+++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/list_directory/config.json
@@ -0,0 +1,21 @@
+{
+    "type": "function",
+    "function": {
+        "name": "list_directory",
+        "description": "列出 Undefined 仓库允许范围内的目录内容。空路径列出允许访问的顶层范围。",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "path": {
+                    "type": "string",
+                    "description": "目录路径,仓库相对路径;为空时列出允许访问范围"
+                },
+                "max_entries": {
+                    "type": "integer",
+                    "description": "最大返回条目数,默认 120,范围 1-500"
+                }
+            },
+            "required": []
+        }
+    }
+}
diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/list_directory/handler.py b/src/Undefined/skills/agents/undefined_self_code_agent/tools/list_directory/handler.py
new file mode 100644
index 00000000..9502d4cf
--- /dev/null
+++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/list_directory/handler.py
@@ -0,0 +1,59 @@
+from __future__ import annotations
+
+from typing import Any
+
+from Undefined.skills.agents.undefined_self_code_agent.tools._shared import (
+    ALLOWED_DIRECTORIES,
+    ALLOWED_ROOT_FILES,
+    allowed_roots_text,
+    clamp_int,
+    list_allowed_directory_entries,
+    path_exists,
+    path_is_dir,
+    resolve_allowed_path,
+)
+
+
+async def execute(args: dict[str, Any], context: dict[str, Any]) -> str:
+    """List directory contents within the allowed scope."""
+
+    path_arg = str(args.get("path") or "").strip()
+    max_entries = clamp_int(args.get("max_entries"), 120, 1, 500)
+
+    try:
+        resolved = resolve_allowed_path(path_arg, context, allow_root=True)
+    except PermissionError as exc:
+        return f"权限不足:{exc}。{allowed_roots_text()}"
+    except ValueError as exc:
+        return f"错误:{exc}"
+
+    if not path_arg:
+        lines = ["允许访问范围:"]
+        lines.extend(f"📁 {name}/" for name in ALLOWED_DIRECTORIES)
+        lines.extend(f"📄 {name}" for name in ALLOWED_ROOT_FILES)
+        return "\n".join(lines)
+
+    if not await path_exists(resolved.path):
+        return f"目录不存在: {path_arg}"
+    if not await path_is_dir(resolved.path):
+        return f"错误:{path_arg} 不是目录"
+
+    entries = await list_allowed_directory_entries(
+        resolved.repo_root,
+        resolved.path,
+    )
+    visible: list[str] = []
+    for rel, is_dir in entries:
+        icon = "📁" if is_dir else "📄"
+        suffix = "/" if is_dir else ""
+        visible.append(f"{icon} {rel}{suffix}")
+        if len(visible) >= max_entries:
+            break
+
+    if not visible:
+        return f"{resolved.rel_path or '.'}: 无可列出的允许内容"
+
+    total_visible = len(entries)
+    if total_visible > len(visible):
+        visible.append(f"... 还有 {total_visible - len(visible)} 项")
+    return "\n".join(visible)
diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/read_file/config.json b/src/Undefined/skills/agents/undefined_self_code_agent/tools/read_file/config.json
new file mode 100644
index 00000000..91d49f38
--- /dev/null
+++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/read_file/config.json
@@ -0,0 +1,29 @@
+{
+    "type": "function",
+    "function": {
+        "name": "read_file",
+        "description": "读取 Undefined 仓库允许范围内的文本文件。路径必须是仓库相对路径。",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "file_path": {
+                    "type": "string",
+                    "description": "文件路径,例如 src/Undefined/ai/client/setup.py 或 README.md"
+                },
+                "max_chars": {
+                    "type": "integer",
+                    "description": "最大返回字符数,默认 60000,范围 1000-200000"
+                },
+                "offset": {
+                    "type": "integer",
+                    "description": "可选:起始行号,从 1 开始"
+                },
+                "limit": {
+                    "type": "integer",
+                    "description": "可选:读取行数,默认 200,范围 1-2000"
+                }
+            },
+            "required": ["file_path"]
+        }
+    }
+}
diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/read_file/handler.py b/src/Undefined/skills/agents/undefined_self_code_agent/tools/read_file/handler.py
new file mode 100644
index 00000000..e3e57403
--- /dev/null
+++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/read_file/handler.py
@@ -0,0 +1,86 @@
+from __future__ import annotations
+
+import asyncio
+import logging
+from typing import Any
+
+from Undefined.skills.agents.undefined_self_code_agent.tools._shared import (
+    DEFAULT_LINE_LIMIT,
+    DEFAULT_MAX_CHARS,
+    allowed_roots_text,
+    clamp_int,
+    path_exists,
+    path_is_file,
+    read_text_file,
+    resolve_allowed_path,
+)
+
+logger = logging.getLogger(__name__)
+
+
+async def execute(args: dict[str, Any], context: dict[str, Any]) -> str:
+    """Read a text file within the allowed scope."""
+
+    file_path = str(args.get("file_path") or args.get("path") or "").strip()
+    if not file_path:
+        return "错误:file_path 不能为空"
+
+    try:
+        resolved = resolve_allowed_path(file_path, context)
+    except PermissionError as exc:
+        return f"权限不足:{exc}。{allowed_roots_text()}"
+    except ValueError as exc:
+        return f"错误:{exc}"
+
+    if not await path_exists(resolved.path):
+        return f"文件不存在: {file_path}"
+    if not await path_is_file(resolved.path):
+        return f"错误:{file_path} 不是文件"
+
+    try:
+        content, truncated_bytes, size = await asyncio.to_thread(
+            read_text_file, resolved.path
+        )
+    except UnicodeError:
+        return f"错误:{resolved.rel_path} 不是可读取的文本文件"
+    except OSError as exc:
+        logger.exception("读取文件失败: %s", resolved.rel_path)
+        return f"读取文件失败 {resolved.rel_path}: {exc}"
+
+    offset_raw = args.get("offset")
+    limit_raw = args.get("limit")
+    line_window = offset_raw is not None or limit_raw is not None
+    header = f"=== {resolved.rel_path} ({size} bytes) ==="
+
+    if line_window:
+        lines = content.splitlines()
+        total_lines = len(lines)
+        offset = clamp_int(offset_raw, 1, 1, max(total_lines, 1))
+        limit = clamp_int(limit_raw, DEFAULT_LINE_LIMIT, 1, 2000)
+        start_idx = offset - 1
+        selected = lines[start_idx : start_idx + limit]
+        end_line = start_idx + len(selected)
+        body = "\n".join(selected)
+        if total_lines == 0 or not selected:
+            range_header = f"{header}\n行 0-0/0(空文件)"
+        else:
+            range_header = f"{header}\n行 {offset}-{end_line}/{total_lines}"
+        if truncated_bytes:
+            range_header += "\n提示:文件因大小限制只读取了前一部分字节"
+        return f"{range_header}\n{body}"
+
+    max_chars = clamp_int(args.get("max_chars"), DEFAULT_MAX_CHARS, 1000, 200000)
+    total_chars = len(content)
+    truncated_chars = total_chars > max_chars
+    if truncated_chars:
+        content = content[:max_chars]
+
+    notes: list[str] = []
+    if truncated_bytes:
+        notes.append("文件因大小限制只读取了前一部分字节")
+    if truncated_chars:
+        notes.append(f"内容共 {total_chars} 字符,已截断到前 {max_chars} 字符")
+    note_text = "\n".join(f"提示:{note}" for note in notes)
+    if note_text:
+        return f"{header}\n{note_text}\n{content}"
+    return f"{header}\n{content}"
diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/search_file_content/config.json b/src/Undefined/skills/agents/undefined_self_code_agent/tools/search_file_content/config.json
new file mode 100644
index 00000000..2c07ed33
--- /dev/null
+++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/search_file_content/config.json
@@ -0,0 +1,37 @@
+{
+    "type": "function",
+    "function": {
+        "name": "search_file_content",
+        "description": "在 Undefined 仓库允许范围内搜索文本内容,支持普通字符串或正则。",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "pattern": {
+                    "type": "string",
+                    "description": "要搜索的字符串或正则表达式"
+                },
+                "path": {
+                    "type": "string",
+                    "description": "可选搜索路径,仓库相对路径;为空时搜索全部允许范围"
+                },
+                "include": {
+                    "type": "string",
+                    "description": "可选文件 glob 过滤,例如 *.py、src/Undefined/**/*.py"
+                },
+                "is_regex": {
+                    "type": "boolean",
+                    "description": "pattern 是否按正则表达式处理,默认 false"
+                },
+                "case_sensitive": {
+                    "type": "boolean",
+                    "description": "是否大小写敏感,默认 true"
+                },
+                "max_matches": {
+                    "type": "integer",
+                    "description": "最大匹配行数,默认 100,范围 1-500"
+                }
+            },
+            "required": ["pattern"]
+        }
+    }
+}
diff --git a/src/Undefined/skills/agents/undefined_self_code_agent/tools/search_file_content/handler.py b/src/Undefined/skills/agents/undefined_self_code_agent/tools/search_file_content/handler.py
new file mode 100644
index 00000000..b91628d3
--- /dev/null
+++ b/src/Undefined/skills/agents/undefined_self_code_agent/tools/search_file_content/handler.py
@@ -0,0 +1,87 @@
+from __future__ import annotations
+
+import asyncio
+import logging
+import re
+from typing import Any
+
+from Undefined.skills.agents.undefined_self_code_agent.tools._shared import (
+    DEFAULT_MAX_MATCHES,
+    allowed_roots_text,
+    clamp_int,
+    compile_pattern,
+    format_relative,
+    iter_allowed_files,
+    path_matches_include,
+    path_exists,
+    read_text_file,
+    resolve_search_root,
+    trim_line,
+)
+
+logger = logging.getLogger(__name__)
+
+
+async def execute(args: dict[str, Any], context: dict[str, Any]) -> str:
+    """Search text content within the allowed scope."""
+
+    pattern = str(args.get("pattern") or "").strip()
+    if not pattern:
+        return "错误:pattern 不能为空"
+
+    path_arg = str(args.get("path") or "").strip()
+    include = str(args.get("include") or "").strip()
+    is_regex = bool(args.get("is_regex", False))
+    case_sensitive = bool(args.get("case_sensitive", True))
+    max_matches = clamp_int(args.get("max_matches"), DEFAULT_MAX_MATCHES, 1, 500)
+
+    try:
+        resolved = resolve_search_root(path_arg, context)
+    except PermissionError as exc:
+        return f"权限不足:{exc}。{allowed_roots_text()}"
+    except ValueError as exc:
+        return f"错误:{exc}"
+
+    if not await path_exists(resolved.path):
+        return f"路径不存在: {path_arg or '.'}"
+
+    try:
+        compiled = compile_pattern(
+            pattern,
+            is_regex=is_regex,
+            case_sensitive=case_sensitive,
+        )
+    except re.error as exc:
+        return f"正则表达式错误: {exc}"
+
+    matches: list[str] = []
+    try:
+        async for file_path in iter_allowed_files(resolved.repo_root, resolved.path):
+            if not path_matches_include(file_path, resolved.repo_root, include):
+                continue
+            try:
+                text, _truncated, _size = await asyncio.to_thread(
+                    read_text_file,
+                    file_path,
+                )
+            except (OSError, UnicodeError):
+                continue
+            rel = format_relative(file_path, resolved.repo_root)
+            for line_number, line in enumerate(text.splitlines(), start=1):
+                if compiled.search(line):
+                    matches.append(f"{rel}:{line_number}:{trim_line(line.rstrip())}")
+                    if len(matches) >= max_matches:
+                        break
+            if len(matches) >= max_matches:
+                break
+    except Exception as exc:
+        logger.exception("搜索失败: %s", pattern)
+        return f"搜索失败: {exc}"
+
+    if not matches:
+        return f"未找到匹配: {pattern}"
+
+    result = "\n".join(matches)
+    if len(matches) >= max_matches:
+        result += f"\n\n... (结果已截断,共显示 {max_matches} 条匹配)"
+    return result
diff --git a/src/Undefined/utils/io.py b/src/Undefined/utils/io.py
index 8e77733d..e3a986fe 100644
--- a/src/Undefined/utils/io.py
+++ b/src/Undefined/utils/io.py
@@ -7,6 +7,7 @@
 import shutil
 import tempfile
 import time
+from collections.abc import AsyncIterator, Iterator
 from pathlib import Path
 from typing import Any, Optional
 
@@ -185,6 +186,29 @@ async def is_dir(file_path: Path | str) -> bool:
     return await asyncio.to_thread(Path(file_path).is_dir)
 
 
+def _list_directory_entries_sync(directory: Path) -> list[tuple[Path, bool]]:
+    return [(entry, entry.is_dir()) for entry in directory.iterdir()]
+
+
+async def list_directory_entries(directory: Path | str) -> list[tuple[Path, bool]]:
+    """Asynchronously list directory entries with an is-directory flag."""
+    return await asyncio.to_thread(_list_directory_entries_sync, Path(directory))
+
+
+def _next_rglob_file(iterator: Iterator[Path]) -> Path | None:
+    for path in iterator:
+        if path.is_file():
+            return path
+    return None
+
+
+async def iter_rglob_files(directory: Path | str) -> AsyncIterator[Path]:
+    """Asynchronously walk regular files under a directory, recursively."""
+    iterator = Path(directory).rglob("*")
+    while path := await asyncio.to_thread(_next_rglob_file, iterator):
+        yield path
+
+
 async def delete_file(file_path: Path | str) -> bool:
     """异步删除指定文件
 
diff --git a/tests/test_naga_code_analysis_agent.py b/tests/test_naga_code_analysis_agent.py
new file mode 100644
index 00000000..1ec17cfc
--- /dev/null
+++ b/tests/test_naga_code_analysis_agent.py
@@ -0,0 +1,25 @@
+from __future__ import annotations
+
+import pytest
+
+from Undefined.skills.agents.naga_code_analysis_agent.tools.read_naga_intro import (
+    handler as read_naga_intro_handler,
+)
+
+
+@pytest.mark.asyncio
+async def test_read_naga_intro_mentions_current_naga_layout() -> None:
+    result = await read_naga_intro_handler.execute({}, {})
+
+    assert "README 标识版本 5.1.0" in result
+    assert "api_format" in result
+    assert "anthropic" in result
+    assert "apiserver/routes/" in result
+    assert "agentserver/dogtag/" in result
+    assert "OpenClaw" in result
+    assert "mcpserver/mcp_manager.py" in result
+    assert "skills/*/SKILL.md" in result
+    assert "guide_engine/" in result
+    assert "frontend/electron/modules/backend.ts" in result
+    assert "build.py" in result
+    assert "docs/build-windows.md" in result
diff --git a/tests/test_system_prompt_constraints.py b/tests/test_system_prompt_constraints.py
index 20ca2d8f..efa213f1 100644
--- a/tests/test_system_prompt_constraints.py
+++ b/tests/test_system_prompt_constraints.py
@@ -51,6 +51,18 @@ def test_naga_prompt_requires_scope_before_naga_analysis() -> None:
     assert "待范围收窄后再调用 `naga_code_analysis_agent`" in text
 
 
+@pytest.mark.parametrize("path", PROMPT_PATHS)
+def test_system_prompts_route_undefined_self_code_questions(path: Path) -> None:
+    text = path.read_text(encoding="utf-8")
+
+    assert "undefined_self_code_agent" in text
+    assert (
+        "需要查阅 Undefined 自身源码、测试、文档、资源、脚本、配置示例或 App 实现"
+        in text
+    )
+    assert "仅可只读查阅 Undefined 自身代码,不能写代码或执行命令" in text
+
+
 @pytest.mark.parametrize("path", PROMPT_PATHS)
 def test_system_prompts_describe_webui_markdown_and_html_output(path: Path) -> None:
     text = path.read_text(encoding="utf-8")
diff --git a/tests/test_undefined_self_code_agent.py b/tests/test_undefined_self_code_agent.py
new file mode 100644
index 00000000..5b5581c3
--- /dev/null
+++ b/tests/test_undefined_self_code_agent.py
@@ -0,0 +1,265 @@
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Any
+
+import pytest
+
+from Undefined.skills.agents import AgentRegistry
+from Undefined.skills.agents.undefined_self_code_agent.tools.glob import (
+    handler as glob_handler,
+)
+from Undefined.skills.agents.undefined_self_code_agent.tools.list_directory import (
+    handler as list_handler,
+)
+from Undefined.skills.agents.undefined_self_code_agent.tools.read_file import (
+    handler as read_handler,
+)
+from Undefined.skills.agents.undefined_self_code_agent.tools.search_file_content import (
+    handler as search_handler,
+)
+from Undefined.utils import io as async_io
+
+
+AGENT_DIR = (
+    Path(__file__).resolve().parent.parent
+    / "src"
+    / "Undefined"
+    / "skills"
+    / "agents"
+    / "undefined_self_code_agent"
+)
+
+
+async def _make_repo(tmp_path: Path) -> Path:
+    root = tmp_path / "repo"
+    (root / "src" / "Undefined").mkdir(parents=True)
+    (root / "scripts").mkdir()
+    (root / "tests").mkdir()
+    (root / "res").mkdir()
+    (root / "docs").mkdir()
+    (root / "apps" / "undefined-chat" / "src").mkdir(parents=True)
+    (root / "data").mkdir()
+    (root / "logs").mkdir()
+    (root / "code" / "NagaAgent").mkdir(parents=True)
+
+    await async_io.write_text(root / "pyproject.toml", "[project]\nname='x'\n")
+    await async_io.write_text(root / "README.md", "# Undefined\n")
+    await async_io.write_text(root / "CHANGELOG.md", "## Unreleased\n")
+    await async_io.write_text(root / "ARCHITECTURE.md", "AgentRegistry\n")
+    await async_io.write_text(
+        root / "config.toml.example",
+        "[models.agent]\nmodel_name = 'x'\n",
+    )
+    await async_io.write_text(
+        root / "src" / "Undefined" / "main.py",
+        "def run() -> None:\n    print('Undefined')\n",
+    )
+    await async_io.write_text(
+        root / "src" / "Undefined" / ".hidden.py",
+        "hidden = True\n",
+    )
+    await async_io.write_text(root / "scripts" / "tool.py", "print('tool')\n")
+    await async_io.write_text(
+        root / "tests" / "test_main.py",
+        "def test_main() -> None:\n    assert True\n",
+    )
+    await async_io.write_text(root / "res" / "prompt.txt", "prompt\n")
+    await async_io.write_text(root / "docs" / "usage.md", "usage docs\n")
+    await async_io.write_text(
+        root / "apps" / "undefined-chat" / "src" / "App.tsx",
+        "export const App = () => 'chat';\n",
+    )
+    await async_io.write_text(root / "data" / "secret.txt", "secret\n")
+    await async_io.write_text(root / "logs" / "run.log", "log\n")
+    await async_io.write_text(root / "code" / "NagaAgent" / "main.py", "naga\n")
+    await async_io.write_text(root / ".env", "TOKEN=secret\n")
+    return root
+
+
+def _context(root: Path) -> dict[str, Any]:
+    return {"repo_root": root}
+
+
+def test_config_json_schema() -> None:
+    cfg: dict[str, Any] = json.loads((AGENT_DIR / "config.json").read_text("utf-8"))
+    function = cfg["function"]
+
+    assert cfg["type"] == "function"
+    assert function["name"] == "undefined_self_code_agent"
+    assert function["parameters"]["required"] == ["prompt"]
+    assert "prompt" in function["parameters"]["properties"]
+
+
+def test_agent_registry_loads_description_from_intro() -> None:
+    registry = AgentRegistry(AGENT_DIR.parent)
+    schema = {
+        item["function"]["name"]: item["function"]["description"]
+        for item in registry.get_agents_schema()
+    }
+
+    assert "undefined_self_code_agent" in schema
+    assert "Undefined 自身代码查阅助手" in schema["undefined_self_code_agent"]
+    assert "只读查阅" in schema["undefined_self_code_agent"]
+
+
+@pytest.mark.asyncio
+async def test_read_file_allows_allowed_paths(tmp_path: Path) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await read_handler.execute(
+        {"file_path": "src/Undefined/main.py"},
+        _context(root),
+    )
+
+    assert "=== src/Undefined/main.py" in result
+    assert "def run() -> None" in result
+
+
+@pytest.mark.asyncio
+async def test_read_file_allows_config_example_root_file(tmp_path: Path) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await read_handler.execute(
+        {"file_path": "config.toml.example"},
+        _context(root),
+    )
+
+    assert "[models.agent]" in result
+
+
+@pytest.mark.asyncio
+@pytest.mark.parametrize(
+    "path",
+    [
+        "pyproject.toml",
+        ".env",
+        "data/secret.txt",
+        "logs/run.log",
+        "code/NagaAgent/main.py",
+        "src/Undefined/.hidden.py",
+        "../outside.txt",
+    ],
+)
+async def test_read_file_rejects_disallowed_paths(tmp_path: Path, path: str) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await read_handler.execute({"file_path": path}, _context(root))
+
+    assert "权限不足" in result
+    assert "允许目录" in result
+
+
+@pytest.mark.asyncio
+async def test_list_directory_root_only_lists_allowed_scope(tmp_path: Path) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await list_handler.execute({}, _context(root))
+
+    assert "📁 src/" in result
+    assert "📄 README.md" in result
+    assert "data/" not in result
+    assert "pyproject.toml" not in result
+
+
+@pytest.mark.asyncio
+async def test_glob_only_returns_allowed_files(tmp_path: Path) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await glob_handler.execute({"pattern": "**/*.py"}, _context(root))
+
+    assert "src/Undefined/main.py" in result
+    assert "scripts/tool.py" in result
+    assert "tests/test_main.py" in result
+    assert "code/NagaAgent/main.py" not in result
+
+
+@pytest.mark.asyncio
+async def test_glob_handles_allowed_root_files(tmp_path: Path) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await glob_handler.execute({"pattern": "*.md"}, _context(root))
+
+    assert "README.md" in result
+    assert "CHANGELOG.md" in result
+
+
+@pytest.mark.asyncio
+async def test_glob_handles_recursive_pattern_for_allowed_root_files(
+    tmp_path: Path,
+) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await glob_handler.execute({"pattern": "**/*.md"}, _context(root))
+
+    assert "README.md" in result
+    assert "docs/usage.md" in result
+
+
+@pytest.mark.asyncio
+@pytest.mark.parametrize("pattern", ["../*.py", "/tmp/*.py", "src/../*.py"])
+async def test_glob_rejects_traversal_patterns(
+    tmp_path: Path,
+    pattern: str,
+) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await glob_handler.execute({"pattern": pattern}, _context(root))
+
+    assert "glob 模式无效" in result
+
+
+@pytest.mark.asyncio
+async def test_search_only_returns_allowed_files(tmp_path: Path) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await search_handler.execute(
+        {"pattern": "secret", "case_sensitive": False},
+        _context(root),
+    )
+
+    assert "data/secret.txt" not in result
+    assert ".env" not in result
+    assert "未找到匹配" in result
+
+
+@pytest.mark.asyncio
+async def test_search_can_find_allowed_content(tmp_path: Path) -> None:
+    root = await _make_repo(tmp_path)
+
+    result = await search_handler.execute(
+        {"pattern": "Undefined", "path": "src", "include": "*.py"},
+        _context(root),
+    )
+
+    assert "src/Undefined/main.py:2:" in result
+
+
+@pytest.mark.asyncio
+async def test_binary_file_is_rejected(tmp_path: Path) -> None:
+    root = await _make_repo(tmp_path)
+    binary = root / "src" / "Undefined" / "asset.bin"
+    await async_io.write_bytes(binary, b"\x00\x01\x02")
+
+    result = await read_handler.execute(
+        {"file_path": "src/Undefined/asset.bin"},
+        _context(root),
+    )
+
+    assert "不是可读取的文本文件" in result
+
+
+@pytest.mark.asyncio
+async def test_read_file_empty_line_window_has_valid_header(tmp_path: Path) -> None:
+    root = await _make_repo(tmp_path)
+    empty_path = root / "src" / "Undefined" / "empty.py"
+    await async_io.write_text(empty_path, "")
+
+    result = await read_handler.execute(
+        {"file_path": "src/Undefined/empty.py", "offset": 1, "limit": 10},
+        _context(root),
+    )
+
+    assert "行 0-0/0(空文件)" in result
+    assert "行 1-0/0" not in result
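Review note on the access policy exercised by `tests/test_undefined_self_code_agent.py`: the rejection cases pin down what must be refused (root files outside the allowlist, dotfiles, `data/`, `logs/`, `code/`, and `..` traversal), but `resolve_allowed_path` itself lives in `_shared.py` outside this part of the diff. A minimal sketch of a check that would satisfy those cases is given below. The name `check_allowed` and the values of both constants are assumptions inferred from the test fixture, and the real implementation may differ.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Assumed values inferred from the tests; the real constants live in _shared.py.
ALLOWED_DIRECTORIES = ("src", "scripts", "tests", "res", "docs", "apps")
ALLOWED_ROOT_FILES = (
    "README.md",
    "CHANGELOG.md",
    "ARCHITECTURE.md",
    "config.toml.example",
)


def check_allowed(rel_path: str, repo_root: Path) -> Path:
    """Return the resolved path if it is inside the allowlist, else raise."""
    root = repo_root.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / rel_path).resolve()
    try:
        parts = candidate.relative_to(root).parts
    except ValueError:
        raise PermissionError(f"outside repository: {rel_path}") from None
    if not parts or any(part.startswith(".") for part in parts):
        raise PermissionError(f"hidden or root path: {rel_path}")
    if parts[0] in ALLOWED_DIRECTORIES:
        return candidate
    if len(parts) == 1 and parts[0] in ALLOWED_ROOT_FILES:
        return candidate
    raise PermissionError(f"not in allowlist: {rel_path}")


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    for path in ("src/Undefined/main.py", "README.md", ".env", "../outside.txt"):
        try:
            check_allowed(path, repo)
            print(path, "allowed")
        except PermissionError as exc:
            print(path, "rejected:", exc)
```

Resolving before comparing is the design point worth keeping in any variant: a check on the raw string would let `src/../data/secret.txt` through, while the resolved path starts with `data` and is refused.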