Phase L: MCP tool-use + capability chips + CI#5
Merged
dexwritescode merged 12 commits into main on Apr 22, 2026
Adds tool-use detection and multi-turn execution to the compute/service layers. No model weights change; behavior is identical when tool_cb is not provided.

compute/:
- LanguageModel: add ToolCall struct and four virtual methods (supports_tool_use, format_tool_system_prompt, detect_tool_call, format_tool_result) with safe no-op defaults
- LlamaModel: implement all four for Qwen2.5/Qwen3, Llama-3.1+, and Mistral-tool families; family detected at load time via vocab probe
- model_config: add Qwen3ForCausalLM architecture; model_type "qwen3" dispatches to LlamaModel
- language_model factory: add "qwen3" to LlamaModel dispatch

service/:
- Add ToolCallCb = (ToolCall) → optional<string> typedef
- generate_internal: replace single mdl->generate() call with a multi-turn tool loop (up to 5 turns); detects tool call mid-stream, stops generation, invokes callback, injects result via format_tool_result(), re-encodes context (no BOS on continuations)
- build_prompt: handle qwen3 with the same ChatML template as qwen2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
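The multi-turn loop described for generate_internal can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `TurnOutput`, `GenerateFn`, `run_tool_loop`, the `[tool_result]` framing, and the fields of `ToolCall` are all hypothetical stand-ins, and stopping when the callback returns no value is an assumption. Only the `ToolCallCb` shape and the 5-turn cap come from the commit message.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-in for the ToolCall struct named in the commit.
struct ToolCall {
    std::string name;
    std::string arguments_json;
};

// Mirrors the "ToolCallCb = (ToolCall) -> optional<string>" typedef.
using ToolCallCb = std::function<std::optional<std::string>(const ToolCall&)>;

// Hypothetical result of one generation pass: the emitted text and,
// if one was detected, the tool call that stopped generation.
struct TurnOutput {
    std::string text;
    std::optional<ToolCall> call;
};
using GenerateFn = std::function<TurnOutput(const std::string& context)>;

constexpr int kMaxToolTurns = 5;  // "up to 5 turns" per the commit message

std::string run_tool_loop(const GenerateFn& generate,
                          std::string context,
                          const ToolCallCb& tool_cb) {
    std::string output;
    for (int turn = 0; turn < kMaxToolTurns; ++turn) {
        TurnOutput out = generate(context);
        output += out.text;

        // No callback or no tool call: same behavior as a single generate().
        if (!tool_cb || !out.call) break;

        // Assumption: an empty optional means the call was declined, so stop.
        std::optional<std::string> result = tool_cb(*out.call);
        if (!result) break;

        // Assumed framing; the real code would use format_tool_result().
        context += out.text + "\n[tool_result]" + *result + "\n";
    }
    return output;
}
```

Passing an empty `ToolCallCb` makes the loop exit after the first pass, which matches the stated guarantee that behavior is unchanged when tool_cb is not provided.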
Qwen3 applies learned RMSNorm to Q and K tensors per-head-dimension before RoPE. Without it attention dot-products are ungated and produce garbage output. Qwen2/Llama/Mistral weights have no q_norm/k_norm keys, so the probe is a no-op for those families. Tested: Qwen3-8B-4bit now produces coherent reasoning output (thinking mode). Qwen2.5-3B, Llama-3.1-8B, and all 115 compute_tests pass without regression. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
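As a rough illustration of the per-head normalization described above, the sketch below applies RMSNorm with a learned weight to each head-sized slice of a flat Q (or K) buffer. It is plain C++ on `std::vector<float>`, not the MLX code from this PR; `rms_norm_head`, `qk_norm`, the flat `[n_heads * head_dim]` layout, and the default epsilon are assumptions made for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normalize one head's vector in place:
//   x[i] <- x[i] / sqrt(mean(x^2) + eps) * weight[i]
void rms_norm_head(float* x, const std::vector<float>& weight, float eps) {
    const std::size_t d = weight.size();
    float sum_sq = 0.0f;
    for (std::size_t i = 0; i < d; ++i) sum_sq += x[i] * x[i];
    const float inv_rms = 1.0f / std::sqrt(sum_sq / static_cast<float>(d) + eps);
    for (std::size_t i = 0; i < d; ++i) x[i] = x[i] * inv_rms * weight[i];
}

// Apply the same learned weight (length head_dim) to every head of a
// flat [n_heads * head_dim] buffer. This would run before RoPE.
void qk_norm(std::vector<float>& qk, const std::vector<float>& weight,
             float eps = 1e-6f) {  // eps value is an assumption
    const std::size_t head_dim = weight.size();
    if (head_dim == 0) return;  // nothing to normalize
    for (std::size_t off = 0; off + head_dim <= qk.size(); off += head_dim) {
        rms_norm_head(qk.data() + off, weight, eps);
    }
}
```

A loader that finds no q_norm/k_norm weights would simply skip the `qk_norm` call, which is how the probe stays a no-op for Qwen2, Llama, and Mistral checkpoints.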
Implements the Model Context Protocol (MCP) client layer across service and CLI.

service/src/mcp/:
- mcp_types.h: McpServerConfig, McpPermission, ToolDef
- mcp_client.h/cpp: JSON-RPC 2.0 over stdio subprocess (fork+exec+pipes). Handles initialize handshake, tools/list, tools/call. SSE stubbed.
- mcp_manager.h/cpp: aggregates tools across connected servers; routes tool calls with permission checks (AlwaysAsk/AllowSession/AlwaysAllow/AlwaysDeny); persists server configs to ~/.neurons/mcp_servers.json and permissions to ~/.neurons/mcp_permissions.json; make_tool_call_cb() returns a ToolCallCb that slots directly into generate_internal()'s tool loop from L.1

service/:
- NeuronsServiceImpl gains an McpManager member
- generate_internal() auto-activates McpManager tools when servers are connected and the loaded model supports tool use (no explicit caller change needed)
- Generate gRPC handler now delegates to generate_internal so it also benefits from tool use and avoids duplicate prompt-building

cli/:
- mcp add/remove/list/test subcommands
- mcp sources shared with service via direct include (no new library)
- Tested: add→list→remove round-trip + full protocol test against a Python MCP server (initialize handshake + tools/list listing 2 tools)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
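The permission check that gates tool-call routing might look something like the sketch below. Only the four permission names come from the commit message; `Decision`, `PermissionTable`, the default of asking for unknown tools, and treating AllowSession the same as AlwaysAllow within a running session are assumptions made for illustration.

```cpp
#include <map>
#include <string>

// The four permission levels named in the commit message.
enum class McpPermission { AlwaysAsk, AllowSession, AlwaysAllow, AlwaysDeny };

// Hypothetical outcome of a permission check before routing a tool call.
enum class Decision { Allow, Deny, AskUser };

// Hypothetical per-tool permission table; the real McpManager also
// persists permissions to ~/.neurons/mcp_permissions.json.
class PermissionTable {
public:
    void set(const std::string& tool, McpPermission p) { stored_[tool] = p; }

    Decision check(const std::string& tool) const {
        // Assumption: tools with no stored permission default to AlwaysAsk.
        auto it = stored_.find(tool);
        const McpPermission p =
            it == stored_.end() ? McpPermission::AlwaysAsk : it->second;

        switch (p) {
            case McpPermission::AlwaysAllow:
            case McpPermission::AllowSession:  // assumed valid for this session
                return Decision::Allow;
            case McpPermission::AlwaysDeny:
                return Decision::Deny;
            case McpPermission::AlwaysAsk:
                return Decision::AskUser;
        }
        return Decision::AskUser;  // unreachable; keeps compilers quiet
    }

private:
    std::map<std::string, McpPermission> stored_;
};
```

A callback built along the lines of make_tool_call_cb() could consult such a table first and only forward the call to the owning server when the decision is `Allow`.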
OUTPUT_NAME in cli/CMakeLists.txt updated to "neurons" so the installed binary matches the name used in README examples and the project name. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ports_tool_use

- Add supports_tool_use to LoadModelResponse (field 7) and StatusResponse (field 10) in proto
- Set resp->set_supports_tool_use(model_->supports_tool_use()) in LoadModel + GetStatus handlers
- Add AppState.supportsToolUse populated from LoadModel response and _applyStatus; cleared on unload
- _ModelRow accepts modelType + supportsToolUse?; heuristic uses modelType when available; C++ value overrides chip after load
- Fix leading-underscore lint on local RegExp vars in inferCapabilities()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Builds all C++ targets (compute + CLI + service) on macos-26 arm64. Caches build/_deps so MLX only compiles from source on first run. Runs unit tests with integration tests excluded (those need model files). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… builds

model_tests is only added in Debug builds (models/CMakeLists.txt line 65). The all-tests target unconditionally depended on it, causing a missing-target error in Release CI. Fix: make all-tests conditional on build type in the top-level CMakeLists, and switch CI to Debug so both compute_tests and model_tests are built.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The HttpInterface pure virtual method was added after the mock was written, breaking Debug builds. Delegates to requestSync — progress is irrelevant for unit tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Builds all-tests in Debug mode (build-debug/) and runs ctest with the same flags as CI (integration excluded, 120s timeout). Run this before pushing to catch test gaps that Release builds skip. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… log Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ASSERT_TRUE causes a hard test failure when model files are absent. GTEST_SKIP marks the test as skipped, which is correct for CI where ~/.neurons/models/ doesn't exist. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add GTEST_SKIP guards to the 4 SimpleBpeTokenizer tests that load from tinyllama_model_dir without checking existence first
- Add LABELS integration to model_integration_tests so --label-exclude integration in ctest actually filters it out in CI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- neurons mcp CLI commands (add/remove/list/test)
- supports_tool_use() from C++; wired through proto → service → AppState → Flutter model picker
- macos-26 arm64: installs gRPC, caches build/_deps (MLX etc.), builds all targets, runs unit tests

Test plan
- neurons mcp add/list/test commands work against a local MCP server

🤖 Generated with Claude Code