Phase L: MCP tool-use + capability chips + CI by dexwritescode · Pull Request #5 · dexwritescode/neurons

dexwritescode · 2026-04-21T23:15:24Z

Summary

L.1 Multi-turn tool-use loop in the inference engine (up to 5 turns, Qwen3 QK norm fix)
L.2 MCP client runtime — stdio transport, McpManager with permission gates, neurons mcp CLI commands (add/remove/list/test)
L.7 Tool capability chips — heuristic inference from model name/type, overridden post-load by supports_tool_use() from C++; wired through proto → service → AppState → Flutter model picker
CI: GitHub Actions workflow on macos-26 arm64 that installs gRPC, caches build/_deps (MLX etc.), builds all targets, and runs unit tests

Test plan

CI workflow triggers on this PR and the Build step passes
neurons mcp add/list/test commands work against a local MCP server
Tool capability chip shows correctly for loaded vs unloaded models in the Models tab
Unit tests pass in CI (integration tests skipped — need model files)

🤖 Generated with Claude Code

Adds tool-use detection and multi-turn execution to the compute/service layers. No model weights change; behavior is identical when tool_cb is not provided.

compute/:
- LanguageModel: add ToolCall struct and four virtual methods (supports_tool_use, format_tool_system_prompt, detect_tool_call, format_tool_result) with safe no-op defaults
- LlamaModel: implement all four for Qwen2.5/Qwen3, Llama-3.1+, and Mistral-tool families; family detected at load time via vocab probe
- model_config: add Qwen3ForCausalLM architecture; model_type "qwen3" dispatches to LlamaModel
- language_model factory: add "qwen3" to LlamaModel dispatch

service/:
- Add ToolCallCb = (ToolCall) → optional<string> typedef
- generate_internal: replace single mdl->generate() call with a multi-turn tool loop (up to 5 turns); detects tool call mid-stream, stops generation, invokes callback, injects result via format_tool_result(), re-encodes context (no BOS on continuations)
- build_prompt: handle qwen3 with the same ChatML template as qwen2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
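The multi-turn loop described in this commit can be sketched roughly as follows. This is an illustrative sketch only, not the code from the PR: `ModelHooks`, `run_tool_loop`, and `kMaxToolTurns` are hypothetical names, the model is represented by plain `std::function` stand-ins, and the context is a string rather than a re-encoded token sequence.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical mirror of the ToolCall struct named in the commit message.
struct ToolCall {
    std::string name;
    std::string arguments;  // raw JSON text (assumed representation)
};

// ToolCallCb = (ToolCall) -> optional<string>, as described in the commit.
using ToolCallCb = std::function<std::optional<std::string>(const ToolCall&)>;

// Stand-ins (assumptions) for mdl->generate(), detect_tool_call()
// and format_tool_result().
struct ModelHooks {
    std::function<std::string(const std::string&)> generate;
    std::function<std::optional<ToolCall>(const std::string&)> detect_tool_call;
    std::function<std::string(const ToolCall&, const std::string&)> format_tool_result;
};

constexpr int kMaxToolTurns = 5;  // "up to 5 turns"

// Generates, and while the output contains a tool call, runs the callback,
// appends the formatted result to the context, and generates again.
std::string run_tool_loop(const ModelHooks& mdl, std::string context,
                          const ToolCallCb& tool_cb) {
    assert(mdl.generate && mdl.detect_tool_call && mdl.format_tool_result);
    std::string output;
    for (int turn = 0; turn < kMaxToolTurns; ++turn) {
        output = mdl.generate(context);
        if (!tool_cb) return output;  // no callback: single-shot behavior
        std::optional<ToolCall> call = mdl.detect_tool_call(output);
        if (!call) return output;     // ordinary answer, no tool requested
        std::optional<std::string> result = tool_cb(*call);
        if (!result) return output;   // callback declined to run the tool
        context += output;
        context += mdl.format_tool_result(*call, *result);
    }
    return output;  // turn budget exhausted; return the last output
}
```

Passing an empty `ToolCallCb` makes the loop return after the first generation, which matches the statement that behavior is unchanged when tool_cb is not provided.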

Qwen3 applies a learned RMSNorm to the Q and K tensors, per head along the head dimension, before RoPE. Without it, attention dot-products are unnormalized and the model produces garbage output. Qwen2/Llama/Mistral weights have no q_norm/k_norm keys, so the probe is a no-op for those families.

Tested: Qwen3-8B-4bit now produces coherent reasoning output (thinking mode). Qwen2.5-3B, Llama-3.1-8B, and all 115 compute_tests pass without regression.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
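For readers unfamiliar with QK norm, the operation can be sketched on plain vectors as below. This is a minimal CPU illustration, not the MLX code in this PR: the function names, the flat `[n_heads * head_dim]` layout, and the `eps` default are assumptions, and the learned weight is assumed to have one entry per head-dimension element, shared across heads.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// RMSNorm over one attention head:
//   out[i] = head[i] / sqrt(mean(head^2) + eps) * weight[i]
std::vector<float> rms_norm_head(const std::vector<float>& head,
                                 const std::vector<float>& weight,
                                 float eps = 1e-6f) {
    assert(!head.empty() && head.size() == weight.size());
    float sum_sq = 0.0f;
    for (float v : head) sum_sq += v * v;
    const float inv_rms =
        1.0f / std::sqrt(sum_sq / static_cast<float>(head.size()) + eps);
    std::vector<float> out(head.size());
    for (std::size_t i = 0; i < head.size(); ++i) {
        out[i] = head[i] * inv_rms * weight[i];
    }
    return out;
}

// Applies the same learned weight to every head of one token's Q (or K)
// projection, stored flat as [n_heads * head_dim]. RoPE would follow.
std::vector<float> qk_norm(const std::vector<float>& x, std::size_t head_dim,
                           const std::vector<float>& weight,
                           float eps = 1e-6f) {
    assert(head_dim > 0 && x.size() % head_dim == 0);
    std::vector<float> out;
    out.reserve(x.size());
    for (std::size_t start = 0; start < x.size(); start += head_dim) {
        const auto first = x.begin() + static_cast<std::ptrdiff_t>(start);
        const auto last = first + static_cast<std::ptrdiff_t>(head_dim);
        const std::vector<float> head(first, last);
        const std::vector<float> normed = rms_norm_head(head, weight, eps);
        out.insert(out.end(), normed.begin(), normed.end());
    }
    return out;
}
```

Because each head is rescaled to roughly unit RMS before the learned weight is applied, two heads that differ only by a scale factor produce the same normalized values.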

Implements the Model Context Protocol (MCP) client layer across service and CLI.

service/src/mcp/:
- mcp_types.h: McpServerConfig, McpPermission, ToolDef
- mcp_client.h/cpp: JSON-RPC 2.0 over stdio subprocess (fork+exec+pipes). Handles initialize handshake, tools/list, tools/call. SSE stubbed.
- mcp_manager.h/cpp: aggregates tools across connected servers; routes tool calls with permission checks (AlwaysAsk/AllowSession/AlwaysAllow/AlwaysDeny); persists server configs to ~/.neurons/mcp_servers.json and permissions to ~/.neurons/mcp_permissions.json; make_tool_call_cb() returns a ToolCallCb that slots directly into generate_internal()'s tool loop from L.1

service/:
- NeuronsServiceImpl gains an McpManager member
- generate_internal() auto-activates McpManager tools when servers are connected and the loaded model supports tool use (no explicit caller change needed)
- Generate gRPC handler now delegates to generate_internal so it also benefits from tool use and avoids duplicate prompt-building

cli/:
- mcp add/remove/list/test subcommands
- mcp sources shared with service via direct include (no new library)
- Tested: add→list→remove round-trip + full protocol test against a Python MCP server (initialize handshake + tools/list listing 2 tools)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
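The four permission levels suggest a gate along the following lines. This is a hedged sketch rather than the McpManager implementation: `PermissionGate`, `AskFn`, and the per-tool keying are hypothetical, unknown tools are assumed to default to AlwaysAsk, and AllowSession is assumed to mean "ask once, then remember the approval until the process exits".

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Permission levels as listed in the commit message.
enum class McpPermission { AlwaysAsk, AllowSession, AlwaysAllow, AlwaysDeny };

// Hypothetical gate consulted before a tool call is routed to a server.
class PermissionGate {
public:
    // Callback that prompts the user; returns true if the call is approved.
    using AskFn = std::function<bool(const std::string& tool)>;

    explicit PermissionGate(AskFn ask) : ask_(std::move(ask)) {
        assert(ask_);
    }

    void set(const std::string& tool, McpPermission p) { stored_[tool] = p; }

    bool allowed(const std::string& tool) {
        const auto it = stored_.find(tool);
        // Assumption: tools with no stored permission are treated as AlwaysAsk.
        const McpPermission p =
            it == stored_.end() ? McpPermission::AlwaysAsk : it->second;
        switch (p) {
            case McpPermission::AlwaysAllow:
                return true;
            case McpPermission::AlwaysDeny:
                return false;
            case McpPermission::AllowSession:
                if (session_ok_.count(tool) > 0) return true;
                if (ask_(tool)) {
                    session_ok_.insert(tool);  // remembered in memory only
                    return true;
                }
                return false;
            case McpPermission::AlwaysAsk:
                return ask_(tool);
        }
        return false;
    }

private:
    AskFn ask_;
    std::map<std::string, McpPermission> stored_;  // would be loaded from disk
    std::set<std::string> session_ok_;             // approvals for this session
};
```

A callback built on top of such a gate could return an empty optional when `allowed()` is false, so the tool loop ends without executing the tool.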

OUTPUT_NAME in cli/CMakeLists.txt updated to "neurons" so the installed binary matches the name used in README examples and the project name. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ports_tool_use

- Add supports_tool_use to LoadModelResponse (field 7) and StatusResponse (field 10) in proto
- Set resp->set_supports_tool_use(model_->supports_tool_use()) in LoadModel + GetStatus handlers
- Add AppState.supportsToolUse populated from LoadModel response and _applyStatus; cleared on unload
- _ModelRow accepts modelType + supportsToolUse?; heuristic uses modelType when available; C++ value overrides chip after load
- Fix leading-underscore lint on local RegExp vars in inferCapabilities()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
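The "heuristic first, authoritative value after load" rule can be illustrated with the sketch below. The real logic lives in the Flutter model picker; this version is written in C++ only to match the other examples here, and `heuristic_tool_use`, `show_tool_chip`, and the substring list are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <optional>
#include <string>

// Heuristic guess used before a model is loaded. Prefers the model type
// when one is available, otherwise falls back to the model name.
// The substring list is an assumption made for illustration.
bool heuristic_tool_use(const std::string& model_name,
                        const std::string& model_type) {
    std::string key = model_type.empty() ? model_name : model_type;
    std::transform(key.begin(), key.end(), key.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    for (const char* needle : {"qwen2", "qwen3", "llama-3.1", "mistral"}) {
        if (key.find(needle) != std::string::npos) return true;
    }
    return false;
}

// Once the model is loaded, the value reported by supports_tool_use()
// overrides the heuristic; an empty optional means "not loaded yet".
bool show_tool_chip(const std::string& model_name,
                    const std::string& model_type,
                    std::optional<bool> loaded_supports_tool_use) {
    if (loaded_supports_tool_use.has_value()) {
        return *loaded_supports_tool_use;
    }
    return heuristic_tool_use(model_name, model_type);
}
```

Keeping the loaded value as an optional makes the "cleared on unload" step a simple reset to the empty state, after which the chip falls back to the heuristic.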

Builds all C++ targets (compute + CLI + service) on macos-26 arm64. Caches build/_deps so MLX only compiles from source on first run. Runs unit tests with integration tests excluded (those need model files). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… builds

model_tests is only added in Debug builds (models/CMakeLists.txt line 65). The all-tests target unconditionally depended on it, causing a missing-target error in Release CI.

Fix: make all-tests conditional on build type in the top-level CMakeLists, and switch CI to Debug so both compute_tests and model_tests are built.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The HttpInterface pure virtual method was added after the mock was written, breaking Debug builds. The mock's override now delegates to requestSync; progress is irrelevant for unit tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Builds all-tests in Debug mode (build-debug/) and runs ctest with the same flags as CI (integration excluded, 120s timeout). Run this before pushing to catch test gaps that Release builds skip. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… log Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ASSERT_TRUE causes a hard test failure when model files are absent. GTEST_SKIP marks the test as skipped, which is correct for CI where ~/.neurons/models/ doesn't exist. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Add GTEST_SKIP guards to the 4 SimpleBpeTokenizer tests that load from tinyllama_model_dir without checking existence first
- Add LABELS integration to model_integration_tests so --label-exclude integration in ctest actually filters it out in CI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dexwritescode and others added 12 commits April 21, 2026 11:47

rename CLI binary: cli → neurons

f927532

build: parametrize MLX version — single source of truth for GIT_TAG +…

b542c30

dexwritescode merged commit 7c35d62 into main Apr 22, 2026
1 check passed

dexwritescode deleted the phase-l-mcp branch April 22, 2026 00:16

dexwritescode merged 12 commits into main from phase-l-mcp
