[MCP T11] ask MCP tool (NL→Cypher via GraphRAG) #659

@DvirDukhan

Description

Phase 1 ticket T11. Depends on #650 (T3 fixture), #658 (T10 prompts), and transitively #657 (T9 GraphRAG init).

Context

The ask tool is the strategic differentiator. None of the 5 competing code-graph MCP servers expose a natural-language query interface — they all top out at structural traversal. ask lets agents ask "what calls processPayment?" in English and get a grounded answer with the executed Cypher visible for transparency.

How it works (two LLM round-trips bracketing one Cypher query against FalkorDB):

  1. LLM call #1 (Cypher generation): question + ontology → Cypher
  2. FalkorDB: execute Cypher → rows of nodes
  3. LLM call #2 (QA synthesis): question + rows → natural-language answer

The graph itself never goes to the LLM — only the schema and query results — which is why this works on huge codebases.
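The three steps above can be sketched end to end. This is an illustrative stand-in, not the GraphRAG SDK API: `llm_call` and `run_cypher` are hypothetical stubs returning canned data, so only the shape of the pipeline (schema in, rows in, graph never sent) is shown.

```python
# Hypothetical sketch of the two-round-trip flow. llm_call and run_cypher are
# stand-ins for the GraphRAG SDK / FalkorDB client, stubbed with canned data.

def llm_call(prompt: str) -> str:
    # Stand-in for an LLM round-trip; returns canned content per prompt type.
    if "Cypher:" in prompt:
        return 'MATCH (n:Function {name:"processPayment"})<-[:CALLS]-(c) RETURN c'
    return "processPayment is called by checkout and retryPayment."

def run_cypher(cypher: str) -> list[dict]:
    # Stand-in for FalkorDB execution; returns canned rows.
    return [{"name": "checkout"}, {"name": "retryPayment"}]

def ask(question: str, ontology: str) -> dict:
    # Round-trip 1: question + schema (never the graph itself) -> Cypher.
    cypher = llm_call(f"Schema:\n{ontology}\n\nQuestion: {question}\n\nCypher:")
    # One Cypher query against the graph -> rows of nodes.
    rows = run_cypher(cypher)
    # Round-trip 2: question + rows -> natural-language answer.
    answer = llm_call(f"Question: {question}\nRows: {rows}\nAnswer:")
    return {"answer": answer, "cypher_query": cypher, "context_nodes": rows}
```

Note the LLM only ever sees the ontology string and the result rows, which is what keeps token cost flat as the codebase grows.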

All the GraphRAG plumbing already exists in api/llm.py (_create_kg_agent at api/llm.py:238-258, _ask_sync at api/llm.py:260-268, ask at api/llm.py:271-273). T9 extracted the construction; T10 set up the prompt seam. This ticket is the thin async MCP wrapper.

Scope

In:

  • New api/mcp/tools/ask.py registering the ask MCP tool:
    @app.tool()
    async def ask(question: str, project: str | None = None, branch: str | None = None) -> dict:
        kg = get_or_create_kg(project or current_project_name(), branch or "_default")
        # Offload the sync kg.ask() to a thread; get_running_loop() avoids the
        # deprecated get_event_loop()-inside-a-coroutine pattern.
        response = await asyncio.get_running_loop().run_in_executor(None, kg.ask, question)
        return {
            "answer": response.answer,
            "cypher_query": response.cypher_query,    # exposed for transparency
            "context_nodes": response.context_nodes,
        }
  • The cypher_query field is required in the response — the design doc explicitly calls this out as a transparency requirement so the agent can inspect, learn, and debug.
  • Tests in tests/mcp/test_ask.py:
    • Unit test with fully mocked KnowledgeGraph: assert response shape {answer, cypher_query, context_nodes}.
    • Integration test with mocked LiteModel: stub the model to return canned content for both LLM round-trips:
      1. First call (cypher gen): returns a known Cypher targeting the T3 fixture (e.g. MATCH (n:Function {name:"service"})<-[:CALLS]-(c) RETURN c).
      2. Real Cypher executes against the real fixture graph in FalkorDB, returning real nodes.
      3. Second call (QA synthesis): returns a canned answer string.
      4. Assert the response includes the executed cypher in cypher_query and the real nodes in context_nodes.
    • Protocol round-trip: tool registered and callable via stdio client.
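A minimal sketch of the unit-test half of the list above, assuming a fully mocked KnowledgeGraph. The `ask_tool` wrapper mirrors the tool body from the snippet in the scope section; in the real test the registered MCP tool would be called and the KG seam patched, so all names here are illustrative.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import MagicMock

def make_mock_kg():
    # Fully mocked KnowledgeGraph: kg.ask returns a canned response object.
    kg = MagicMock()
    kg.ask.return_value = SimpleNamespace(
        answer="checkout calls processPayment",
        cypher_query='MATCH (n:Function {name:"processPayment"})<-[:CALLS]-(c) RETURN c',
        context_nodes=[{"name": "checkout"}],
    )
    return kg

async def ask_tool(question: str, kg) -> dict:
    # Same run_in_executor pattern as the tool body in the scope section.
    loop = asyncio.get_running_loop()
    response = await loop.run_in_executor(None, kg.ask, question)
    return {
        "answer": response.answer,
        "cypher_query": response.cypher_query,
        "context_nodes": response.context_nodes,
    }

def test_response_shape():
    kg = make_mock_kg()
    result = asyncio.run(ask_tool("what calls processPayment?", kg))
    # The shape assertion from the acceptance criteria.
    assert set(result) == {"answer", "cypher_query", "context_nodes"}
    kg.ask.assert_called_once_with("what calls processPayment?")
```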

Out:

  • Real-LLM E2E (Phase 1.5 nightly with API-key secrets).
  • Streaming responses.
  • Multi-turn conversation memory (each ask is independent).
  • Prompt iteration (Phase 1.5).

Files to create / modify

  • new api/mcp/tools/ask.py
  • modified api/mcp/server.py if needed (auto-discovery of new tools)
  • new tests/mcp/test_ask.py

Acceptance criteria

  • Tool registered with FastMCP and discoverable via session.list_tools().
  • Input schema: question: str, project: str | None = None, branch: str | None = None.
  • Response shape: {answer: str, cypher_query: str, context_nodes: list}.
  • Unit test asserts response shape with fully mocked KnowledgeGraph.
  • Integration test:
    • Mocks LiteModel so neither LLM call hits a real provider.
    • The mocked Cypher-gen response contains real Cypher that executes against the T3 fixture in CI's FalkorDB.
    • Asserts that response.context_nodes contains real nodes from the fixture (not from the mock).
    • Asserts that response.cypher_query matches the mocked Cypher.
  • Protocol round-trip test calls the tool via stdio client and asserts a non-error structured response.
  • CI workflow #649 ([MCP T2] CI workflow with FalkorDB service for MCP tests) green.

Dependencies

  • #650 (T3 fixture), #658 (T10 prompts), and transitively #657 (T9 GraphRAG init), per the Context section above.

Out of scope (do NOT do in this PR)

  • Real-LLM smoke test (Phase 1.5).
  • Streaming, multi-turn, or memory.
  • Prompt iteration.
  • Per-question caching (the KnowledgeGraph is cached per (project, branch) in T9; per-question caching is overkill).

Notes for the implementer

  • The async wrapper around the sync kg.ask() follows the same run_in_executor pattern as the existing api/llm.py:271-273.
  • Auto-detect project from CWD when not provided (similar to T4's branch auto-detect — just use Path.cwd().name).
  • Auto-detect branch from CWD when not provided (reuse the helper from T17).
  • The mocked-LLM integration test is the key innovation here: it gives us real coverage of the Cypher execution path without needing API credentials. Pattern: unittest.mock.patch("graphrag_sdk.models.litellm.LiteModel.ask", side_effect=[cypher_response, qa_response]).
  • Reuse the protocol-test helper from T4.
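The `side_effect` pattern from the notes above can be sketched against a stand-in class, so the two-call mechanics are visible without `graphrag_sdk` installed. The `LiteModel` class and `run_pipeline` below are stubs; in the real test the patch target would be the string `"graphrag_sdk.models.litellm.LiteModel.ask"` from the note.

```python
from unittest.mock import patch

class LiteModel:
    # Stand-in for graphrag_sdk.models.litellm.LiteModel, used only to
    # demonstrate the side_effect mechanics without the real dependency.
    def ask(self, prompt: str) -> str:
        raise RuntimeError("would hit a real LLM provider")

cypher_response = 'MATCH (n:Function {name:"service"})<-[:CALLS]-(c) RETURN c'
qa_response = "service is called by handler and worker."

def run_pipeline(model: LiteModel) -> tuple[str, str]:
    # First call: Cypher generation; second call: QA synthesis.
    cypher = model.ask("generate cypher for: what calls service?")
    answer = model.ask(f"answer the question from rows returned by: {cypher}")
    return cypher, answer

# side_effect=[a, b] makes the first call return a and the second return b,
# so neither LLM round-trip reaches a real provider while the Cypher between
# them can still execute against a real FalkorDB fixture.
with patch.object(LiteModel, "ask", side_effect=[cypher_response, qa_response]):
    cypher, answer = run_pipeline(LiteModel())
```

A third call would raise `StopIteration`, which is a useful free assertion that the pipeline makes exactly two LLM round-trips.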

Metadata


    Labels

    enhancement (New feature or request), mcp (MCP server / model context protocol work)
