[MCP T11] ask MCP tool (NL→Cypher via GraphRAG) #659

@DvirDukhan

Description

Phase 1 ticket T11. Depends on #650 (T3 fixture), #658 (T10 prompts), and transitively #657 (T9 GraphRAG init).

Context

The ask tool is the strategic differentiator. None of the 5 competing code-graph MCP servers expose a natural-language query interface — they all top out at structural traversal. ask lets agents ask "what calls processPayment?" in English and get a grounded answer with the executed Cypher visible for transparency.

How it works (two LLM round-trips bracketing one Cypher query against FalkorDB):

  1. LLM call #1 (Cypher generation): question + ontology → Cypher
  2. FalkorDB: execute Cypher → rows of nodes
  3. LLM call #2 (QA synthesis): question + rows → natural-language answer

The graph itself never goes to the LLM — only the schema and query results — which is why this works on huge codebases.
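The three steps above can be sketched end to end. This is an illustrative stand-in, not the GraphRAG SDK API: `llm_call` and `run_cypher` are hypothetical stubs returning canned data, so only the shape of the pipeline (schema in, rows in, graph never sent) is shown.

```python
# Hypothetical sketch of the two-round-trip flow. llm_call and run_cypher are
# stand-ins for the GraphRAG SDK / FalkorDB client, stubbed with canned data.

def llm_call(prompt: str) -> str:
    # Stand-in for an LLM round-trip; returns canned content per prompt type.
    if "Cypher:" in prompt:
        return 'MATCH (n:Function {name:"processPayment"})<-[:CALLS]-(c) RETURN c'
    return "processPayment is called by checkout and retryPayment."

def run_cypher(cypher: str) -> list[dict]:
    # Stand-in for FalkorDB execution; returns canned rows.
    return [{"name": "checkout"}, {"name": "retryPayment"}]

def ask(question: str, ontology: str) -> dict:
    # Round-trip 1: question + schema (never the graph itself) -> Cypher.
    cypher = llm_call(f"Schema:\n{ontology}\n\nQuestion: {question}\n\nCypher:")
    # One Cypher query against the graph -> rows of nodes.
    rows = run_cypher(cypher)
    # Round-trip 2: question + rows -> natural-language answer.
    answer = llm_call(f"Question: {question}\nRows: {rows}\nAnswer:")
    return {"answer": answer, "cypher_query": cypher, "context_nodes": rows}
```

Note the LLM only ever sees the ontology string and the result rows, which is what keeps token cost flat as the codebase grows.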

All the GraphRAG plumbing already exists in api/llm.py (_create_kg_agent at api/llm.py:238-258, _ask_sync at api/llm.py:260-268, ask at api/llm.py:271-273). T9 extracted the construction; T10 set up the prompt seam. This ticket is the thin async MCP wrapper.

Scope

In:

  • New api/mcp/tools/ask.py registering the ask MCP tool:
    @app.tool()
    async def ask(question: str, project: str | None = None, branch: str | None = None) -> dict:
        kg = get_or_create_kg(project or current_project_name(), branch or "_default")
        # Offload the sync kg.ask() to a thread; get_running_loop() avoids the
        # deprecated get_event_loop()-inside-a-coroutine pattern.
        response = await asyncio.get_running_loop().run_in_executor(None, kg.ask, question)
        return {
            "answer": response.answer,
            "cypher_query": response.cypher_query,    # exposed for transparency
            "context_nodes": response.context_nodes,
        }
  • The cypher_query field is required in the response — the design doc explicitly calls this out as a transparency requirement so the agent can inspect, learn, and debug.
  • Tests in tests/mcp/test_ask.py:
    • Unit test with fully mocked KnowledgeGraph: assert response shape {answer, cypher_query, context_nodes}.
    • Integration test with mocked LiteModel: stub the model to return canned content for both LLM round-trips:
      1. First call (cypher gen): returns a known Cypher targeting the T3 fixture (e.g. MATCH (n:Function {name:"service"})<-[:CALLS]-(c) RETURN c).
      2. Real Cypher executes against the real fixture graph in FalkorDB, returning real nodes.
      3. Second call (QA synthesis): returns a canned answer string.
      4. Assert the response includes the executed cypher in cypher_query and the real nodes in context_nodes.
    • Protocol round-trip: tool registered and callable via stdio client.
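A minimal sketch of the unit-test half of the list above, assuming a fully mocked KnowledgeGraph. The `ask_tool` wrapper mirrors the tool body from the snippet in the scope section; in the real test the registered MCP tool would be called and the KG seam patched, so all names here are illustrative.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import MagicMock

def make_mock_kg():
    # Fully mocked KnowledgeGraph: kg.ask returns a canned response object.
    kg = MagicMock()
    kg.ask.return_value = SimpleNamespace(
        answer="checkout calls processPayment",
        cypher_query='MATCH (n:Function {name:"processPayment"})<-[:CALLS]-(c) RETURN c',
        context_nodes=[{"name": "checkout"}],
    )
    return kg

async def ask_tool(question: str, kg) -> dict:
    # Same run_in_executor pattern as the tool body in the scope section.
    loop = asyncio.get_running_loop()
    response = await loop.run_in_executor(None, kg.ask, question)
    return {
        "answer": response.answer,
        "cypher_query": response.cypher_query,
        "context_nodes": response.context_nodes,
    }

def test_response_shape():
    kg = make_mock_kg()
    result = asyncio.run(ask_tool("what calls processPayment?", kg))
    # The shape assertion from the acceptance criteria.
    assert set(result) == {"answer", "cypher_query", "context_nodes"}
    kg.ask.assert_called_once_with("what calls processPayment?")
```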

Out:

  • Real-LLM E2E (Phase 1.5 nightly with API-key secrets).
  • Streaming responses.
  • Multi-turn conversation memory (each ask is independent).
  • Prompt iteration (Phase 1.5).

Files to create / modify

  • new api/mcp/tools/ask.py
  • modified api/mcp/server.py if needed (auto-discovery of new tools)
  • new tests/mcp/test_ask.py

Acceptance criteria

  • Tool registered with FastMCP and discoverable via session.list_tools().
  • Input schema: question: str, project: str | None = None, branch: str | None = None.
  • Response shape: {answer: str, cypher_query: str, context_nodes: list}.
  • Unit test asserts response shape with fully mocked KnowledgeGraph.
  • Integration test:
    • Mocks LiteModel so neither LLM call hits a real provider.
    • The mocked Cypher-gen response contains real Cypher that executes against the T3 fixture in CI's FalkorDB.
    • Asserts that response.context_nodes contains real nodes from the fixture (not from the mock).
    • Asserts that response.cypher_query matches the mocked Cypher.
  • Protocol round-trip test calls the tool via stdio client and asserts a non-error structured response.
  • CI workflow #649 ([MCP T2] CI workflow with FalkorDB service for MCP tests) green.

Dependencies

  • #650 (T3 fixture), #658 (T10 prompts), and transitively #657 (T9 GraphRAG init), per the Context section above.

Out of scope (do NOT do in this PR)

  • Real-LLM smoke test (Phase 1.5).
  • Streaming, multi-turn, or memory.
  • Prompt iteration.
  • Per-question caching (the KnowledgeGraph is cached per (project, branch) in T9; per-question caching is overkill).

Notes for the implementer

  • The async wrapper around the sync kg.ask() follows the same run_in_executor pattern as the existing api/llm.py:271-273.
  • Auto-detect project from CWD when not provided (similar to T4's branch auto-detect — just use Path.cwd().name).
  • Auto-detect branch from CWD when not provided (reuse the helper from T17).
  • The mocked-LLM integration test is the key innovation here: it gives us real coverage of the Cypher execution path without needing API credentials. Pattern: unittest.mock.patch("graphrag_sdk.models.litellm.LiteModel.ask", side_effect=[cypher_response, qa_response]).
  • Reuse the protocol-test helper from T4.
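The `side_effect` pattern from the notes above can be sketched against a stand-in class, so the two-call mechanics are visible without `graphrag_sdk` installed. The `LiteModel` class and `run_pipeline` below are stubs; in the real test the patch target would be the string `"graphrag_sdk.models.litellm.LiteModel.ask"` from the note.

```python
from unittest.mock import patch

class LiteModel:
    # Stand-in for graphrag_sdk.models.litellm.LiteModel, used only to
    # demonstrate the side_effect mechanics without the real dependency.
    def ask(self, prompt: str) -> str:
        raise RuntimeError("would hit a real LLM provider")

cypher_response = 'MATCH (n:Function {name:"service"})<-[:CALLS]-(c) RETURN c'
qa_response = "service is called by handler and worker."

def run_pipeline(model: LiteModel) -> tuple[str, str]:
    # First call: Cypher generation; second call: QA synthesis.
    cypher = model.ask("generate cypher for: what calls service?")
    answer = model.ask(f"answer the question from rows returned by: {cypher}")
    return cypher, answer

# side_effect=[a, b] makes the first call return a and the second return b,
# so neither LLM round-trip reaches a real provider while the Cypher between
# them can still execute against a real FalkorDB fixture.
with patch.object(LiteModel, "ask", side_effect=[cypher_response, qa_response]):
    cypher, answer = run_pipeline(LiteModel())
```

A third call would raise `StopIteration`, which is a useful free assertion that the pipeline makes exactly two LLM round-trips.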

Metadata


    Labels

    enhancement (New feature or request), mcp (MCP server / model context protocol work)
