feat: add knowledge-graph and semantic-layer toolsets to OSS by sunilgattupalle · Pull Request #255 · harness/mcp-server

sunilgattupalle · 2026-05-23T00:31:35Z

Summary

Moves knowledge-graph toolset from mcpServerInternal to OSS: adds kg_queryable_type_summary, kg_grammar, and hql_query (validate/run execute actions)
Moves semantic-layer toolset from mcpServerInternal to OSS: adds kg_type and kg_related_type
Adds src/utils/strip-meta.ts shared utility (consolidates two identical inline copies)
Adds harness_batch_validate_hql and harness_batch_execute_hql tools (parallel fan-out for HQL)
Both toolsets are default-on; batch tools registered in registerAllTools()

Notes

A corresponding cleanup PR in mcpServerInternal follows after this is published as 3.1.0
Until the internal dep bump, mcpServerInternal keeps its own strip-meta.ts copy for data-connector.ts

Test plan

pnpm typecheck passes
pnpm test passes (1473/1473 tests)

🤖 Generated with Claude Code

Moves the Knowledge Graph query toolsets from mcpServerInternal into the public mcp-server package so all users get HQL query capabilities by default. - Add `knowledge-graph` toolset: kg_queryable_type_summary, kg_grammar, hql_query (validate/run) - Add `semantic-layer` toolset: kg_type, kg_related_type - Add `src/utils/strip-meta.ts` shared utility (consolidates duplicate inline copies) - Add `harness_batch_validate_hql` and `harness_batch_execute_hql` tools in hql-batch.ts - Register both toolsets in ALL_TOOLSETS (default-on) and batch tools in registerAllTools() Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> AI-Session-Id: c131f059-3ca1-4319-b21d-1162415e111e AI-Tool: claude-code AI-Model: unknown

CLAassistant · 2026-05-23T00:31:41Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

…lsets Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> AI-Session-Id: c131f059-3ca1-4319-b21d-1162415e111e AI-Tool: claude-code AI-Model: unknown

cursor

Stale comment

Found 2 architecture-standard issues.

src/tools/index.ts:34 registers the new HQL batch tools outside the registry/toolset path. On the built branch, a HARNESS_TOOLSETS=logs registry still exposes harness_batch_validate_hql and harness_batch_execute_hql, so toolset filtering no longer matches the actual public surface.

README.md:11 still documents the 11-tool consolidated architecture, but this branch now publishes 13 tools and does not document the new HQL batch tools or the new knowledge-graph / semantic-layer surfaces in the README tables.

Assumptions:

Treating the registry-backed generic-tool model and HARNESS_TOOLSETS filtering as intentional architecture constraints, per the repo guidance and README.

Change summary:

Reviewed the PR diff, then verified with pnpm typecheck, pnpm build, pnpm docs:check, and a post-build tool-registration probe.

_{Sent by Cursor Automation: Sunil On Demand Architecture Review}

- Replace standalone harness_batch_validate_hql/harness_batch_execute_hql tools with a `queries` batch parameter on harness_execute — keeps tool count at 11, routes through registry for HARNESS_READ_ONLY, elicitation, audit, and retry policy - Fix semantic-layer kg_type list extractor to return a compact-safe projection (identifier, name, category, kind, description) instead of full type objects; full field metadata remains on get — no skipCompact needed - Update kg_related_type get description; strip unused skipCompact flags Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> AI-Session-Id: c131f059-3ca1-4319-b21d-1162415e111e AI-Tool: claude-code AI-Model: unknown

cursor

Stale comment

Found 3 issues that I think should be fixed before this PR is considered architecture-compliant:

High: src/tools/hql-batch.ts adds standalone HQL batch-tool code, but it is not wired into the live registration surface, so the file is dead code on the shipped MCP server and drifts from the repo's consolidated-tool / registry-backed design.

Medium: src/registry/toolsets/semantic-layer.ts advertises a kg_type.list contract that the extractor does not actually return, and it points agents at kg_queryable_type, which is not a registered resource type.

Medium: src/registry/toolsets/semantic-layer.ts relies on a dcs_enrichment list signal, but default harness_list compaction strips that key, so agents cannot see the signal the new docs describe.

Open questions / assumptions:

Reviewed against live PR head ecd2018456156ca52a75073eae2812a6c2d34100.

If src/tools/hql-batch.ts is intentionally staged for a follow-up, I would keep it out of this PR until it is either registry-backed or fully integrated and tested.

Verification:

pnpm build

pnpm exec vitest run tests/registry/registry.test.ts tests/tools/tool-handlers.test.ts

Built-surface probe: HARNESS_TOOLSETS=logs still registers only the 11 consolidated tools (no harness_batch_* tools)

Built runtime probe: compactItems([{..., dcs_enrichment: true}]) drops dcs_enrichment

_{Sent by Cursor Automation: Sunil On Demand Architecture Review}

cursor · 2026-05-23T01:13:41Z

+
+// ─── Registration ────────────────────────────────────────────────────────────
+
+export function registerHqlBatchTools(server: McpServer, client: HarnessClient): void {


registerHqlBatchTools() is never wired into registerAllTools() / server startup on the current head, so the two harness_batch_* tools never appear on the live MCP surface. That leaves this whole file as dead code in the OSS build and also drifts from the repo's documented 11-tool, registry-backed architecture. If HQL batching is meant to ship, it should either go through the existing registry / harness_execute path or be integrated and tested end-to-end; otherwise it should stay out of this PR.

cursor · 2026-05-23T01:13:41Z

+        "config, or data model). List returns all types with field metadata including non-queryable types. " +
+        "Use for exploring the full data model. For HQL query building, use kg_queryable_type instead.",


This description is out of sync with the runtime contract in two ways: schemaTypesExtract() only returns summary fields (identifier, name, category, kind, description), not field metadata, and the HQL-oriented summary resource is kg_queryable_type_summary, not kg_queryable_type. Because harness_describe exposes this string directly, agents following it will both overestimate what kg_type.list returns and be pointed at a non-existent resource_type.

cursor · 2026-05-23T01:13:41Z

+        // Surface dcs_enrichment flag on relationship types so agents know
+        // to fetch full details via get before orchestrating DCS calls.
+        if (category === "relationship") {
+          const annotations = typeObj.annotations as Record<string, unknown>[] | undefined;
+          const hasDcsEnrichment = (annotations ?? []).some(
+            (a) => (a.key as string) === "dcs_enrichment",
+          );
+          if (hasDcsEnrichment) item.dcs_enrichment = true;


schemaTypesExtract() adds dcs_enrichment: true here, but default harness_list output immediately runs through compactItems(), and that whitelist drops dcs_enrichment. I verified on the built branch that compactItems([{..., dcs_enrichment: true}]) returns no dcs_enrichment key, so the signal this comment describes never reaches agents unless callers remember to set compact: false. If this flag is part of the agent contract, this resource needs skipCompact: true or the compact whitelist needs to preserve dcs_enrichment.

- Batch HQL: fail fast on HARNESS_READ_ONLY and policy errors before fan-out (preflight dispatch on first query surfaces registry/policy errors as hard failures rather than per-item success:false results) - Remove hql-batch.ts from git index (was deleted but still appeared in PR diff) - semantic-layer kg_type list: drop dcs_enrichment flag (not compact-safe, was stripped by harness_list compaction anyway — available on get) - semantic-layer kg_type description: clarify list returns compact summary, get returns full field metadata - README: fix architecture diagram (32→34 toolsets), add knowledge-graph and semantic-layer to toolset reference table Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> AI-Session-Id: c131f059-3ca1-4319-b21d-1162415e111e AI-Tool: claude-code AI-Model: unknown

cursor

Found 3 issues that still keep this short of Sunil's architecture bar:

harness_execute batch HQL drops the first query's real response (and its per-query options), so the first item in every batch returns incomplete data.
The KG discovery hints advertise a harness_get call shape that the public tool doesn't accept (type_id / kind as top-level args instead of resource_id + params), so harness_describe is teaching agents a broken contract.
The new queries schema regresses the repo's Zod 4 metadata rule by calling .describe(...).optional(), which strips the description agents rely on.

Verification: pnpm install --frozen-lockfile, pnpm typecheck, pnpm build, pnpm test (1473 passing), plus stdio startup smoke tests with default toolsets and HARNESS_TOOLSETS=logs.

Residual risk: there is still no dedicated coverage for the new KG/semantic-layer resources or the batch execute path, so these issues were not exercised by the current suite.

_{Sent by Cursor Automation: Sunil On Demand Architecture Review}

cursor · 2026-05-23T01:34:31Z

+          // committing to the full fan-out.
+          try {
+            const firstQuery = batchQueries[0]!;
+            await registry.dispatchExecute(client, "hql_query", args.action, { body: { query_string: firstQuery.query_string } }, auditCtxBatch, extra.signal);


This preflight call is executing the first query for real, but the response is discarded. The batch result later reports query 0 as just { query_string, success: true }, so the first item never returns its actual is_valid/errors or rows/columns/stats, and its timeout_ms / max_results are ignored as well. Can we capture and reuse this response (or do a non-executing preflight) before building results?

cursor · 2026-05-23T01:34:31Z

+      executeHint:
+        "1. Learn grammar: harness_get(resource_type='kg_grammar'). " +
+        "2. Discover types: harness_list(resource_type='kg_queryable_type_summary') — note the 'identifier' and 'kind' fields. " +
+        "3. Get fields per type: harness_get(resource_type='kg_type', type_id='<identifier>', kind='<kind>'). " +


This hint doesn't match the public harness_get contract. The tool only accepts resource_id plus params, not top-level type_id / kind, so agents following this example will either fail validation or silently drop the fields. Can we switch the example to something like harness_get(resource_type='kg_type', resource_id='<identifier>', params={ kind: '<kind>' }) and update the similar description above as well?

cursor · 2026-05-23T01:34:31Z

+            timeout_ms: z.number().optional().describe("Per-query timeout in milliseconds"),
+            max_results: z.number().optional().describe("Maximum rows to return"),
+          })
+        ).max(20).describe("Batch HQL queries — use with resource_type='hql_query' and action='validate' or 'run'. Fans out in parallel, returns per-query results.").optional(),


Small standards issue: this uses .describe(...).optional(). Per the repo's Zod 4 rule in CLAUDE.md / AGENTS.md, .describe() needs to be last or the description is lost on the optional wrapper. This means agents won't actually see the queries guidance here. ...max(20).optional().describe(...) would preserve it.

docs: update README counts for knowledge-graph and semantic-layer too…

b907710

…lsets Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> AI-Session-Id: c131f059-3ca1-4319-b21d-1162415e111e AI-Tool: claude-code AI-Model: unknown

cursor Bot reviewed May 23, 2026

View reviewed changes

Comment thread src/tools/index.ts Outdated

Comment thread README.md

cursor Bot reviewed May 23, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add knowledge-graph and semantic-layer toolsets to OSS#255

feat: add knowledge-graph and semantic-layer toolsets to OSS#255
sunilgattupalle wants to merge 4 commits into
mainfrom
feat/knowledge-graph-toolsets

sunilgattupalle commented May 23, 2026

Uh oh!

CLAassistant commented May 23, 2026

Uh oh!

cursor Bot left a comment •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment •

edited

Loading

Uh oh!

cursor Bot May 23, 2026

Uh oh!

cursor Bot May 23, 2026

Uh oh!

cursor Bot May 23, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot May 23, 2026

Uh oh!

cursor Bot May 23, 2026

Uh oh!

cursor Bot May 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants


		// ─── Registration ────────────────────────────────────────────────────────────

		export function registerHqlBatchTools(server: McpServer, client: HarnessClient): void {

		"config, or data model). List returns all types with field metadata including non-queryable types. " +
		"Use for exploring the full data model. For HQL query building, use kg_queryable_type instead.",

Conversation

sunilgattupalle commented May 23, 2026

Summary

Notes

Test plan

Uh oh!

CLAassistant commented May 23, 2026

Uh oh!

cursor Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cursor Bot left a comment •

edited

Loading

cursor Bot left a comment •

edited

Loading