Skip to content

Chaos: Adding a few MCP endpointSpec and some refactoring & fixes#218

Open
priyanshuagarwal-harness wants to merge 10 commits into
harness:mainfrom
priyanshuagarwal-harness:CHAOS-11443
Open

Chaos: Adding a few MCP endpointSpec and some refactoring & fixes#218
priyanshuagarwal-harness wants to merge 10 commits into
harness:mainfrom
priyanshuagarwal-harness:CHAOS-11443

Conversation

@priyanshuagarwal-harness
Copy link
Copy Markdown
Contributor

Description

  • Modified & improved existing Chaos MCP endpointSpecs
  • Descriptions are enriched for existing endpointSpecs
  • Added a few new Chaos related MCP EndpointSpecs like chaos_component_variable, chaos_experiment (create), discovered_network_map, etc.

Type of Change

  • Bug fix
  • New feature
  • Refactor
  • Documentation
  • Other

Checklist

  • Tests pass
  • Typecheck passes

… bodyBuilder coercion

- New resources: discovered_namespace, discovered_service (list-only)
- New extractor sdPageExtract with empty-result diagnostic hint
- Reshape chaosDRTestListExtract to consume { items, pagination } envelope
- Expand chaos_fault.list with 8 new filters (search, type, tags, category, etc.)
- Add coerceBody helper to defensively parse string-serialized request bodies
- Add include_all_secrets_accessible_at_scope filter to secret.list
- Remove redundant bodySchema blocks on 4 template execute actions
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 18, 2026

CLA assistant check
All committers have signed the CLA.

…d discovered_network_map

- Rename resource: chaos_network_map -> chaos_application_map (old name kept as search alias)
- chaos_application_map.list switches GET -> POST; adds environment_id/infra_id/search/all/minimal filters
- chaos_application_map.get requires environment_id + infra_id (composite Mongo key)
- New extractor chaosAppMapPageExtract for the { data, page } envelope
- New resource: discovered_network_map (Service Discovery raw inventory)
- Expand chaos_experiment with target_network_map_ids, my_experiments, exclude_automation filters
- Document system-tag semantics (fault=, probe=, workload=, service=) in descriptions
- Add multi-step reasoning playbook to descToolsetChaos

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale comment

Found 4 issues against the architecture standard.

  • High: src/registry/toolsets/chaos.ts now swallows malformed JSON in coerceBody(), which lets bad string bodies reach write actions with defaulted values instead of failing locally.
  • High: src/registry/toolsets/chaos.ts exposes chaos_application_map.get without validating the composite {environment_id, infra_id} key, even though the new description says missing either yields a backend 500.
  • Medium: src/registry/toolsets/chaos.ts overwrites explicit runtime_inputs when experiment_variables or tasks are also present, silently dropping caller data.
  • Medium: src/registry/toolsets/chaos.ts marks chaos_experiment.create.bodySchema.id as required even though the builder auto-generates it, so the agent-facing structured contract is stricter than runtime.

Assumptions:

  • None beyond the behavior reproduced from the PR head.

Change summary:

  • The Chaos expansion is useful and the branch builds/tests cleanly, but the four items above still miss Sunil’s fail-loud + metadata/runtime-alignment bar.

Verification:

  • pnpm build
  • pnpm test -- --runInBand tests/registry/chaos-experiment.test.ts (Vitest actually ran the full suite: 57 files / 1354 tests passed)
  • pnpm docs:check
  • Local repro scripts against origin/pr-218 confirmed the malformed-body, missing-app-map-scope, and runtime_inputs overwrite behaviors.
Open in Web View Automation 

Sent by Cursor Automation: Sunil On Demand Architecture Review

function coerceBody(input: Record<string, unknown>): Record<string, unknown> {
const raw = input.body ?? input;
if (typeof raw === "string") {
try { return JSON.parse(raw) as Record<string, unknown>; } catch { /* fall through */ }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coerceBody() currently swallows malformed JSON and keeps going as if the body were an object. I reproduced this on the PR head with registry.dispatchExecute(..., 'chaos_probe', 'enable', { body: '{bad json' }), and it still emitted a real write request with {"isEnabled": true}. For this repo's fail-loud standard, I think this helper needs to throw on parse failure instead of falling through.

bodySchema: {
description: descBodyExperimentCreate,
fields: [
{ name: "id", type: "string", required: true, description: descExperimentIdUUID },
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The builder below auto-generates id when it is absent, but the structured schema marks it required: true. That means harness_describe / elicitation will tell agents they must provide an ID even though runtime does not need one. Can we make this field optional in bodySchema so the metadata stays aligned with the actual request builder?

},
},
executeActions: {
run: {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assignment overwrites any caller-supplied runtime_inputs from lines 316-317. I reproduced it on the PR head: passing both runtime_inputs and experiment_variables drops the original runtime_inputs object completely. We should either merge the two representations or reject mixed usage, otherwise the shared-tool contract silently loses data.

path: `${CHAOS}/rest/v2/applicationmaps/{mapId}`,
operationPolicy: { risk: "read", retryPolicy: "safe" },
pathParams: { map_id: "mapId" },
queryParams: {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new description says environment_id + infra_id are effectively required here because the backend lookup is keyed on all three values, but there is still no local validation. I verified that registry.dispatch(client, 'chaos_application_map', 'get', { map_id: 'map-1' }) emits a request with neither query param, which pushes callers into the backend 500 path instead of failing locally with a clear missing-parameter error.

@priyanshuagarwal-harness priyanshuagarwal-harness changed the title Chaos 11443 Chaos: Adding a few MCP endpointSpec and some refactoring & fixes May 19, 2026
Copy link
Copy Markdown
Contributor

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale comment

Found 3 architecture issues worth fixing before merge:

  1. chaos_dr_test.list now parses the wrong response envelope and will silently return empty results.
  2. coerceBody() swallows malformed JSON on mutation paths, so bad agent input can turn into defaulted writes instead of a clear validation error.
  3. chaos_application_map.get documents environment_id + infra_id as required but never validates them locally, so callers still hit the backend 500 described in the docs.

I also noticed CI already has a red build-and-test (22) shard (docs:check exited 1), but I stopped at the architecture blockers above.

Open in Web View Automation 

Sent by Cursor Automation: Sunil On Demand Architecture Review

const r = raw as { drtests?: unknown[] };
const items = r.drtests ?? [];
return { items, total: items.length };
const r = raw as { items?: unknown[]; pagination?: { totalItems?: number } };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this parser from { drtests: [...] } to { items, pagination } looks like a regression. The DR-test API docs still return drtests, and main was parsing that shape. With a real response like { drtests: [...] }, this now produces { items: [], total: 0 }, so chaos_dr_test.list silently empties out instead of failing loudly. If the backend contract hasn't changed, this extractor (and the updated descListDRTests pagination prose) should stay on the drtests envelope.

function coerceBody(input: Record<string, unknown>): Record<string, unknown> {
const raw = input.body ?? input;
if (typeof raw === "string") {
try { return JSON.parse(raw) as Record<string, unknown>; } catch { /* fall through */ }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Swallowing JSON.parse failures here violates the fail-loud rule for write paths. A malformed string body now falls through as a plain string, and callers with defaults like chaos_probe.enable / verify still build valid mutation payloads ({ isEnabled: true }, { verify: true }) instead of rejecting bad input locally. Can we throw an actionable error for invalid JSON strings rather than catch {} and continue?

Comment on lines +2319 to +2320
environment_id: "environmentIdentifier",
infra_id: "infraId",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This spec already knows environment_id + infra_id are required for get (descGetApplicationMap says missing either one yields a backend mongo: no documents 500), but nothing here enforces that locally. Registry.dispatch() only checks required listFilterFields on list and required bodySchema fields on the built body, so harness_get(resource_type="chaos_application_map", resource_id=...) without both params still goes straight to that backend 500. I'd add a local guard/preflight so the contract fails loudly.

@priyanshuagarwal-harness priyanshuagarwal-harness marked this pull request as ready for review May 20, 2026 11:46
@priyanshuagarwal-harness priyanshuagarwal-harness marked this pull request as draft May 20, 2026 11:56
@priyanshuagarwal-harness priyanshuagarwal-harness marked this pull request as ready for review May 21, 2026 08:24
priyanshuagarwal-harness and others added 2 commits May 21, 2026 14:05
Co-authored-by: Cursor <cursoragent@cursor.com>
Runs pnpm docs:generate against the merged tree (PR chaos changes +
main's iacm + ai-evals sync) to update the README counts surfaced by
docs:check: 187 -> 191 resource types in narrative and the
architecture diagram. Toolset count (32 default), opt-in iacm (33 total),
and 31 prompt templates already match the regenerated values.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 3 architecture issues worth fixing before merge:

  1. High: chaosDRTestListExtract() now expects { items, pagination }, but the published GET /v3/dr-tests contract still returns drtests. On this head, chaosDRTestListExtract({ drtests:[{ identity:"dr1" }], pagination:{ totalItems:1 } }) returns {"items":[],"total":1}, so chaos_dr_test.list silently drops real results.
  2. High: the new shared coerceBody() helper swallows JSON parse failures and can turn malformed string bodies into real writes. Repro on this head: dispatchExecute("chaos_probe","verify",{ body:"{" }) emits { verify: true } instead of failing locally.
  3. Medium: chaos_application_map.get now documents environment_id + infra_id as required, but the resource still doesn’t validate them before dispatch. Repro on this head issues /applicationmaps/{id} with only org/project params, so callers still hit the backend error path the description warns about.

Assumptions:

  • Reviewed against the current PR head ab341469 from the contributor fork and the public Harness API docs for DR tests and application maps.

Change summary:

  • The README/docs refresh is now in sync, the POST + page.totalItems application-map list wiring matches the public API docs, and the focused local checks passed.

Verification:

  • pnpm build
  • pnpm docs:check
  • pnpm test tests/registry/chaos-probe.test.ts
  • Targeted node --input-type=module repros against the built registry/extractors for the three items above
Open in Web View Automation 

Sent by Cursor Automation: Sunil On Demand Architecture Review

Comment on lines 472 to +475
export const chaosDRTestListExtract = (raw: unknown): { items: unknown[]; total: number } => {
const r = raw as { drtests?: unknown[] };
const items = r.drtests ?? [];
return { items, total: items.length };
const r = raw as { items?: unknown[]; pagination?: { totalItems?: number } };
const items = r.items ?? [];
return { items, total: r.pagination?.totalItems ?? items.length };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The published GET /v3/dr-tests contract still returns the list under drtests, not items. On this PR head, the built extractor reproduces the regression:

chaosDRTestListExtract({ drtests: [{ identity: "dr1" }], pagination: { totalItems: 1 } })
// => { items: [], total: 1 }

That means chaos_dr_test.list will silently drop real results, which misses the fail-loud / contract-alignment bar. Can we keep reading drtests here (and optionally pagination.totalItems) until the API contract actually changes?

Comment on lines +160 to +165
function coerceBody(input: Record<string, unknown>): Record<string, unknown> {
const raw = input.body ?? input;
if (typeof raw === "string") {
try { return JSON.parse(raw) as Record<string, unknown>; } catch { /* fall through */ }
}
return raw as Record<string, unknown>;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper improves the valid double-serialized case, but the empty catch means malformed JSON still falls through as a string. On mutation paths with defaults, that becomes a real write instead of a local validation error. Example on this head:

await reg.dispatchExecute(client, "chaos_probe", "verify", {
  probe_id: "probe-1",
  org_id: "org",
  project_id: "proj",
  body: "{",
});
// request body => { verify: true }

That misses Sunil's fail-loud standard. Could we rethrow when input.body is a string that fails JSON.parse(), rather than silently treating it as an object?

Comment on lines +2318 to +2320
queryParams: {
environment_id: "environmentIdentifier",
infra_id: "infraId",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new description correctly documents environment_id + infra_id as required for chaos_application_map.get, but the resource still doesn't enforce them locally. Repro on this head:

await reg.dispatch(client, "chaos_application_map", "get", {
  map_id: "map-1",
  org_id: "org",
  project_id: "proj",
});
// params => { organizationIdentifier: "org", projectIdentifier: "proj" }

So agents still fall through to the backend error path the description warns about. Can we add a local guard for the composite key before dispatch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants