From 2ec4c3c6e63e428ff44d2ad9d697f50906ab9bc3 Mon Sep 17 00:00:00 2001
From: John Chilton <jmchilton@gmail.com>
Date: Wed, 17 Jun 2026 17:43:53 -0400
Subject: [PATCH] discover-shed-tool: normalize tool-id-shaped queries before
 declaring miss
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

A tool-id token (`integron_finder`, `iuc/integron_finder`) — what a template
`_plan_context` candidate hands the discover step — scores zero in `tool-search`
while the human name (`integron finder`) hits the real `iuc/integron_finder`
wrapper. Without normalization the loop emits a false `miss` and wrongly
fall-throughs to authoring a tool that already exists.

- index.md §1: strip `owner/`, split `_`/`-` into words, try human name + bare
  word; `miss` only honest after variants tried. Strengthen the caveat.
- eval.md: add property "a miss survives query-variant normalization".
- re-cast claude bundle (verify clean).
- refinement journals: discover-shed-tool (applied fix) + advance-galaxy-draft-step
  (the test-pipeline finding; upstream _plan_context-hint question stays open).

Surfaced by /test-pipeline interview-to-galaxy UC1 MRSA re-run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../claude/skills/discover-shed-tool/SKILL.md |  4 +-
 .../discover-shed-tool/_provenance.json       |  6 +-
 ...6-06-17-uc1-mrsa-interview-galaxy-rerun.md | 56 +++++++++++++++++++
 content/molds/discover-shed-tool/eval.md      | 11 ++++
 content/molds/discover-shed-tool/index.md     |  4 +-
 .../2026-06-17-tool-id-query-normalization.md | 42 ++++++++++++++
 6 files changed, 118 insertions(+), 5 deletions(-)
 create mode 100644 content/molds/advance-galaxy-draft-step/refinements/2026-06-17-uc1-mrsa-interview-galaxy-rerun.md
 create mode 100644 content/molds/discover-shed-tool/refinements/2026-06-17-tool-id-query-normalization.md
diff --git a/casts/claude/skills/discover-shed-tool/SKILL.md b/casts/claude/skills/discover-shed-tool/SKILL.md
index b5a42d9..72ea725 100644
--- a/casts/claude/skills/discover-shed-tool/SKILL.md
+++ b/casts/claude/skills/discover-shed-tool/SKILL.md
@@ -100,6 +100,8 @@ gxwf tool-search "<keywords>" --json --max-results 10
 
 If an owner hint is present, add `--owner <owner>`. If an exact-name hint is present, add `--match-name`. Lowercase the query (the tool index does not lowercase, see component-tool-shed-search §6).
 
+**Normalize a tool-id-shaped need before searching.** A caller (e.g. a template `_plan_context` that guessed a candidate) may hand this skill an XML-id token rather than a human name — `iuc/integron_finder`, `integron_finder`. The lexical index does **not** reliably match the underscored token: `tool-search integron_finder` returns no hits while `integron finder` and `integron` both score. So derive query variants instead of feeding the token verbatim: strip any `owner/` prefix, split on `_` / `-` into space-separated words, and also try the bare significant word. A `miss` is only honest after the name variants have been tried — a no-hit on the raw underscored token alone is a search artifact, not evidence the tool is absent.
+
 #### 2. Triage hits
 
 For each hit, score on:
@@ -144,7 +146,7 @@ Validate the recommendation with `validate-galaxy-tool-discovery` before returni
 The procedure assumes — and the skill must surface in its rationale when relevant — the following Tool Shed realities (full detail in component-tool-shed-search §6):
 
 - **Indexes are stale by design.** A freshly published tool may not appear; a deprecated tool may still appear. Treat absence as soft evidence, not proof.
-- **Wildcard `*term*` wrapping** disables stemming; spelling matters. Try alternate phrasings before declaring `miss`.
+- **Wildcard `*term*` wrapping** disables stemming; spelling matters. Try alternate phrasings before declaring `miss`. In particular an underscored or owner-prefixed **tool-id token** (`integron_finder`, `iuc/integron_finder`) can score zero where the **human name** (`integron finder`) hits — never declare `miss` on a single id-token query (see §1 normalization).
 - **No EDAM in shed search** — semantic queries that work in Galaxy's installed-toolbox search will not work here. Stick to lexical name/keyword queries.
 - **Same XML id across repos.** Hits collapse only on `(repoName, owner)`; expect duplicates that need triage.
 - **Repo-level discovery is a different surface.** For "find me the *package* that contains a tool about X" with server-side `owner:` / `category:` keywords, `gxwf repo-search` is the right command — out of scope for this skill but a known sibling.
diff --git a/casts/claude/skills/discover-shed-tool/_provenance.json b/casts/claude/skills/discover-shed-tool/_provenance.json
index 4c8dfde..cf1e68c 100644
--- a/casts/claude/skills/discover-shed-tool/_provenance.json
+++ b/casts/claude/skills/discover-shed-tool/_provenance.json
@@ -5,10 +5,10 @@
     "name": "discover-shed-tool",
     "path": "content/molds/discover-shed-tool/index.md",
     "revision": 4,
-    "content_hash": "256e3861e4fd54e5afe1db0cb99954523faea71c06d46b32b2442d39fb646056",
-    "commit": "4e21bc4ffe429471256a2d268127e2a807769fa8"
+    "content_hash": "2e5fa1f8b22f1f9c984fa4c4dc05999294de5db706a9c1691d3af7b3459cbecb",
+    "commit": "80a19256fffab433c59f19940f1464d2e7a8779b"
   },
-  "cast_at": "2026-06-17T19:52:17.013Z",
+  "cast_at": "2026-06-17T21:07:28.232Z",
   "cast_date": "2026-05-07",
   "cast_revision": 5,
   "cast_history": [
diff --git a/content/molds/advance-galaxy-draft-step/refinements/2026-06-17-uc1-mrsa-interview-galaxy-rerun.md b/content/molds/advance-galaxy-draft-step/refinements/2026-06-17-uc1-mrsa-interview-galaxy-rerun.md
new file mode 100644
index 0000000..f0d6d84
--- /dev/null
+++ b/content/molds/advance-galaxy-draft-step/refinements/2026-06-17-uc1-mrsa-interview-galaxy-rerun.md
@@ -0,0 +1,56 @@
+---
+mold: advance-galaxy-draft-step
+date: 2026-06-17
+intent: Fresh /test-pipeline re-run of interview-to-galaxy on the UC1 MRSA mobile-AMR case; drive the per-step loop's discover/resolve half against the live Tool Shed.
+decision: open-question
+---
+
+## What worked
+
+- The loop oracles behaved exactly to spec on a real 15-step draft:
+  `draft-next-step` deterministically picked `integron_finder` (stable across
+  re-runs, `work` listed the step's TODO sentinels + `_plan_*`);
+  `draft-validate --concrete` → `draft valid` + `Concrete: OK` with the one
+  Resolved step's tool_state validated and the 14 drafty steps excluded from the
+  concrete projection; `draft-extract` returned the maximal concrete prefix
+  (staramr only, dependents dropped).
+- The discover/resolve half resolved the named domain tools against the live
+  Tool Shed to real pinnable changesets: `iuc/isescan/isescan` @ `81539b9ae80a`
+  (1.7.3+galaxy0, newest) and `iuc/integron_finder/integron_finder` @
+  `5429646e486d` (2.0.5+galaxy1, newest).
+
+## Gap that surfaced — discover query normalization
+
+`gxwf tool-search integron_finder` (the verbatim **tool-id token**, which is what
+the template wrote into the step's `_plan_context` candidate —
+"candidate iuc/integron_finder") returns **`No hits for query: integron_finder`**.
+Only the **human-name** variants score: `integron finder` (34.60),
+`integronfinder` (15.97), `integron` (51.33). ISEScan happened to score on its
+bare token (`isescan` → 27.78) so it masked the issue.
+
+Concretely: a discover-shed-tool invocation that feeds the raw `_plan_context`
+candidate string (`integron_finder`, an underscored tool-id token) straight into
+`tool-search` would conclude "no acceptable shed candidate" and fall through to
+`author-galaxy-tool-wrapper` — authoring a wrapper for a tool that **already
+exists** in `iuc`. That's a false-negative discovery, the expensive wrong branch.
+
+## Open question
+
+Where should query normalization live, and what should it do?
+
+- The loop's step 2 ("Resolve a wrapper… run discover-shed-tool against the
+  step's `_plan_*` context") implies the query is derived from `_plan_context`.
+  If templates write tool-id-shaped candidates (`iuc/integron_finder`), discovery
+  needs to **variant-expand**: strip owner prefix, split on `_`/`-`, try the
+  human name, not just the literal token.
+- Alternatively the template Mold should record the **human tool name** in
+  `_plan_context` (e.g. "Integron Finder") alongside or instead of the tool-id
+  guess, since the id token is exactly what a fresh search should not assume.
+- Either way the eval property "no acceptable shed candidate falls through to
+  authoring" has a hidden precondition — that the *query* was a fair search.
+  A single-token search that misses an existing tool is a discovery bug
+  masquerading as a legitimate fallthrough. Worth an eval note or a
+  discover-shed-tool refinement on query construction.
+
+Not changing schema or references here — flagging the query-construction contract
+between template `_plan_context`, the loop's discover step, and discover-shed-tool.
diff --git a/content/molds/discover-shed-tool/eval.md b/content/molds/discover-shed-tool/eval.md
index 3dd4efc..b8fd415 100644
--- a/content/molds/discover-shed-tool/eval.md
+++ b/content/molds/discover-shed-tool/eval.md
@@ -19,3 +19,14 @@ need it ran on. Concrete fixtures and their expected pins live in
   or similarly named repositories, the recommendation classifies the result as
   `weak`, explains the ambiguity, and surfaces alternates instead of silently
   pinning a low-confidence hit.
+
+## Property: a miss survives query-variant normalization
+
+- check: llm-judged
+- assertion: a `miss` is not emitted on the strength of a single raw query when
+  the need is a tool-id-shaped token (underscored or `owner/`-prefixed). Before
+  falling through to authoring, the run tries lexical variants — strip the
+  `owner/` prefix, split `_`/`-` into words, try the bare significant word — so a
+  wrapper that exists under its human name (e.g. `integron finder` for an
+  `integron_finder` token) is found rather than mistaken for absent. A `miss`
+  that an obvious name-variant would have turned into a hit is a failure.
diff --git a/content/molds/discover-shed-tool/index.md b/content/molds/discover-shed-tool/index.md
index 052393f..1d6706f 100644
--- a/content/molds/discover-shed-tool/index.md
+++ b/content/molds/discover-shed-tool/index.md
@@ -122,6 +122,8 @@ gxwf tool-search "<keywords>" --json --max-results 10
 
 If an owner hint is present, add `--owner <owner>`. If an exact-name hint is present, add `--match-name`. Lowercase the query (the tool index does not lowercase, see [[component-tool-shed-search]] §6).
 
+**Normalize a tool-id-shaped need before searching.** A caller (e.g. a template `_plan_context` that guessed a candidate) may hand this Mold an XML-id token rather than a human name — `iuc/integron_finder`, `integron_finder`. The lexical index does **not** reliably match the underscored token: `tool-search integron_finder` returns no hits while `integron finder` and `integron` both score. So derive query variants instead of feeding the token verbatim: strip any `owner/` prefix, split on `_` / `-` into space-separated words, and also try the bare significant word. A `miss` is only honest after the name variants have been tried — a no-hit on the raw underscored token alone is a search artifact, not evidence the tool is absent.
+
 ### 2. Triage hits
 
 For each hit, score on:
@@ -166,7 +168,7 @@ Validate the recommendation with `validate-galaxy-tool-discovery` before returni
 The procedure assumes — and the cast skill must surface in its rationale when relevant — the following Tool Shed realities (full detail in [[component-tool-shed-search]] §6):
 
 - **Indexes are stale by design.** A freshly published tool may not appear; a deprecated tool may still appear. Treat absence as soft evidence, not proof.
-- **Wildcard `*term*` wrapping** disables stemming; spelling matters. Try alternate phrasings before declaring `miss`.
+- **Wildcard `*term*` wrapping** disables stemming; spelling matters. Try alternate phrasings before declaring `miss`. In particular an underscored or owner-prefixed **tool-id token** (`integron_finder`, `iuc/integron_finder`) can score zero where the **human name** (`integron finder`) hits — never declare `miss` on a single id-token query (see §1 normalization).
 - **No EDAM in shed search** — semantic queries that work in Galaxy's installed-toolbox search will not work here. Stick to lexical name/keyword queries.
 - **Same XML id across repos.** Hits collapse only on `(repoName, owner)`; expect duplicates that need triage.
 - **Repo-level discovery is a different surface.** For "find me the *package* that contains a tool about X" with server-side `owner:` / `category:` keywords, `gxwf repo-search` is the right command — out of scope for this Mold but a known sibling.
diff --git a/content/molds/discover-shed-tool/refinements/2026-06-17-tool-id-query-normalization.md b/content/molds/discover-shed-tool/refinements/2026-06-17-tool-id-query-normalization.md
new file mode 100644
index 0000000..9880f96
--- /dev/null
+++ b/content/molds/discover-shed-tool/refinements/2026-06-17-tool-id-query-normalization.md
@@ -0,0 +1,42 @@
+---
+mold: discover-shed-tool
+date: 2026-06-17
+intent: Close the discover-query false-negative surfaced by the UC1 MRSA interview-to-galaxy test-pipeline re-run.
+decision: eval-add
+---
+
+## Finding
+
+During a `/test-pipeline interview-to-galaxy` run, the per-step loop's deferred
+`integron_finder` step carried a `_plan_context` candidate of `iuc/integron_finder`.
+`gxwf tool-search integron_finder` (the verbatim tool-id token) returns
+**`No hits`**, while `integron finder` (34.60), `integronfinder` (15.97), and
+`integron` (51.33) all score and resolve to the real `iuc/integron_finder`
+wrapper (changeset `5429646e486d`, 2.0.5+galaxy1). ISEScan masked the class of
+bug because its bare token (`isescan`) happens to score.
+
+A discover-shed-tool run that fed the raw underscored/owner-prefixed token into
+`tool-search` would therefore emit `miss` and trigger the `discover-or-author`
+fall-through to author a wrapper for a tool that **already exists** — the
+expensive wrong branch, and a false `miss` that passes every static check.
+
+## Change applied
+
+- **Procedure §1 (Search):** added a "normalize a tool-id-shaped need before
+  searching" step — strip `owner/` prefix, split `_`/`-` into words, try the
+  human name and the bare significant word; a `miss` is only honest after the
+  variants are tried.
+- **Caveats:** strengthened the "try alternate phrasings" bullet with the
+  concrete id-token-vs-human-name failure mode.
+- **eval.md:** added property *"a miss survives query-variant normalization"* —
+  a `miss` that an obvious name-variant would have turned into a hit is a failure.
+- Re-cast the Claude bundle (verify clean, no drift).
+
+## Open
+
+- The deeper contract question is whether the **template** Mold should write a
+  human-readable tool name into `_plan_context` (not only an id-token guess), so
+  the discover step gets a fair query by construction. Tracked in the sibling
+  refinement on [[advance-galaxy-draft-step]] (the loop that constructs the
+  discover query from `_plan_context`). This entry fixes the discovery side;
+  the upstream-hint side stays open.