Implementation Plan: Generalize Scope Skill for Non-Code Research #794
Conversation
Trecek left a comment
AutoSkillit PR Review — Verdict: changes_requested
```python
)


def test_scope_has_no_hardcoded_metrics_rs() -> None:
```
[warning] cohesion: test_scope_has_no_hardcoded_metrics_rs and test_plan_experiment_has_no_hardcoded_metrics_rs are scattered in test_skill_genericization.py but they validate skill-specific content that is already covered by the contracts layer (test_scope_contracts.py). Hardcoded-reference checks for individual skills belong either in a dedicated per-skill contract test file or in test_scope_contracts.py alongside the other scope structural assertions, not mixed into the genericization test module.
Investigated — this is intentional. The tests/CLAUDE.md placement convention defines tests/skills/ as covering 'skill contract and compliance tests' and test_skill_genericization.py (line 1 docstring) is explicitly for verifying SKILL.md files contain no project-specific AutoSkillit internals. All existing tests (REQ-GEN-001 through REQ-GEN-004) follow the identical pattern: checking SKILL.md content for forbidden strings. The new tests extend REQ-GEN-005 and belong here. test_scope_contracts.py covers structural section-layout assertions for scope, not cross-cutting forbidden-string regression guards — there is no overlap.
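As a concrete illustration of the forbidden-string pattern both sides are describing, the check reduces to a pure helper; this is a minimal sketch, and the function name and forbidden list are illustrative, not the PR's actual code:

```python
def forbidden_hardcoded_refs(skill_md: str, forbidden: tuple[str, ...]) -> list[str]:
    """Return every forbidden project-specific identifier found in a SKILL.md body."""
    return [needle for needle in forbidden if needle in skill_md]

# A genericization test then asserts the helper finds nothing:
hits = forbidden_hardcoded_refs(
    "Search for the project's evaluation framework generically.",
    ("src/metrics.rs", "test_metrics_assess"),
)
assert hits == [], f"SKILL.md hardcodes {hits!r} (REQ-GEN-005)"
```

Keeping the search logic in one helper also keeps the per-skill tests down to a single assertion each.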
```python
        "scope/SKILL.md hardcodes 'src/metrics.rs'. "
        "Use generic evaluation framework search (REQ-GEN-005)."
    )
    assert "test_metrics_assess" not in content, (
```
[info] tests: test_scope_has_no_hardcoded_metrics_rs asserts that 'test_metrics_assess' is absent from scope/SKILL.md, but there is no corresponding assertion for 'test_metrics_assess' in test_plan_experiment_has_no_hardcoded_metrics_rs. If 'test_metrics_assess' is a forbidden hardcoded identifier (REQ-GEN-005), the coverage is inconsistent across the two tests.
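One hypothetical way to make the drift this comment describes mechanically detectable: derive each test's forbidden list from the union across all tests and report what any single test omits. The test names below mirror the review; the helper itself is invented for illustration:

```python
def coverage_gaps(forbidden_by_test: dict[str, set[str]]) -> dict[str, set[str]]:
    """For each test, report forbidden identifiers it omits relative to the union."""
    union: set[str] = set().union(*forbidden_by_test.values())
    return {name: union - refs
            for name, refs in forbidden_by_test.items()
            if union - refs}

gaps = coverage_gaps({
    "test_scope_has_no_hardcoded_metrics_rs": {"src/metrics.rs", "test_metrics_assess"},
    "test_plan_experiment_has_no_hardcoded_metrics_rs": {"src/metrics.rs"},
})
# gaps pinpoints the plan-experiment test as the one missing "test_metrics_assess"
```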
```diff
     )

-    def test_section_between_technical_context_and_hypotheses(self) -> None:
+    def test_section_between_domain_context_and_hypotheses(self) -> None:
```
[info] tests: test_section_between_domain_context_and_hypotheses relies on str.index() which raises ValueError (not AssertionError) if a section heading is missing. A missing '## Domain Context', '## Computational Complexity', or '## Hypotheses' heading would produce an unhandled exception rather than a clear test failure message, making failures harder to diagnose.
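A minimal sketch of the suggested fix, assuming pytest-style assertions: use `str.find()` plus an `assert` so a missing heading fails with a readable message instead of an unhandled `ValueError` (the helper name is illustrative):

```python
def heading_positions(content: str, headings: list[str]) -> list[int]:
    """Locate each heading, failing with a clear message when one is absent."""
    positions = []
    for heading in headings:
        idx = content.find(heading)  # find() returns -1 rather than raising
        assert idx != -1, f"required section heading missing: {heading!r}"
        positions.append(idx)
    return positions

report = "## Domain Context\n...\n## Computational Complexity\n...\n## Hypotheses\n"
pos = heading_positions(report, ["## Domain Context", "## Computational Complexity", "## Hypotheses"])
assert pos == sorted(pos), "section headings out of order"
```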
```diff
-{Which canonical metrics from src/metrics.rs apply to this research question.
-List each metric name, quality dimension (Accuracy/Parity/Performance), and
-current threshold value. Note any gaps where no canonical metric exists.}
+## Metric Context *(include only when an evaluation framework was found)*
```
[warning] cohesion: '## Metric Context' in scope output template is conditionally emitted (include only when evaluation framework found), but plan-experiment SKILL.md always emits a metrics table and WARNING for NEW metrics. The two skills handle evaluation-framework absence asymmetrically — scope silently omits the section while plan-experiment always renders it. This will cause confusion when both skills are composed in the same workflow.
Investigated — this is intentional. The asymmetry serves different pipeline stages: scope is an early-discovery step where omitting an empty Metric Context section is correct (commit 502d2d9 explicitly lists 'make Metric Context conditional' as a design goal). Plan-experiment always renders a Dependent Variables table because an experiment must define what it measures — using 'NEW' as the canonical name handles the no-framework case gracefully. Plan-experiment Subagent A explicitly instructs 'Cross-reference against the scope report's Metric Context section if present; if absent, proceed without it and note the gap', demonstrating deliberate awareness of the conditional handoff between the two skills.
```markdown
> specific structures, relationships, mechanisms, and processes that are central to
> the research question.

**[EVALUATION FRAMEWORK — Metrics or Assessment]**
```
[info] cohesion: Subagent menu label '[EVALUATION FRAMEWORK — Metrics or Assessment]' differs from the output section it populates ('## Metric Context'). All other menu entries map consistently (e.g. '[PRIOR ART …]' → '## Prior Art', '[DOMAIN CONTEXT …]' → '## Domain Context'), so this mismatch breaks traceability symmetry.
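If traceability symmetry is worth enforcing, a small guard could check that each menu label's leading words match the section title it populates. The full label strings below are invented for illustration (the comment truncates them); only the EVALUATION FRAMEWORK label is quoted from the review:

```python
def mismatched_labels(menu_to_section: dict[str, str]) -> list[str]:
    """Menu labels whose leading words differ from the section they populate."""
    bad = []
    for label, section in menu_to_section.items():
        label_key = label.strip("[]").split(" — ")[0].title()
        if label_key != section.lstrip("# ").title():
            bad.append(label)
    return bad

MENU_TO_SECTION = {
    "[PRIOR ART — Literature or Codebase]": "## Prior Art",               # hypothetical full label
    "[DOMAIN CONTEXT — Structures and Mechanisms]": "## Domain Context",  # hypothetical full label
    "[EVALUATION FRAMEWORK — Metrics or Assessment]": "## Metric Context",
}
# only the EVALUATION FRAMEWORK entry fails the symmetry check
```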
Trecek left a comment
AutoSkillit review found 2 blocking issues. See inline comments.
Verdict: changes_requested
Actionable (warning, clear fix):

- `tests/skills/test_skill_genericization.py:72` [cohesion] — new hardcoded-reference tests belong in contracts layer, not genericization module
- `tests/skills/test_skill_genericization.py:74` [tests] — duplicate inline `skill_dir` path computation across both new tests (brittle path)

Needs decision (warning, ambiguous intent):

- `src/autoskillit/skills_extended/scope/SKILL.md:172` [cohesion] — scope omits `## Metric Context` when no framework found, but plan-experiment always renders it; asymmetric behavior needs alignment

Info only:

- `tests/skills/test_skill_genericization.py:80` [tests] — missing `test_metrics_assess` assertion in plan-experiment test
- `tests/contracts/test_scope_contracts.py:41` [tests] — `str.index()` raises ValueError, not AssertionError, on missing headings
- `src/autoskillit/skills_extended/scope/SKILL.md:96` [cohesion] — EVALUATION FRAMEWORK menu label → Metric Context output section name mismatch
Force-pushed 6817f65 to 8950095 (Compare)
…ode research
Replace scope's fixed 5-subagent list with a suggested menu (≥5 required), rename software-centric report sections to domain-agnostic equivalents (Technical Context → Domain Context, Prior Art in Codebase → Prior Art, make Metric Context conditional), remove all src/metrics.rs hardcoding from both scope and plan-experiment skills, and add regression guards to test_skill_genericization.py enforcing REQ-GEN-005.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… inline path duplication
Removes the duplicated Path(__file__).parent.parent.parent / 'src/autoskillit/skills_extended' expression in both new tests, replacing it with a module-level SKILLS_EXTENDED_DIR constant following the existing SKILLS_DIR pattern. Addresses reviewer comment #3071115612 (REQ-GEN-005).
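The refactor this commit describes follows a common pytest module pattern; a sketch with the names taken from the commit message (the directory layout is assumed):

```python
from pathlib import Path

# Module-level constant replaces the duplicated inline expression in both tests,
# mirroring the existing SKILLS_DIR pattern.
SKILLS_EXTENDED_DIR = (
    Path(__file__).parent.parent.parent / "src" / "autoskillit" / "skills_extended"
)

def skill_md(skill: str) -> Path:
    """Resolve a bundled extended skill's SKILL.md through the shared constant."""
    return SKILLS_EXTENDED_DIR / skill / "SKILL.md"
```

A single constant means a future layout change touches one line instead of every test.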
Force-pushed 8950095 to 58a5cd7 (Compare)
Summary
The `scope` and `plan-experiment` skills are software-centric in three ways: (1) `scope` has a fixed mandatory list of 5 subagents where 4 assume a software codebase, (2) both skills hardcode `src/metrics.rs` paths, and (3) `scope`'s report template section names and Known/Unknown matrix rows assume software context. This plan replaces the fixed subagent list with a suggested menu (agent selects ≥5), removes all `src/metrics.rs` hardcoding, renames report sections to domain-agnostic language, and makes Metric Context conditional. One test contract assertion must be updated to reflect the section rename.
Architecture Impact
Scenarios Diagram
```mermaid
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 55, 'curve': 'basis'}}}%%
flowchart LR
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    subgraph S1 ["SCENARIO 1: Code Research — Recipe Happy Path"]
        direction LR
        S1_RECIPE["research recipe<br/>━━━━━━━━━━<br/>orchestrates all phases<br/>scope → plan → review"]
        S1_SCOPE["● scope/SKILL.md<br/>━━━━━━━━━━<br/>parse question<br/>fetch GitHub issue<br/>≥5 parallel subagents"]
        S1_REPORT["scope_report<br/>━━━━━━━━━━<br/>known/unknown matrix<br/>captured as context token"]
        S1_PLAN["● plan-experiment/SKILL.md<br/>━━━━━━━━━━<br/>feasibility subagents A/B/C<br/>YAML frontmatter V1–V9"]
        S1_EXPPLAN["experiment_plan<br/>━━━━━━━━━━<br/>research design<br/>+ implementation phases"]
        S1_REVIEW["review-design<br/>━━━━━━━━━━<br/>verdict: GO / REVISE / STOP"]
    end

    subgraph S2 ["SCENARIO 2: Non-Code Domain Research (Generalized)"]
        direction LR
        S2_USER["domain question<br/>━━━━━━━━━━<br/>e.g. biology / chemistry<br/>social science"]
        S2_SCOPE["● scope/SKILL.md<br/>━━━━━━━━━━<br/>generic subagent menu<br/>domain-aware branches"]
        S2_SUBAGENTS["parallel subagents<br/>━━━━━━━━━━<br/>Prior Art (literature)<br/>Domain Context (structures)<br/>Eval Framework (scales/rubrics)<br/>Data Availability<br/>Complexity"]
        S2_SYNTH["synthesis<br/>━━━━━━━━━━<br/>scope_report written to<br/>temp/scope/"]
        S2_PLAN["● plan-experiment/SKILL.md<br/>━━━━━━━━━━<br/>generic measurement feasibility<br/>no hardcoded metrics.rs path"]
        S2_ARTIFACT["experiment_plan<br/>━━━━━━━━━━<br/>domain-agnostic design<br/>+ data_manifest"]
    end

    subgraph S3 ["SCENARIO 3: Design Revision Loop"]
        direction LR
        S3_REVIEW["review-design<br/>━━━━━━━━━━<br/>verdict = REVISE"]
        S3_GUIDANCE["revision_guidance<br/>━━━━━━━━━━<br/>feedback path token<br/>2nd positional arg"]
        S3_PLAN["● plan-experiment/SKILL.md<br/>━━━━━━━━━━<br/>reads 2nd path token<br/>incorporates feedback<br/>re-runs frontmatter V1–V9"]
        S3_NEW["revised experiment_plan<br/>━━━━━━━━━━<br/>retries ≤ 2"]
    end

    subgraph S4 ["SCENARIO 4: Contract Test Enforcement (CI Gate)"]
        direction LR
        S4_PYTEST["task test-check<br/>━━━━━━━━━━<br/>pytest -n4 --dist worksteal"]
        S4_CONTRACTS["● test_scope_contracts.py<br/>━━━━━━━━━━<br/>Computational Complexity<br/>section present + structured<br/>baseline instruction present"]
        S4_GENERIC["● test_skill_genericization.py<br/>━━━━━━━━━━<br/>no src/metrics.rs<br/>no test_metrics_assess<br/>no AutoSkillit-internal paths"]
        S4_SKILL["● scope/SKILL.md<br/>● plan-experiment/SKILL.md<br/>━━━━━━━━━━<br/>read via pkg_root()"]
        S4_PASS(["CI: PASS / FAIL"])
    end

    %% SCENARIO 1 FLOW %%
    S1_RECIPE -->|"triggers scope step"| S1_SCOPE
    S1_SCOPE -->|"writes"| S1_REPORT
    S1_REPORT -->|"$context.scope_report"| S1_PLAN
    S1_PLAN -->|"writes"| S1_EXPPLAN
    S1_EXPPLAN -->|"feeds"| S1_REVIEW

    %% SCENARIO 2 FLOW %%
    S2_USER -->|"invokes"| S2_SCOPE
    S2_SCOPE -->|"launches ≥5"| S2_SUBAGENTS
    S2_SUBAGENTS -->|"consolidates into"| S2_SYNTH
    S2_SYNTH -->|"scope_report path"| S2_PLAN
    S2_PLAN -->|"writes"| S2_ARTIFACT

    %% SCENARIO 3 FLOW %%
    S3_REVIEW -->|"emits"| S3_GUIDANCE
    S3_GUIDANCE -->|"2nd path token"| S3_PLAN
    S3_PLAN -->|"revised output"| S3_NEW
    S3_NEW -.->|"re-review (≤2x)"| S3_REVIEW

    %% SCENARIO 4 FLOW %%
    S4_PYTEST -->|"runs"| S4_CONTRACTS
    S4_PYTEST -->|"runs"| S4_GENERIC
    S4_CONTRACTS -->|"reads via pkg_root()"| S4_SKILL
    S4_GENERIC -->|"reads via pkg_root()"| S4_SKILL
    S4_CONTRACTS -->|"asserts"| S4_PASS
    S4_GENERIC -->|"asserts"| S4_PASS

    %% CLASS ASSIGNMENTS %%
    class S1_RECIPE,S2_USER,S4_PYTEST cli;
    class S1_REPORT,S2_SYNTH,S2_ARTIFACT,S3_GUIDANCE,S3_NEW,S1_EXPPLAN stateNode;
    class S1_SCOPE,S2_SCOPE,S1_PLAN,S2_PLAN,S3_PLAN,S3_REVIEW,S1_REVIEW handler;
    class S2_SUBAGENTS phase;
    class S4_CONTRACTS,S4_GENERIC,S4_SKILL output;
    class S4_PASS terminal;
```

State Lifecycle Diagram
```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;

    START([INVOKE])

    subgraph Inputs ["INIT_ONLY — Set Once, Never Modified"]
        direction LR
        RQ["● research_question<br/>━━━━━━━━━━<br/>scope: free-text, issue ref,<br/>or domain topic<br/>[INIT_ONLY]"]
        SRP["scope_report_path<br/>━━━━━━━━━━<br/>plan-experiment: required<br/>path to scope output<br/>[INIT_ONLY]"]
    end

    subgraph ResumeTiers ["RESUME DETECTION — INIT_PRESERVE Gate"]
        direction TB
        RevGuid["revision_guidance path<br/>━━━━━━━━━━<br/>INIT_PRESERVE: read once,<br/>absent OR non-existent<br/>→ FIRST PASS<br/>present + exists → REVISE"]
        T1["Tier 1: Explicit<br/>━━━━━━━━━━<br/>revision_guidance present<br/>AND file exists<br/>→ REVISE mode"]
        T2["Tier 2: Default<br/>━━━━━━━━━━<br/>absent / empty / missing<br/>→ FIRST PASS mode"]
    end

    subgraph ScopePhase ["● scope/SKILL.md — Parallel Exploration (≥5 subagents)"]
        direction TB
        S1["● Prior Art / Literature<br/>━━━━━━━━━━<br/>codebase or domain survey<br/>[generic: any domain]"]
        S2["● External Research<br/>━━━━━━━━━━<br/>web search — tools,<br/>methods, papers"]
        S3["● Domain Context<br/>━━━━━━━━━━<br/>software arch OR domain<br/>structures/mechanisms<br/>[generalized]"]
        S4["● Evaluation Framework<br/>━━━━━━━━━━<br/>metrics/rubrics/scales;<br/>explicit absent-flag<br/>if none found"]
        S5["● Computational Complexity<br/>━━━━━━━━━━<br/>dominant op, scaling,<br/>bottlenecks, gotchas<br/>[conditional — skip if N/A]"]
    end

    subgraph ScopeOutput ["DERIVED — scope Output Artifact"]
        SR["scope_report.md<br/>━━━━━━━━━━<br/>Known/Unknown Matrix<br/>Hypotheses, Directions<br/>Metric Context (if found)<br/>[write-once token: scope_report]"]
    end

    subgraph MutablePlan ["MUTABLE — Frontmatter Fields (assembled in plan-experiment Step 3)"]
        direction LR
        ET["experiment_type<br/>━━━━━━━━━━<br/>benchmark | config_study |<br/>causal_inference |<br/>robustness_audit | exploratory"]
        EST["estimand<br/>━━━━━━━━━━<br/>treatment, outcome,<br/>population, contrast"]
        MET["metrics[]<br/>━━━━━━━━━━<br/>name, unit, canonical_name,<br/>collection_method, threshold,<br/>direction, primary"]
        BAS["baselines[]<br/>━━━━━━━━━━<br/>name, version, tuning_budget<br/>[required: benchmark/causal]"]
        SP["statistical_plan<br/>━━━━━━━━━━<br/>test, alpha, power_target,<br/>correction_method, MDE<br/>[waived: exploratory]"]
        ENV["environment<br/>━━━━━━━━━━<br/>type: standard | custom<br/>spec_path (when custom)"]
        SC["success_criteria<br/>━━━━━━━━━━<br/>conclusive_positive,<br/>conclusive_negative,<br/>inconclusive"]
        DM["● data_manifest[]<br/>━━━━━━━━━━<br/>hypothesis, source_type,<br/>description, acquisition,<br/>location, verification<br/>[generalized: any domain]"]
    end

    subgraph ValidationGates ["VALIDATION GATES — Applied in Order Before Frontmatter Write"]
        direction TB
        V1["V1 — baselines required<br/>━━━━━━━━━━<br/>benchmark/causal_inference<br/>→ len(baselines)≥1 + version<br/>[ERROR: abort frontmatter]"]
        V2["V2 — contrast required<br/>━━━━━━━━━━<br/>causal_inference<br/>→ estimand.contrast not null<br/>[ERROR: abort frontmatter]"]
        V3["V3 — statistical_plan<br/>━━━━━━━━━━<br/>!exploratory<br/>→ plan present, test not null<br/>[ERROR: abort frontmatter]"]
        V4["V4 — spec_path required<br/>━━━━━━━━━━<br/>environment.type=custom<br/>→ spec_path not null<br/>[ERROR: abort frontmatter]"]
        V5["V5 — primary metric<br/>━━━━━━━━━━<br/>len(metrics)≥2<br/>→ exactly one primary:true<br/>[WARNING: YAML comment]"]
        V6["V6 — NEW metrics<br/>━━━━━━━━━━<br/>any canonical_name=NEW<br/>→ flag unregistered metric<br/>[WARNING: YAML comment]"]
        V7["V7 — H1 threshold<br/>━━━━━━━━━━<br/>hypothesis_h1<br/>→ must have numeric threshold<br/>[WARNING: YAML comment]"]
        V8["V8 — criteria→metrics link<br/>━━━━━━━━━━<br/>conclusive_positive<br/>→ references ≥1 metric.name<br/>[WARNING: YAML comment]"]
        V9["V9 — data_manifest complete<br/>━━━━━━━━━━<br/>every hypothesis has entry;<br/>external has location+depends_on<br/>[ERROR: abort frontmatter]"]
    end

    subgraph ErrorAccum ["APPEND_ONLY — Error & Warning Accumulation"]
        direction LR
        ERRACC["## Frontmatter Validation Errors<br/>━━━━━━━━━━<br/>V1–V4, V9 ERRORs appended;<br/>NEVER overwritten;<br/>frontmatter OMITTED on any ERROR"]
        WARNACC["# WARNING: YAML comments<br/>━━━━━━━━━━<br/>V5–V8 inline on field lines;<br/>frontmatter still written;<br/>APPEND_ONLY per field"]
    end

    subgraph PlanOutput ["DERIVED — plan-experiment Output Artifact"]
        direction TB
        FMOUT["● experiment_plan.md (with frontmatter)<br/>━━━━━━━━━━<br/>YAML frontmatter + prose plan<br/>written to AUTOSKILLIT_TEMP/<br/>plan-experiment/<br/>[write-once token: experiment_plan]"]
        FMERR["experiment_plan.md (error-only)<br/>━━━━━━━━━━<br/>Prose plan ONLY<br/>+ ## Frontmatter Validation Errors<br/>[on V1–V4 or V9 failure]"]
    end

    subgraph ContractTests ["● Contract Tests — Validation Gates on Skill Content"]
        direction LR
        CT1["● test_scope_contracts.py<br/>━━━━━━━━━━<br/>Asserts Computational Complexity<br/>section exists with all 4 fields;<br/>baseline computation instruction<br/>present [APPEND_ONLY guard]"]
        CT2["● test_skill_genericization.py<br/>━━━━━━━━━━<br/>Asserts no src/metrics.rs,<br/>no test_metrics_assess,<br/>no AutoSkillit-specific paths<br/>[INIT_ONLY content guard]"]
    end

    %% FLOW %%
    START --> RQ
    START --> SRP
    START --> RevGuid
    RevGuid --> T1
    RevGuid --> T2
    RQ --> S1 & S2 & S3 & S4 & S5
    S1 & S2 & S3 & S4 & S5 --> SR
    T1 --> MutablePlan
    T2 --> MutablePlan
    SRP --> MutablePlan
    SR --> MutablePlan
    ET & EST & MET & BAS & SP & ENV & SC & DM --> V1
    V1 --> V2 --> V3 --> V4 --> V9
    V4 --> V5
    V9 --> V5
    V5 --> V6 --> V7 --> V8
    V1 & V2 & V3 & V4 & V9 -->|"ERROR"| ERRACC
    V5 & V6 & V7 & V8 -->|"WARNING"| WARNACC
    ERRACC --> FMERR
    WARNACC --> FMOUT
    V8 -->|"PASS: all gates"| FMOUT
    SR --> CT1
    ET & MET & DM --> CT2

    %% CLASS ASSIGNMENTS %%
    class START terminal;
    class RQ,SRP detector;
    class RevGuid gap;
    class T1,T2 cli;
    class S1,S2,S3,S4,S5 phase;
    class SR output;
    class ET,EST,MET,BAS,SP,ENV,SC,DM stateNode;
    class V1,V2,V3,V4,V9 detector;
    class V5,V6,V7,V8 gap;
    class ERRACC,WARNACC handler;
    class FMOUT,FMERR output;
    class CT1,CT2 newComponent;
```

Module Dependency Diagram
```mermaid
%%{init: {'flowchart': {'nodeSpacing': 55, 'rankSpacing': 75, 'curve': 'basis'}}}%%
graph TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
    classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    %% ─── SKILL ASSETS ─── %%
    subgraph SkillAssets ["SKILL ASSETS (skills_extended/)"]
        direction LR
        SCOPE["● scope/SKILL.md<br/>━━━━━━━━━━<br/>Research scoping<br/>Generalized: no-code target<br/>+Computational Complexity §<br/>+Known/Unknown matrix"]
        PLAN["● plan-experiment/SKILL.md<br/>━━━━━━━━━━<br/>Experiment plan generator<br/>Generalized: non-code research<br/>YAML frontmatter extraction<br/>V1-V9 validation rules"]
    end

    %% ─── L0: CORE ─── %%
    subgraph L0 ["L0 — CORE (zero autoskillit imports)"]
        direction LR
        CORE["autoskillit.core<br/>━━━━━━━━━━<br/>pkg_root()<br/>io, types, paths<br/>Fan-in: ~104 files"]
    end

    %% ─── L1: WORKSPACE ─── %%
    subgraph L1 ["L1 — WORKSPACE"]
        direction LR
        RESOLVER["workspace/skills.py<br/>━━━━━━━━━━<br/>DefaultSkillResolver<br/>list_all() → scans skills/ + skills_extended/<br/>resolve(name) → SKILL.md path<br/>Source: BUNDLED_EXTENDED"]
    end

    %% ─── L2: RECIPE VALIDATION ─── %%
    subgraph L2 ["L2 — RECIPE VALIDATION"]
        direction LR
        RSCONTENT["recipe/rules_skill_content.py<br/>━━━━━━━━━━<br/>@semantic_rule validators<br/>no-autoskillit-import-in-skill<br/>output-section-no-markdown-directive<br/>hardcoded-origin-remote"]
        RSSKILLS["recipe/rules_skills.py<br/>━━━━━━━━━━<br/>@semantic_rule validators<br/>unknown-skill-command<br/>subset-disabled-skill"]
        RECIPE["recipes/research.yaml<br/>━━━━━━━━━━<br/>run_skill: scope (phase 0)<br/>run_skill: plan-experiment (phase 1)<br/>Chains the two modified skills"]
    end

    %% ─── TESTS ─── %%
    subgraph Tests ["TESTS"]
        direction LR
        TCONTRACTS["● test_scope_contracts.py<br/>━━━━━━━━━━<br/>TestComputationalComplexitySection<br/>5 tests: section exists, 4 fields<br/>present, ordering, regex checks<br/>Imports: autoskillit.core.pkg_root"]
        TGENERIC["● test_skill_genericization.py<br/>━━━━━━━━━━<br/>7 tests: no hardcoded paths<br/>no project-specific metrics<br/>no internal gate references<br/>Imports: pathlib only (stdlib)"]
    end

    %% ─── DEPENDENCY EDGES ─── %%
    %% Core provides pkg_root to workspace and tests
    CORE -->|"pkg_root()"| RESOLVER
    CORE -->|"pkg_root() import"| TCONTRACTS
    %% Workspace discovers skill assets from filesystem
    RESOLVER -->|"scans filesystem<br/>BUNDLED_EXTENDED"| SCOPE
    RESOLVER -->|"scans filesystem<br/>BUNDLED_EXTENDED"| PLAN
    %% Recipe validation uses workspace (deferred) to resolve skills
    RSCONTENT -.->|"deferred import<br/>DefaultSkillResolver"| RESOLVER
    RSSKILLS -.->|"deferred import<br/>DefaultSkillResolver.list_all()"| RESOLVER
    %% Recipe validation reads skill SKILL.md content (via resolver)
    RSCONTENT -->|"reads SKILL.md content<br/>to apply rules"| SCOPE
    RSCONTENT -->|"reads SKILL.md content<br/>to apply rules"| PLAN
    %% research.yaml references the two skills
    RECIPE -->|"run_skill: scope"| SCOPE
    RECIPE -->|"run_skill: plan-experiment"| PLAN
    %% rules_skills validates recipe skill references
    RSSKILLS -->|"validates skill names<br/>in recipe steps"| RECIPE
    %% Tests read SKILL.md files directly (pathlib / pkg_root)
    TCONTRACTS -->|"reads SKILL.md<br/>via pkg_root()"| SCOPE
    TGENERIC -->|"reads SKILL.md<br/>via pathlib"| SCOPE
    TGENERIC -->|"reads SKILL.md<br/>via pathlib"| PLAN
    %% recipe validation rules import from core
    RSCONTENT -->|"imports from"| CORE
    RSSKILLS -->|"imports from"| CORE

    %% CLASS ASSIGNMENTS %%
    class CORE stateNode;
    class RESOLVER handler;
    class RSCONTENT,RSSKILLS phase;
    class RECIPE output;
    class SCOPE,PLAN gap;
    class TCONTRACTS,TGENERIC detector;
```

Closes #784
Implementation Plan
Plan file:
`/home/talon/projects/autoskillit-runs/impl-784-20260412-214908-319157/.autoskillit/temp/make-plan/generalize_scope_skill_plan_2026-04-12_000000.md`

🤖 Generated with Claude Code via AutoSkillit