Merged
31 changes: 17 additions & 14 deletions AGENTS.md
@@ -24,9 +24,12 @@ Every repo-managed skill must include its own `evals/evals.json` file at `skills
- Treat this as a required artifact for every first-party skill in this repo
- Run evals **per skill**, not as one shared repo-level eval file
- Run evals from a temp workspace such as `$env:TEMP/<skill-name>-workspace/`, never from inside this repository
- When creating or modifying a repo-managed skill, run both `with_skill` and `without_skill` comparison executions from that temp workspace before the work is considered complete
- For a brand-new skill, the baseline is `without_skill`; for an existing skill, use either `without_skill` or the previous/original skill version as the baseline, matching the `skill-creator` benchmark flow
- Generate the human-review artifacts too: aggregate the comparison into `benchmark.json` and launch `eval-viewer/generate_review.py` from the installed Anthropic `skill-creator` copy (typically under `~/.agents/skills/skill-creator/` or `~/.claude/skills/skill-creator/`) so the user can inspect `Outputs` and `Benchmark` before sign-off
- Deterministic scaffold/template skills must keep local deterministic validators as well; evals supplement validators, they do not replace them

If you add a new skill or modify an existing repo-managed skill, update that skill's `evals/evals.json` before considering the work complete. Do not commit temp workspaces, benchmark outputs, or generated review files into this repository unless the user explicitly asks for checked-in artifacts.
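The skill-creator path resolution mentioned above can be sketched in PowerShell. The two install paths come from this document; the variable names and the error message are illustrative only:

```powershell
# Sketch only: probe the two documented install locations for skill-creator.
# Variable names are illustrative; the paths are the ones this repo documents.
$candidates = @(
    (Join-Path $HOME '.agents/skills/skill-creator'),
    (Join-Path $HOME '.claude/skills/skill-creator')
)
$skillCreator = $candidates | Where-Object { Test-Path $_ } | Select-Object -First 1
if (-not $skillCreator) { throw 'No installed skill-creator copy found.' }
```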

## Git Identity

@@ -138,19 +141,19 @@ Native input widgets are a **host/runtime feature**, not a guaranteed model capability
- Do not switch interaction styles mid-collection unless the host explicitly upgrades from plain text to native controls
- Favor consistency and low-friction UX over conversational variety during parameter collection

`FORMS.md` defines each field with:
- **type** — `text`, `single-choice`, or `multi-choice`
- **prompt** — the question to ask
- **choices** — options for choice types
- **default** — pre-filled value (mark as Recommended)
- **required** — whether the field is mandatory

Presentation rules (enforced in every `FORMS.md`):
- Ask one field at a time — never bundle multiple questions
- Use selectable choices for `single-choice` and `multi-choice` fields — not free text
- When a default exists, present it first and append "(Recommended)"
- For `text` fields with a computed default, offer the computed value as a selectable choice alongside free text
- After all fields are collected, present a summary and ask for confirmation
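A field defined with those attributes might look like the following hypothetical entry. This is one possible shape for illustration only, not the canonical `FORMS.md` syntax; the field name and choices are invented:

```markdown
### database_engine
- type: single-choice
- prompt: Which database engine should the scaffold target?
- choices: SQLite, PostgreSQL, SQL Server
- default: SQLite (Recommended)
- required: yes
```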

This applies to all skills that collect user input, not just scaffolding skills.

27 changes: 24 additions & 3 deletions CHANGELOG.md
@@ -2,11 +2,31 @@

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.3.1] - 2026-03-19

This is a patch release that introduces two new NuGet-focused skills, a runner-agnostic skill-creator companion, and runner-agnostic benchmark tooling, alongside enhanced release automation, standardized documentation, and skill refinements.

### Added

- `git-nuget-release-notes` skill that creates or updates per-package `.nuget/{ProjectName}/PackageReleaseNotes.txt` files from git history for .NET repositories, with per-skill evals and an extracted package release-notes format reference.
- `git-nuget-readme` skill that writes package-facing NuGet READMEs from actual project metadata, git history, and source-backed capability cues, with per-skill evals and a README blueprint reference.
- `skill-creator-agnostic` skill that adds runner-agnostic guardrails on top of Anthropic's skill-creator for creating, modifying, and benchmarking skills across Codex, GitHub Copilot, Opus, and similar agents, with enforced temp-workspace isolation and valid benchmark layout.

### Changed

- Enhanced `git-keep-a-changelog` with explicit release-intent trigger words ("finalize", "ready to release", "rtr", "release") that automatically extract and use versions from branch names, streamlining release finalization without manual version input.
- Tightened `git-visual-commits` commit body repair checks to treat short prose bodies wrapped mid-sentence as verification failures that must be repaired before success is reported, with targeted eval coverage.
- Clarified that unscoped `git bot commit` requests apply to the entire worktree unless the user explicitly narrows scope, with eval coverage ensuring yolo mode still groups the full diff.
- Standardized documentation across AGENTS.md, CONTRIBUTING.md, and README.md to explicitly instruct users to resolve the installed Anthropic skill-creator path (typically under `~/.agents/skills/skill-creator/` or `~/.claude/skills/skill-creator/`) before running benchmark and review tools.
- Updated the benchmark layout specification from the `eval-N` pattern to `iteration-N/eval-name/{config}/run-N/` for clarity and consistency, with PowerShell resolver logic that probes both skill-creator install locations.
- Normalized line wrapping and formatting across all SKILL.md, FORMS.md, and references/ files for improved readability and consistent presentation, removing extra blank lines and compacting multi-line YAML descriptions.
- Normalized line wrapping in shared asset templates, including .github/copilot-instructions.md, asset CHANGELOG.md bootstrap files, and package documentation templates.
- Refreshed README catalog to reflect the current skill set and commit-behavior guidance, with updated benchmark and eval workflow documentation.

## [0.3.0] - 2026-03-17

This is a minor release that introduces two complementary git workflow skills, extracts a shared commit-language reference, and backs the whole skill suite with pull-request validation plus stricter skill metadata checks.
@@ -62,7 +82,8 @@

- Improved scaffold fidelity with hidden `.bot` asset preservation, explicit UTF-8 and BOM handling, and checks aimed at preventing mojibake or incomplete generated output.

[Unreleased]: https://github.com/codebeltnet/agentic/compare/v0.3.1...HEAD
[0.3.1]: https://github.com/codebeltnet/agentic/compare/v0.3.0...v0.3.1
[0.3.0]: https://github.com/codebeltnet/agentic/compare/v0.2.0...v0.3.0
[0.2.0]: https://github.com/codebeltnet/agentic/compare/v0.1.0...v0.2.0
[0.1.0]: https://github.com/codebeltnet/agentic/compare/7eaf364...v0.1.0
161 changes: 86 additions & 75 deletions CONTRIBUTING.md
@@ -2,31 +2,31 @@

Thanks for wanting to add or improve a skill. Here's what to know.

## Skill structure

Each skill lives in its own folder:

```
skills/
<skill-name>/
SKILL.md # Required — the skill content
FORMS.md # Optional — structured input collection for agents
assets/ # Optional — literal templates and static files
scripts/ # Optional — executable helpers
references/ # Optional — deeper docs loaded on demand
evals/ # Required for repo-managed skills — test prompts for validation
evals.json
```

## Local sync

Repo-managed skills should be mirrored across all three locations:

- `skills/<name>/` in this repo
- `~/.claude/skills/<name>/`
- `~/.agents/skills/<name>/`

If you edit a local install copy first, copy the changed files back into the repo and into the other local install so every agent sees the same skill version.

## SKILL.md format

@@ -64,9 +64,9 @@ The `description` is the most important field — it's how the AI decides to load
- Specific trigger phrases (e.g. "Use when user says 'commit this' or 'stage changes'")
- What it enforces or prevents

## Adding evals (required for repo-managed skills)

Evals let you verify the skill works and measure improvement over a baseline. Every repo-managed skill in this repository must include `evals/evals.json`:

```json
{
  ...
}
```

Aim for 3–5 evals that cover distinct scenarios: happy path, edge cases, and cases where the skill should *not* do something.

Run evals from a temp workspace, not from this repository:

```powershell
$workspace = Join-Path $env:TEMP '<skill-name>-workspace'
```
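Building on the snippet above, preparing that workspace can be sketched as follows. The directory name is a placeholder, not a real skill:

```powershell
# Sketch only: 'my-skill-workspace' is a placeholder name.
$workspace = Join-Path $env:TEMP 'my-skill-workspace'
New-Item -ItemType Directory -Path $workspace -Force | Out-Null
Set-Location $workspace   # run the with_skill / without_skill executions from here
```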

When creating or modifying a repo-managed skill, the eval workflow must include a paired comparison:

- Resolve the installed Anthropic `skill-creator` path first, usually under `~/.agents/skills/skill-creator/` or `~/.claude/skills/skill-creator/`, then run its benchmark scripts from there
- Run each eval as `with_skill`
- Run the baseline as `without_skill` for new skills
- For an existing skill, use either `without_skill` or the previous/original skill version as the baseline, following the `skill-creator` benchmark model
- Aggregate the results into `benchmark.json`
- Launch `eval-viewer/generate_review.py` from that installed `skill-creator` copy so a human can review both `Outputs` and `Benchmark`

This repo treats that paired `with_skill` / `without_skill` comparison as a required part of the developer workflow for skill work. The benchmark artifacts live in the temp workspace; do not commit them to this repository unless the change explicitly calls for checked-in examples.

For scaffold/template skills, keep deterministic validators alongside evals. In this repo, `evals/evals.json` is mandatory, and validators like `scripts/validate-skill-templates.ps1` are additional protection.

## Prefer dynamic defaults

When a skill needs defaults for versions, paths, repository names, or support windows, prefer deriving them from a reliable source instead of baking in values that will drift.

- Good sources: git metadata, repo folder names, environment values, official JSON feeds, vendor docs APIs
- Use hardcoded examples as examples only — not as the real defaulting mechanism — when the value can be computed

## Template validation

Use the repo validation harness before submitting scaffold or template changes:

```powershell
powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\validate-skill-templates.ps1
```

Run the validator locally first for the fastest feedback loop. GitHub Actions also runs the same script on pull requests, but CI is the backstop, not the primary authoring loop.

To compare a change against the initial imported version, run the same harness against a git ref:

```powershell
powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\validate-skill-templates.ps1 -Ref HEAD
```

## Checklist before submitting

- [ ] `SKILL.md` has valid front matter with `name` and `description`
- [ ] Skill is stack-agnostic (or clearly scoped to a specific tech in the name/description)
- [ ] Examples are generic — no personal emails, usernames, or project-specific identifiers
- [ ] At least one eval in `evals/evals.json`
- [ ] The skill's `evals/evals.json` exists and its `skill_name` matches the folder/frontmatter name
- [ ] Skill changes were benchmarked from a temp workspace with both `with_skill` and `without_skill` runs
- [ ] `benchmark.json` and `eval-viewer/generate_review.py` from the installed Anthropic `skill-creator` copy were used so a human could compare `Outputs` and `Benchmark`
- [ ] `scripts/validate-skill-templates.ps1` passes for the current working tree when changing scaffold or template behavior
- [ ] If CI is enabled for the branch, the GitHub Actions validation job passes too
- [ ] Skill evals are intended to run from `$env:TEMP/<skill-name>-workspace/`, not from inside the repo
- [ ] Changed skill files are synced across `skills/<name>/`, `~/.claude/skills/<name>/`, and `~/.agents/skills/<name>/`
- [ ] Skill added to the table in `README.md`