Merged
31 changes: 17 additions & 14 deletions AGENTS.md
@@ -24,9 +24,12 @@ Every repo-managed skill must include its own `evals/evals.json` file at `skills
- Treat this as a required artifact for every first-party skill in this repo
- Run evals **per skill**, not as one shared repo-level eval file
- Run evals from a temp workspace such as `$env:TEMP/<skill-name>-workspace/`, never from inside this repository
- When creating or modifying a repo-managed skill, run both `with_skill` and `without_skill` comparison executions from that temp workspace before the work is considered complete
- For a brand-new skill, the baseline is `without_skill`; for an existing skill, use either `without_skill` or the previous/original skill version as the baseline, matching the `skill-creator` benchmark flow
- Generate the human-review artifacts too: aggregate the comparison into `benchmark.json` and launch `eval-viewer/generate_review.py` from the installed Anthropic `skill-creator` copy (typically under `~/.agents/skills/skill-creator/` or `~/.claude/skills/skill-creator/`) so the user can inspect `Outputs` and `Benchmark` before sign-off
- Deterministic scaffold/template skills must keep local deterministic validators as well; evals supplement validators, they do not replace them

If you add a new skill or modify an existing repo-managed skill, update that skill's `evals/evals.json` before considering the work complete. Do not commit temp workspaces, benchmark outputs, or generated review files into this repository unless the user explicitly asks for checked-in artifacts.
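The skill-creator path resolution mentioned above can be sketched in PowerShell. The two install paths come from this document; the variable names and the error message are illustrative only:

```powershell
# Sketch only: probe the two documented install locations for skill-creator.
# Variable names are illustrative; the paths are the ones this repo documents.
$candidates = @(
    (Join-Path $HOME '.agents/skills/skill-creator'),
    (Join-Path $HOME '.claude/skills/skill-creator')
)
$skillCreator = $candidates | Where-Object { Test-Path $_ } | Select-Object -First 1
if (-not $skillCreator) { throw 'No installed skill-creator copy found.' }
```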

## Git Identity

@@ -138,19 +141,19 @@ Native input widgets are a **host/runtime feature**, not a guaranteed model capability
- Do not switch interaction styles mid-collection unless the host explicitly upgrades from plain text to native controls
- Favor consistency and low-friction UX over conversational variety during parameter collection

`FORMS.md` defines each field with:
- **type** — `text`, `single-choice`, or `multi-choice`
- **prompt** — the question to ask
- **choices** — options for choice types
- **default** — pre-filled value (mark as Recommended)
- **required** — whether the field is mandatory

Presentation rules (enforced in every `FORMS.md`):
- Ask one field at a time — never bundle multiple questions
- Use selectable choices for `single-choice` and `multi-choice` fields — not free text
- When a default exists, present it first and append "(Recommended)"
- For `text` fields with a computed default, offer the computed value as a selectable choice alongside free text
- After all fields are collected, present a summary and ask for confirmation
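A field defined with those attributes might look like the following hypothetical entry. This is one possible shape for illustration only, not the canonical `FORMS.md` syntax; the field name and choices are invented:

```markdown
### database_engine
- type: single-choice
- prompt: Which database engine should the scaffold target?
- choices: SQLite, PostgreSQL, SQL Server
- default: SQLite (Recommended)
- required: yes
```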

This applies to all skills that collect user input, not just scaffolding skills.

27 changes: 24 additions & 3 deletions CHANGELOG.md
@@ -2,11 +2,31 @@

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.3.1] - 2026-03-19

This is a patch release that introduces two new NuGet-focused skills, a runner-agnostic skill-creator companion, and runner-agnostic benchmark tooling, alongside enhanced release automation, standardized documentation, and skill refinements.

### Added

- `git-nuget-release-notes` skill that creates or updates per-package `.nuget/{ProjectName}/PackageReleaseNotes.txt` files from git history for .NET repositories, with per-skill evals and an extracted package release-notes format reference.
- `git-nuget-readme` skill that writes package-facing NuGet READMEs from actual project metadata, git history, and source-backed capability cues, with per-skill evals and a README blueprint reference.
- `skill-creator-agnostic` skill that adds runner-agnostic guardrails on top of Anthropic's skill-creator for creating, modifying, and benchmarking skills across Codex, GitHub Copilot, Opus, and similar agents, with enforced temp-workspace isolation and valid benchmark layout.

### Changed

- Enhanced `git-keep-a-changelog` with explicit release-intent trigger words ("finalize", "ready to release", "rtr", "release") that automatically extract and use versions from branch names, streamlining release finalization without manual version input.
- Tightened `git-visual-commits` commit body repair checks to treat short prose bodies wrapped mid-sentence as verification failures that must be repaired before success is reported, with targeted eval coverage.
- Clarified that unscoped `git bot commit` requests apply to the entire worktree unless the user explicitly narrows scope, with eval coverage ensuring yolo mode still groups the full diff.
- Standardized documentation across AGENTS.md, CONTRIBUTING.md, and README.md to explicitly instruct users to resolve the installed Anthropic skill-creator path (typically under `~/.agents/skills/skill-creator/` or `~/.claude/skills/skill-creator/`) before running benchmark and review tools.
- Updated the benchmark layout specification from the `eval-N` pattern to `iteration-N/eval-name/{config}/run-N/` for clarity and consistency, with PowerShell resolver logic that probes both skill-creator install locations.
- Normalized line wrapping and formatting across all SKILL.md, FORMS.md, and references/ files for improved readability and consistent presentation, removing extra blank lines and compacting multi-line YAML descriptions.
- Normalized line wrapping in shared asset templates, including .github/copilot-instructions.md, asset CHANGELOG.md bootstrap files, and package documentation templates.
- Refreshed README catalog to reflect the current skill set and commit-behavior guidance, with updated benchmark and eval workflow documentation.

## [0.3.0] - 2026-03-17

This is a minor release that introduces two complementary git workflow skills, extracts a shared commit-language reference, and backs the whole skill suite with pull-request validation plus stricter skill metadata checks.
@@ -62,7 +82,8 @@

- Improved scaffold fidelity with hidden `.bot` asset preservation, explicit UTF-8 and BOM handling, and checks aimed at preventing mojibake or incomplete generated output.

[Unreleased]: https://github.com/codebeltnet/agentic/compare/v0.3.1...HEAD
[0.3.1]: https://github.com/codebeltnet/agentic/compare/v0.3.0...v0.3.1
[0.3.0]: https://github.com/codebeltnet/agentic/compare/v0.2.0...v0.3.0
[0.2.0]: https://github.com/codebeltnet/agentic/compare/v0.1.0...v0.2.0
[0.1.0]: https://github.com/codebeltnet/agentic/compare/7eaf364...v0.1.0
161 changes: 86 additions & 75 deletions CONTRIBUTING.md
@@ -2,31 +2,31 @@

Thanks for wanting to add or improve a skill. Here's what to know.

## Skill structure

Each skill lives in its own folder:

```
skills/
<skill-name>/
SKILL.md # Required — the skill content
FORMS.md # Optional — structured input collection for agents
assets/ # Optional — literal templates and static files
scripts/ # Optional — executable helpers
references/ # Optional — deeper docs loaded on demand
evals/ # Required for repo-managed skills — test prompts for validation
evals.json
```

## Local sync

Repo-managed skills should be mirrored across all three locations:

- `skills/<name>/` in this repo
- `~/.claude/skills/<name>/`
- `~/.agents/skills/<name>/`

If you edit a local install copy first, copy the changed files back into the repo and into the other local install so every agent sees the same skill version.

## SKILL.md format

@@ -64,9 +64,9 @@ The `description` is the most important field — it's how the AI decides to load
- Specific trigger phrases (e.g. "Use when user says 'commit this' or 'stage changes'")
- What it enforces or prevents

## Adding evals (required for repo-managed skills)

Evals let you verify the skill works and measure improvement over a baseline. Every repo-managed skill in this repository must include `evals/evals.json`:

```json
{
  ...
}
```

Aim for 3–5 evals that cover distinct scenarios: happy path, edge cases, and cases where the skill should *not* do something.

Run evals from a temp workspace, not from this repository:

```powershell
$workspace = Join-Path $env:TEMP '<skill-name>-workspace'
```
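Building on the snippet above, preparing that workspace can be sketched as follows. The directory name is a placeholder, not a real skill:

```powershell
# Sketch only: 'my-skill-workspace' is a placeholder name.
$workspace = Join-Path $env:TEMP 'my-skill-workspace'
New-Item -ItemType Directory -Path $workspace -Force | Out-Null
Set-Location $workspace   # run the with_skill / without_skill executions from here
```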

When creating or modifying a repo-managed skill, the eval workflow must include a paired comparison:

- Resolve the installed Anthropic `skill-creator` path first, usually under `~/.agents/skills/skill-creator/` or `~/.claude/skills/skill-creator/`, then run its benchmark scripts from there
- Run each eval as `with_skill`
- Run the baseline as `without_skill` for new skills
- For an existing skill, use either `without_skill` or the previous/original skill version as the baseline, following the `skill-creator` benchmark model
- Aggregate the results into `benchmark.json`
- Launch `eval-viewer/generate_review.py` from that installed `skill-creator` copy so a human can review both `Outputs` and `Benchmark`

This repo treats that paired `with_skill` / `without_skill` comparison as a required part of the developer workflow for skill work. The benchmark artifacts live in the temp workspace; do not commit them to this repository unless the change explicitly calls for checked-in examples.

For scaffold/template skills, keep deterministic validators alongside evals. In this repo, `evals/evals.json` is mandatory, and validators like `scripts/validate-skill-templates.ps1` are additional protection.

## Prefer dynamic defaults

When a skill needs defaults for versions, paths, repository names, or support windows, prefer deriving them from a reliable source instead of baking in values that will drift.

- Good sources: git metadata, repo folder names, environment values, official JSON feeds, vendor docs APIs
- Use hardcoded examples as examples only — not as the real defaulting mechanism — when the value can be computed

## Template validation

Use the repo validation harness before submitting scaffold or template changes:

```powershell
powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\validate-skill-templates.ps1
```

Run the validator locally first for the fastest feedback loop. GitHub Actions also runs the same script on pull requests, but CI is the backstop, not the primary authoring loop.

To compare a change against the initial imported version, run the same harness against a git ref:

```powershell
powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\validate-skill-templates.ps1 -Ref HEAD
```

## Checklist before submitting

- [ ] `SKILL.md` has valid front matter with `name` and `description`
- [ ] Skill is stack-agnostic (or clearly scoped to a specific tech in the name/description)
- [ ] Examples are generic — no personal emails, usernames, or project-specific identifiers
- [ ] At least one eval in `evals/evals.json`
- [ ] The skill's `evals/evals.json` exists and its `skill_name` matches the folder/frontmatter name
- [ ] Skill changes were benchmarked from a temp workspace with both `with_skill` and `without_skill` runs
- [ ] `benchmark.json` and `eval-viewer/generate_review.py` from the installed Anthropic `skill-creator` copy were used so a human could compare `Outputs` and `Benchmark`
- [ ] `scripts/validate-skill-templates.ps1` passes for the current working tree when changing scaffold or template behavior
- [ ] If CI is enabled for the branch, the GitHub Actions validation job passes too
- [ ] Skill evals are intended to run from `$env:TEMP/<skill-name>-workspace/`, not from inside the repo
- [ ] Changed skill files are synced across `skills/<name>/`, `~/.claude/skills/<name>/`, and `~/.agents/skills/<name>/`
- [ ] Skill added to the table in `README.md`