feat: improve skill scores for agent-toolkit #20

Open
popey wants to merge 1 commit into softaworks:main from popey:improve/skill-review-optimization

@popey commented Mar 11, 2026

Hullo @leonardocouy 👋

I ran your skills through `tessl skill review` at work and found some targeted improvements. Here's the before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| ship-learn-next | 0% | 95% | +95% |
| humanizer | 0% | 89% | +89% |
| domain-name-brainstormer | 34% | 100% | +66% |
| naming-analyzer | 45% | 100% | +55% |
| qa-test-planner | 50% | 100% | +50% |
| design-system-starter | 39% | 85% | +46% |
| requirements-clarity | 63% | 96% | +33% |
| agent-md-refactor | 76% | 100% | +24% |
| gepetto | 76% | 100% | +24% |
| reducing-entropy | 80% | 100% | +20% |
| skill-judge | 73% | 93% | +20% |
| codex | 77% | 96% | +19% |
| web-to-markdown | 81% | 100% | +19% |
| dependency-updater | 76% | 93% | +17% |
| database-schema-designer | 76% | 93% | +17% |
| c4-architecture | 86% | 100% | +14% |
| game-changing-features | 80% | 93% | +13% |
| gemini | 76% | 89% | +13% |
| professional-communication | 81% | 93% | +12% |
| react-useeffect | 88% | 100% | +12% |
| writing-clearly-and-concisely | 84% | 96% | +12% |
| marp-slide | 89% | 100% | +11% |
| crafting-effective-readmes | 84% | 93% | +9% |
| frontend-to-backend-requirements | 80% | 89% | +9% |
| openapi-to-typescript | 83% | 89% | +6% |
| excalidraw | 84% | 89% | +5% |
| difficult-workplace-conversations | 88% | 93% | +5% |
| daily-meeting-update | 89% | 93% | +4% |
| backend-to-frontend-handoff-docs | 84% | 86% | +2% |

<details>
<summary>Changes made</summary>

**Validation fixes (0% skills)**
- `humanizer` / `ship-learn-next`: Fixed `allowed-tools` from YAML array to comma-separated string (validation failure blocked all scoring). Also removed unknown `version` frontmatter key.
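
To illustrate the validation fix above, here is a minimal Python sketch, assuming the frontmatter has already been parsed into a dict. `normalize_frontmatter` is a hypothetical helper, not part of the Tessl CLI, and the tool names and the `", "` separator are illustrative assumptions rather than anything the validator is confirmed to require.

```python
def normalize_frontmatter(meta: dict) -> dict:
    """Return a copy of parsed SKILL.md frontmatter with the two fixes applied.

    Hypothetical helper for illustration only:
    - drops the `version` key, which the review flagged as unknown
    - turns a list-valued `allowed-tools` into a comma-separated string
    """
    fixed = {key: value for key, value in meta.items() if key != "version"}
    tools = fixed.get("allowed-tools")
    if isinstance(tools, (list, tuple)):
        # Separator spacing is an assumption; the PR only says "comma-separated".
        fixed["allowed-tools"] = ", ".join(str(tool) for tool in tools)
    return fixed


# Illustrative input; the real frontmatter values are not shown in this PR.
before = {"name": "humanizer", "version": "1.0", "allowed-tools": ["Read", "Write"]}
print(normalize_frontmatter(before))
```

A string-valued `allowed-tools` passes through unchanged, so the helper is safe to run on skills that already validate.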

**Description improvements (most skills)**
- Added explicit "Use when..." clauses with natural trigger terms users would actually say
- Added specific concrete actions to descriptions (not just generic capability statements)
- Improved distinctiveness to reduce conflict risk with similar skills

**Content improvements**
- Removed verbose explanations of concepts Claude already knows (semantic versioning, database normalization, QA fundamentals, design system philosophy, etc.)
- Consolidated redundant sections (duplicate warnings, overlapping rules, repeated examples)
- Added validation checkpoints to workflows where missing
- Added concrete before/after code examples (react-useeffect, writing-clearly-and-concisely)
- Restructured bloated skills into concise workflows (domain-name-brainstormer: 213→72 lines, qa-test-planner: 758→278 lines, naming-analyzer: 352→117 lines, skill-judge: 753→261 lines)
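
For a sense of scale, the line counts quoted above can be turned into relative reductions. This is a small Python sketch using only the numbers from this PR; `reduction_percent` is a name made up for the example.

```python
# (before, after) line counts as reported in this PR.
LINE_COUNTS = {
    "domain-name-brainstormer": (213, 72),
    "qa-test-planner": (758, 278),
    "naming-analyzer": (352, 117),
    "skill-judge": (753, 261),
}


def reduction_percent(before: int, after: int) -> float:
    """Share of lines removed, as a percentage rounded to one decimal place."""
    return round((before - after) / before * 100, 1)


for skill, (before, after) in LINE_COUNTS.items():
    pct = reduction_percent(before, after)
    print(f"{skill}: {before} -> {after} lines ({pct}% shorter)")
```

Each of the four skills ends up roughly two-thirds shorter (between 63% and 67%).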

**Progressive disclosure**
- Extracted heavy reference content to separate files where it improved token efficiency (skill-judge references, daily-meeting-update examples, game-changing-features categories)

</details>

Honest disclosure: I work at @tesslio, where we build tooling around skills like these. This isn't a pitch; I just saw room for improvement and wanted to contribute.

If you want to run reviews, evals, and optimizations yourself, run `npm install @tessl/cli`, then `tessl skill review path/to/your/SKILL.md`. You can [find out more here](https://tessl.io/registry/skills/submit).

Thanks in advance 🙏