Add harness engineering skill 🤖🤖🤖 #1945

Open
baskduf wants to merge 4 commits into github:staged from baskduf:codex/add-harness-engineering-skill

Conversation

baskduf commented Jun 9, 2026

Pull Request Checklist

  • I have read and followed the CONTRIBUTING.md guidelines.
  • I have read and followed the Guidance for submissions involving paid services. No paid services are involved.
  • My contribution adds a new instruction, prompt, agent, skill, or workflow file in the correct directory.
  • The file follows the required naming convention.
  • The content is clearly structured and follows the example format.
  • I have tested my instructions, prompt, agent, skill, or workflow with GitHub Copilot.
  • I have run npm start and verified that README.md is up to date.
  • I am targeting the staged branch for this pull request.

Description

Adds a new harness-engineering skill for adopting repository-level guardrails for coding agents.

The skill helps users turn repeated AI coding-agent mistakes into durable repository artifacts: agent instructions, enforceable checks, failure memory, drift checks, adoption reports, and review workflows. It is intentionally prompt-first and repository-specific, so it tells the agent to inspect the target repository before adding harness pieces instead of copying a generic template.

This is distinct from existing AI-readiness or AGENTS.md generation resources because it focuses on recurrence prevention after a concrete failure or repeated review pattern, and on tying every high-risk rule to a test, lint rule, CI gate, drift check, or manual review point.
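
To illustrate what tying every high-risk rule to an enforcement point could look like, here is a minimal sketch of a check that flags high-risk rules with nothing enforcing them. The `Rule` dataclass, its field names, the example rule names, and the `ENFORCEMENT_KINDS` labels are hypothetical and are not taken from the skill itself.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the enforcement points named in the description.
ENFORCEMENT_KINDS = {"test", "lint", "ci_gate", "drift_check", "manual_review"}


@dataclass
class Rule:
    """A repository rule for coding agents (illustrative shape, not the skill's schema)."""
    name: str
    high_risk: bool
    enforcement: Optional[str] = None  # one of ENFORCEMENT_KINDS, or None if unenforced


def unenforced_high_risk(rules: list[Rule]) -> list[str]:
    """Return names of high-risk rules that lack a recognized enforcement point."""
    return [
        r.name
        for r in rules
        if r.high_risk and r.enforcement not in ENFORCEMENT_KINDS
    ]


rules = [
    Rule("no-raw-sql-in-handlers", high_risk=True, enforcement="lint"),
    Rule("migrations-need-rollback", high_risk=True),
    Rule("prefer-short-functions", high_risk=False),
]
print(unenforced_high_risk(rules))  # ['migrations-need-rollback']
```

Low-risk rules are deliberately ignored in this sketch, since the description only requires an enforcement point for high-risk rules.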


Type of Contribution

  • New instruction file.
  • New prompt file.
  • New agent file.
  • New plugin.
  • New skill file.
  • New agentic workflow.
  • Update to existing instruction, prompt, agent, plugin, skill, or workflow.
  • Other (please specify):

Additional Notes

Validation run locally:

  • npm ci
  • npm run skill:validate
  • npm run plugin:validate
  • npm start
  • git diff --check
  • Markdown CRLF scan excluding node_modules
  • gh skill install /tmp/awesome-copilot-pr-work harness-engineering --from-local --agent github-copilot --scope project --force in a temporary smoke-test repository
  • copilot -p "Use the harness-engineering skill..." dry-run confirmed GitHub Copilot CLI loaded the skill and summarized the first three workflow steps without editing files
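
The exact command used for the Markdown CRLF scan is not shown in this PR. A minimal sketch of such a scan, assuming it means "find `.md` files containing `\r\n`, skipping `node_modules`", might look like the following. The function name `find_crlf_markdown` is hypothetical, and skipping `.git` as well is an added assumption.

```python
import os
from pathlib import Path


def find_crlf_markdown(root, skip_dirs=frozenset({"node_modules", ".git"})):
    """Return sorted paths of .md files under root that contain CRLF line endings."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            if name.lower().endswith(".md"):
                path = Path(dirpath) / name
                # Read bytes so line endings are not normalized by text mode.
                if b"\r\n" in path.read_bytes():
                    hits.append(str(path))
    return sorted(hits)
```

Calling `find_crlf_markdown(".")` from the repository root returns an empty list when no Markdown file has CRLF line endings.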

By submitting this pull request, I confirm that my contribution abides by the Code of Conduct and will be licensed under the MIT License.

baskduf requested a review from aaronpowell as a code owner June 9, 2026 02:40
github-actions bot added the new-submission (PR adds at least one new contribution), skills (PR touches skills), and skill-check-warning (Skill validator reported warnings) labels Jun 9, 2026
github-actions Bot commented Jun 9, 2026

Contributor

🔍 Skill Validator Results

✅ All checks passed

| Scope | Checked |
| --- | ---: |
| Skills | 1 |
| Agents | 0 |
| Total | 1 |

| Severity | Count |
| --- | ---: |
| ❌ Errors | 0 |
| ⚠️ Warnings | 0 |
| ℹ️ Advisories | 0 |

Summary

| Level | Finding |
| --- | --- |
| ℹ️ | Found 1 skill(s) |
| ℹ️ | [harness-engineering] 📊 harness-engineering: 1,605 BPE tokens [chars/4: 1,971] (detailed ✓), 18 sections, 3 code blocks |
| ℹ️ | ✅ All checks passed (1 skill(s)) |
Full validator output

```text
Found 1 skill(s)
[harness-engineering] 📊 harness-engineering: 1,605 BPE tokens [chars/4: 1,971] (detailed ✓), 18 sections, 3 code blocks
✅ All checks passed (1 skill(s))
```

github-actions bot removed the skill-check-warning (Skill validator reported warnings) label Jun 9, 2026

aaronpowell left a comment

Contributor

This reads a lot like reimplementing a memory system, which is already built into Copilot.

How do you see it as different, and what behaviours do you expect to change by adding this on top of memory?

baskduf commented Jun 15, 2026

Author

@aaronpowell Thanks, fair pushback. I agree there is overlap with Copilot Memory. If this were only docs/decisions and docs/failures, I’d also describe it as a repo-local memory system.

The distinction I’m trying to make is that memory is only one part of the harness. Copilot Memory can remember repo facts; the harness makes the important facts repo-owned and operational: versioned in the codebase, reviewable in PRs, tied to checks where practical, and measurable through Doctor/review/effectiveness reports.

A related distinction is human judgment, but I would frame it narrowly. This is not a human-approval system for every agent step. The harness defines where human judgment is required when automation cannot prove safety: structural changes may need ADR review or an explicit skip reason, repeated failures need a named detection/prevention path, and if no automated check is practical the repo records the manual review point. That makes memory auditable and reviewable instead of hidden context.

So the behavior change I expect is not simply “the agent remembers more.” It is that repeated failures should turn into a failure note plus a named detection/prevention path, structural changes should trigger ADR review, agents should run the repo’s documented gates, and maintainers can see when rules, checks, memory, and evidence are disconnected.
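
As a rough illustration of how "disconnected" memory and checks could be surfaced, the sketch below reports failure notes that either name no detection/prevention path or point at a check the repository does not define. The `FailureNote` shape, the `disconnected` function, and all note and check names are hypothetical. This is not the skill's Doctor report, only a sketch of the idea under those assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureNote:
    """A recorded agent failure (illustrative shape, not the skill's format)."""
    title: str
    check: Optional[str] = None          # name of an automated check, if any
    manual_review: Optional[str] = None  # recorded manual review point, if any


def disconnected(notes, known_checks):
    """Return one message per note that lacks a usable detection/prevention path."""
    problems = []
    for note in notes:
        if note.check is None and note.manual_review is None:
            problems.append(f"{note.title}: no detection/prevention path")
        elif note.check is not None and note.check not in known_checks:
            problems.append(f"{note.title}: references unknown check '{note.check}'")
    return problems


notes = [
    FailureNote("agent edited generated file", check="generated-files-drift"),
    FailureNote("schema field renamed silently", check="schema-snapshot-test"),
    FailureNote("secrets in fixture", manual_review="security reviewer signs off"),
    FailureNote("flaky retry loop"),
]
known_checks = {"generated-files-drift"}

for line in disconnected(notes, known_checks):
    print(line)
# schema field renamed silently: references unknown check 'schema-snapshot-test'
# flaky retry loop: no detection/prevention path
```

A note with only a manual review point passes here, matching the case where no automated check is practical and the repo records the manual review point instead.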

The benchmark evidence supports that narrower claim around repo-local workflow/schema preservation and diagnosability. It does not prove that memory-harness is generally more correct than workflow-only guidance, and I would not claim that.

Labels

  • new-submission (PR adds at least one new contribution)
  • skills (PR touches skills)
