Skip to content

Expose explicit skill invocation visibility for APM-restored skills #39369

@theletterf

Description

@theletterf

Summary

Today, gh-aw makes it possible to verify that skills were installed/restored, but not whether a specific skill was actually invoked by the agent at runtime.

In practice, this makes it hard to answer a basic debugging question for workflows that use APM-imported skills:

Did the agent actually use docs-check-style, or was the output only loosely similar to that skill?

What the docs currently show

The published docs appear to cover:

  • how to import APM dependencies with shared/apm.md,
  • how to audit runs with gh aw audit and gh aw logs, and
  • observability for jobs, tokens, MCP usage, and network behavior.

But I could not find documentation that shows any of the following for skills:

  • a run log entry that a named skill was selected or invoked,
  • an audit field listing activated skills,
  • an OTEL attribute for skill invocation, or
  • a workflow artifact that maps review output back to a specific skill call.

What the runtime currently shows

Using a successful smoke test with:

  • engine: copilot,
  • a reusable workflow compiled with gh-aw v0.79.6,
  • APM-imported public skills from elastic/elastic-docs-skills, and
  • a private caller repo,

we can see all of the following in the logs:

1. APM install/restore is visible

The apm job clearly logs that the skills are resolved and integrated:

[>] Resolving elastic-docs-skills-docs-check-style...
[>] Resolving elastic-docs-skills-flag-jargon-skill...
[>] Resolving elastic-docs-skills-frontmatter-audit...
[>] Resolving elastic-docs-skills-content-type-checker...
...
|-- Skill integrated -> .agents/skills/

The agent job also clearly logs that the APM bundle is restored:

Download APM bundle artifact
Build bundle list
Restore APM packages

2. Inline skill restoration is visible

The runtime also logs inline skill extraction/restoration behavior, for example:

  • No inline skill markers found, skipping
  • restore_inline_skills.sh
  • source directory not found — no inline skills to restore

3. But named skill invocation is not visible

What I could not find was a first-class runtime event like:

  • Invoking skill: docs-check-style
  • Selected skill: docs-content-type-checker
  • Activated skills: [docs-check-style, frontmatter-audit]
  • an audit JSON field showing successful skill calls
  • OTEL span attributes for per-skill invocation

So the current debugging story seems to be:

  1. prove the skill files were present,
  2. inspect the agent output,
  3. infer whether the skill was probably used.

That works, but it leaves an observability gap.

Why this matters

This becomes especially important when debugging:

  • prompt wording that asks the agent to use certain skills,
  • differences between installed skills and actual behavior,
  • regressions across gh-aw / engine versions,
  • cases where output looks plausible but might only be coming from the base prompt,
  • APM packaging vs invocation problems.

Without explicit skill invocation visibility, it is hard to distinguish:

  • “the skill was installed but ignored,” from
  • “the skill was used but its guidance did not materially change output.”

Related observations

There are already related issues around APM skill behavior, for example:

Those issues also rely heavily on indirect evidence such as bundle contents, restored skill directories, and output shape.

Request

Could gh-aw expose explicit skill invocation visibility, ideally in one or more of these places?

Option 1: Agent/run logs

Emit a structured log line when a skill is invoked, for example:

[skills] invoked: docs-check-style
[skills] invoked: docs-frontmatter-audit

Option 2: Audit output

Include a field in gh aw audit --json, for example:

{
  "skill_usage": [
    { "name": "docs-check-style", "status": "invoked" },
    { "name": "docs-frontmatter-audit", "status": "invoked" }
  ]
}

Option 3: OpenTelemetry

Add span events or attributes for skill activation, such as:

  • gh-aw.skill.name
  • gh-aw.skill.source
  • gh-aw.skill.invoked=true

Option 4: Agent artifact metadata

Include a small machine-readable file in the agent artifact listing skills discovered vs invoked.

Minimum clarification requested

If explicit tracing already exists somewhere, could the docs point to it clearly?

If it does not exist, this issue is requesting that capability.

Thanks — the recent improvements around built-in GITHUB_TOKEN auth and APM packaging made this much easier to test, and this feels like the next missing piece in the debugging story.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions