Skip to content

fix(shared/apm): re-shape single-artifact download to expected layout#1051

Merged
danielmeppiel merged 2 commits intomainfrom
fix/shared-apm-single-artifact-flatten
Apr 29, 2026
Merged

fix(shared/apm): re-shape single-artifact download to expected layout#1051
danielmeppiel merged 2 commits intomainfrom
fix/shared-apm-single-artifact-flatten

Conversation

@danielmeppiel
Copy link
Copy Markdown
Collaborator

TL;DR

actions/download-artifact (since v5) silently overrides merge-multiple: false and flattens contents directly into path/ whenever exactly one artifact matches the pattern. Our validation step in shared/apm.md assumed the per-artifact subdir always exists, so any single-group consumer (the common case: one credential group, one packed bundle) fails with missing APM bundles (group did not pack successfully): apm-<id> even though the bundle downloaded and verified successfully.

Repro: https://github.com/microsoft/apm/actions/runs/25124365062/job/73633628506?pr=1039

Why latent until now

#1033 fixed an earlier failure in apm-prep (Go-slice → JSON serialisation for jq --argjson). Before that fix, the agent job never reached the validation step. With prep working, every single-group PR run of pr-review-panel and triage-panel now hits this latent bug.

Fix

Add a normalisation step before validation: when the matrix has exactly one group, re-create the expected ${prefix}apm-<id>/ subdir and move the flattened bundle file(s) into it. No-op for multi-group matrices (download-artifact already produces per-artifact subdirs in that case). Validation script unchanged.

Reference

actions/download-artifact v8.0.1 source:

path:
  isSingleArtifactDownload ||
  inputs.mergeMultiple ||
  artifacts.length === 1   // <-- triggers the flatten
    ? resolvedPath
    : path.join(resolvedPath, artifact.name),

Files

  • shared/apm.md — new "Normalise bundle layout" step
  • pr-review-panel.lock.yml, triage-panel.lock.yml — recompiled
  • actions-lock.json — refreshed by gh aw compile

actions/download-artifact (>=v5) silently overrides 'merge-multiple: false'
and flattens contents directly into 'path/' whenever exactly one artifact
matches the pattern. Our 'Validate downloaded bundles match matrix manifest'
step assumed the per-artifact subdir always exists, so any single-group
consumer (the common case: one credential group, one packed bundle) hits
'missing APM bundles (group did not pack successfully): apm-<id>' even
though the bundle downloaded and verified successfully.

This was latent until #1033 unblocked apm-prep -- previously apm-prep
itself was failing on the malformed jq input, so the agent job never got
far enough to run validation. With prep fixed, every single-group PR run
of pr-review-panel and triage-panel now fails at the validation step.

Fix: add a normalisation step that, when matrix has exactly one group,
re-creates the expected '${prefix}apm-<id>/' subdir and moves the
flattened bundle file(s) into it. No-op for multi-group matrices where
download-artifact already produces per-artifact subdirs. Validation
script is unchanged.

Refs: actions/download-artifact src/download-artifact.ts -- the
'isSingleArtifactDownload || mergeMultiple || artifacts.length === 1'
check that triggers the flatten.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 29, 2026 18:17
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR fixes a workflow failure caused by actions/download-artifact flattening the download directory when exactly one artifact matches the pattern, by normalizing the bundle layout back into the per-artifact subdirectory expected by downstream validation.

Changes:

  • Add a “Normalise bundle layout” step that re-creates the expected ${prefix}apm-<id>/ directory when the matrix contains exactly one group.
  • Recompile/refresh the locked workflow YAMLs and actions lockfile to reflect the shared workflow update.
Show a summary per file
File Description
.github/workflows/shared/apm.md Adds a normalization step to restore expected per-group directory layout for single-artifact downloads.
.github/workflows/pr-review-panel.lock.yml Propagates the normalization step into the compiled/locked workflow.
.github/workflows/triage-panel.lock.yml Propagates the normalization step into the compiled/locked workflow.
.github/aw/actions-lock.json Refreshes the action lock mappings as part of gh aw compile output.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 3

group_dir="/tmp/gh-aw/apm-bundles/${ARTIFACT_PREFIX}apm-${group_id}"
if [ ! -d "$group_dir" ]; then
mkdir -p "$group_dir"
find /tmp/gh-aw/apm-bundles -mindepth 1 -maxdepth 1 ! -path "$group_dir" -exec mv {} "$group_dir/" \;
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mv can treat source paths beginning with - as options (e.g., a file named --target-directory=...), which can break the step (and in some contexts be abused). Use mv -- so all artifact-derived paths are treated as operands (e.g., -exec mv -- '{}' "$group_dir/" \;). This should be applied in the shared workflow so the compiled lock files inherit the fix.

Suggested change
find /tmp/gh-aw/apm-bundles -mindepth 1 -maxdepth 1 ! -path "$group_dir" -exec mv {} "$group_dir/" \;
find /tmp/gh-aw/apm-bundles -mindepth 1 -maxdepth 1 ! -path "$group_dir" -exec mv -- {} "$group_dir/" \;

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/triage-panel.lock.yml Outdated
ARTIFACT_PREFIX: ${{ needs.activation.outputs.artifact_prefix }}
EXPECTED_MATRIX: ${{ needs.apm-prep.outputs.matrix }}
name: Normalise bundle layout (single-artifact flatten workaround)
run: "set -euo pipefail\n# actions/download-artifact (>=v5) flattens contents directly into `path/`\n# whenever exactly one artifact matches the pattern, ignoring\n# `merge-multiple: false`. Re-shape into the per-group subdir layout so\n# downstream validation sees a stable structure regardless of matrix size.\nexpected_count=$(echo \"$EXPECTED_MATRIX\" | jq '.group | length')\nif [ \"$expected_count\" -eq 1 ]; then\n group_id=$(echo \"$EXPECTED_MATRIX\" | jq -r '.group[0].id')\n group_dir=\"/tmp/gh-aw/apm-bundles/${ARTIFACT_PREFIX}apm-${group_id}\"\n if [ ! -d \"$group_dir\" ]; then\n mkdir -p \"$group_dir\"\n find /tmp/gh-aw/apm-bundles -mindepth 1 -maxdepth 1 ! -path \"$group_dir\" -exec mv {} \"$group_dir/\" \\;\n fi\nfi\n"
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This workflow executes mv on artifact-derived filenames; if any entry begins with -, mv may interpret it as an option. Update the find ... -exec mv ... invocation to include mv -- so operands are unambiguous.

Suggested change
run: "set -euo pipefail\n# actions/download-artifact (>=v5) flattens contents directly into `path/`\n# whenever exactly one artifact matches the pattern, ignoring\n# `merge-multiple: false`. Re-shape into the per-group subdir layout so\n# downstream validation sees a stable structure regardless of matrix size.\nexpected_count=$(echo \"$EXPECTED_MATRIX\" | jq '.group | length')\nif [ \"$expected_count\" -eq 1 ]; then\n group_id=$(echo \"$EXPECTED_MATRIX\" | jq -r '.group[0].id')\n group_dir=\"/tmp/gh-aw/apm-bundles/${ARTIFACT_PREFIX}apm-${group_id}\"\n if [ ! -d \"$group_dir\" ]; then\n mkdir -p \"$group_dir\"\n find /tmp/gh-aw/apm-bundles -mindepth 1 -maxdepth 1 ! -path \"$group_dir\" -exec mv {} \"$group_dir/\" \\;\n fi\nfi\n"
run: "set -euo pipefail\n# actions/download-artifact (>=v5) flattens contents directly into `path/`\n# whenever exactly one artifact matches the pattern, ignoring\n# `merge-multiple: false`. Re-shape into the per-group subdir layout so\n# downstream validation sees a stable structure regardless of matrix size.\nexpected_count=$(echo \"$EXPECTED_MATRIX\" | jq '.group | length')\nif [ \"$expected_count\" -eq 1 ]; then\n group_id=$(echo \"$EXPECTED_MATRIX\" | jq -r '.group[0].id')\n group_dir=\"/tmp/gh-aw/apm-bundles/${ARTIFACT_PREFIX}apm-${group_id}\"\n if [ ! -d \"$group_dir\" ]; then\n mkdir -p \"$group_dir\"\n find /tmp/gh-aw/apm-bundles -mindepth 1 -maxdepth 1 ! -path \"$group_dir\" -exec mv -- {} \"$group_dir/\" \\;\n fi\nfi\n"

Copilot uses AI. Check for mistakes.
ARTIFACT_PREFIX: ${{ needs.activation.outputs.artifact_prefix }}
EXPECTED_MATRIX: ${{ needs.apm-prep.outputs.matrix }}
name: Normalise bundle layout (single-artifact flatten workaround)
run: "set -euo pipefail\n# actions/download-artifact (>=v5) flattens contents directly into `path/`\n# whenever exactly one artifact matches the pattern, ignoring\n# `merge-multiple: false`. Re-shape into the per-group subdir layout so\n# downstream validation sees a stable structure regardless of matrix size.\nexpected_count=$(echo \"$EXPECTED_MATRIX\" | jq '.group | length')\nif [ \"$expected_count\" -eq 1 ]; then\n group_id=$(echo \"$EXPECTED_MATRIX\" | jq -r '.group[0].id')\n group_dir=\"/tmp/gh-aw/apm-bundles/${ARTIFACT_PREFIX}apm-${group_id}\"\n if [ ! -d \"$group_dir\" ]; then\n mkdir -p \"$group_dir\"\n find /tmp/gh-aw/apm-bundles -mindepth 1 -maxdepth 1 ! -path \"$group_dir\" -exec mv {} \"$group_dir/\" \\;\n fi\nfi\n"
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as in triage-panel.lock.yml: add -- to the mv invocation so filenames starting with - cannot be misinterpreted as flags, preventing avoidable job failures on edge-case artifact contents.

Suggested change
run: "set -euo pipefail\n# actions/download-artifact (>=v5) flattens contents directly into `path/`\n# whenever exactly one artifact matches the pattern, ignoring\n# `merge-multiple: false`. Re-shape into the per-group subdir layout so\n# downstream validation sees a stable structure regardless of matrix size.\nexpected_count=$(echo \"$EXPECTED_MATRIX\" | jq '.group | length')\nif [ \"$expected_count\" -eq 1 ]; then\n group_id=$(echo \"$EXPECTED_MATRIX\" | jq -r '.group[0].id')\n group_dir=\"/tmp/gh-aw/apm-bundles/${ARTIFACT_PREFIX}apm-${group_id}\"\n if [ ! -d \"$group_dir\" ]; then\n mkdir -p \"$group_dir\"\n find /tmp/gh-aw/apm-bundles -mindepth 1 -maxdepth 1 ! -path \"$group_dir\" -exec mv {} \"$group_dir/\" \\;\n fi\nfi\n"
run: "set -euo pipefail\n# actions/download-artifact (>=v5) flattens contents directly into `path/`\n# whenever exactly one artifact matches the pattern, ignoring\n# `merge-multiple: false`. Re-shape into the per-group subdir layout so\n# downstream validation sees a stable structure regardless of matrix size.\nexpected_count=$(echo \"$EXPECTED_MATRIX\" | jq '.group | length')\nif [ \"$expected_count\" -eq 1 ]; then\n group_id=$(echo \"$EXPECTED_MATRIX\" | jq -r '.group[0].id')\n group_dir=\"/tmp/gh-aw/apm-bundles/${ARTIFACT_PREFIX}apm-${group_id}\"\n if [ ! -d \"$group_dir\" ]; then\n mkdir -p \"$group_dir\"\n find /tmp/gh-aw/apm-bundles -mindepth 1 -maxdepth 1 ! -path \"$group_dir\" -exec mv -- {} \"$group_dir/\" \\;\n fi\nfi\n"

Copilot uses AI. Check for mistakes.
@danielmeppiel danielmeppiel added the panel-review Trigger the apm-review-panel gh-aw workflow label Apr 29, 2026
@github-actions
Copy link
Copy Markdown

APM Review Panel Verdict: REJECT

Supply-chain finding on the actions-lock.json removal blocks merge; the normalisation fix itself is mechanically correct and all other panelists are satisfied.

Required before merge (1 item)

  • [supply-chain-security-expert] setup-cli@v0.71.2 removed from actions-lock.json but remains in active use in agentics-maintenance.yml at .github/aw/actions-lock.json
    • Why: actions-lock.json appears to be the canonical registry of approved, SHA-pinned third-party actions. Removing github/gh-aw-actions/setup-cli@v0.71.2 while agentics-maintenance.yml still calls that action at the same SHA (ab8940d1df237fc4fae4ab8091e7ba66ada55a55) breaks the invariant that every consumed action must appear in the lock. If the lock is enforced by a CI gate or audit tool, agentics-maintenance.yml now runs an action that is outside the approved set.
    • Suggested fix: Restore the removed entry to actions-lock.json, or confirm via documented evidence that agentics-maintenance.yml is explicitly excluded from lock enforcement scope.

Nits (11 items, skip if you want)

  • [supply-chain-security-expert] group_id flows into a shell path without sanitization against traversal sequences
  • [python-architect] ! -path exclusion could be replaced with -prune for clarity and correctness under edge cases
  • [python-architect] jq expression .group | length is not guarded against a null or missing group key -- use .group // [] | length
  • [python-architect] Hardcoded /tmp/gh-aw/apm-bundles base path appears in three places in the normalise step; extract to a local variable
  • [cli-logging-expert] No echo statements trace the normalisation path taken, making CI logs silent on success and ambiguous on skip
  • [cli-logging-expert] Silent fallthrough when group_dir already exists hides the case where a previous run partially succeeded
  • [cli-logging-expert] find+mv produces no output; use mv -v or find ... -print so the moved file list appears in the log
  • [devx-ux-expert] Lock files inline the workaround as a single escaped string, losing all inline comments present in apm.md
  • [devx-ux-expert] No link to the upstream actions/download-artifact issue in the in-code comment -- future contributors cannot verify when the workaround can be removed
  • [devx-ux-expert] Step name uses British spelling Normalise -- consider Normalize if the codebase convention is American English
  • [oss-growth-hacker] CHANGELOG entry missing for this silent false-failure fix -- contributors who hit the error and gave up will not know it is resolved

CEO arbitration

The panel is unanimous that this PR addresses a real and disruptive CI regression: actions/download-artifact v5+ silently flattens single-artifact downloads, breaking the expected per-group directory layout for downstream validation steps. The normalisation fix is mechanically correct and the procedural shell approach is sound for a sequential CI runner. All five active panelists found no architectural objections and the non-security nits are minor quality improvements -- better tracing in CI logs, a link to the upstream issue, CHANGELOG hygiene -- none of which block correctness.

The single blocking finding comes from Supply Chain Security: the PR removes the github/gh-aw-actions/setup-cli@v0.71.2 entry from actions-lock.json while agentics-maintenance.yml continues to invoke that action. If actions-lock.json is the enforced registry of SHA-pinned, approved third-party actions -- which its name and structure imply -- this creates an out-of-lock action running in production CI. The fix is straightforward: restore the removed entry, or produce documented evidence that agentics-maintenance.yml is explicitly scoped out of lock enforcement. Until that invariant is restored, the PR cannot land without undermining the supply-chain control surface the lock was introduced to protect.

All remaining nits are non-blocking. The path-sanitization note on group_id (guarding against traversal sequences before shell interpolation) is the highest-value nit and worth folding in before or alongside the required fix, but it does not independently block the PR.

Dissent resolved: No inter-panelist disagreements. All active panelists agreed on the severity classification of every finding. Supply Chain Security's required finding was not contested by any other panelist.

Growth/positioning note: The OSS Growth Hacker correctly identifies that silent single-group panel failures hit first-time contributors hardest -- the exact cohort whose retention matters most. A contributor who opens their first PR, sees "missing APM bundles", and gets no explanation will assume they broke something and disengage. A one-sentence reliability note in the next release post or CHANGELOG entry closes that loop and signals to the community that the project actively tracks and fixes contributor-path failures. Low effort, high trust signal.


Per-persona findings (full)

Python Architect

No mermaid diagrams applicable -- purely procedural shell script with no class or module boundaries. Flow: read EXPECTED_MATRIX -> count groups via jq -> if count==1 and group_dir absent, mkdir + find/mv. Guard is correct; TOCTOU not a concern in a sequential CI runner.

Required: None.

Nits:

  • ! -path exclusion could be replaced with -prune for clarity and correctness under edge cases. find's -path predicate does substring matching on the full path string; -prune is the idiomatic and unambiguous way to exclude a directory from find traversal.
  • jq expression .group | length is not guarded against a null or missing group key. If EXPECTED_MATRIX is missing the group key (e.g., schema drift), jq returns null and the -eq 1 numeric test evaluates to false rather than failing fast. Use .group // [] | length.
  • Hardcoded /tmp/gh-aw/apm-bundles base path appears in three places across the normalise step. Extracting to a local variable (bundle_root=/tmp/gh-aw/apm-bundles) reduces coupling.

CLI Logging Expert

Required: None.

Nits:

  • No echo statements trace the normalisation path taken, making CI logs silent on success and ambiguous on skip. When expected_count != 1 the step exits with zero output; adding echo "[normalise] expected_count=$expected_count; workaround skipped" and echo "[normalise] reshaping into $group_dir" would make the log self-describing at zero cost.
  • Silent fallthrough when group_dir already exists hides the case where a previous run partially succeeded. An echo like echo "[normalise] $group_dir already present, skipping reshape" would distinguish a correct two-artifact run from a re-run of a partial failure without adding noise.
  • find+mv produces no output; errors from mv are swallowed into the pipefail net but the moved file list is never logged. If mv fails on a specific file, the log will only show the mv error with no context. Use mv -v or add find ... -print.

DevX UX Expert

Required: None.

Nits:

  • Lock files (pr-review-panel.lock.yml, triage-panel.lock.yml) inline the workaround as a single escaped string, losing all inline comments. The explanatory comments present in apm.md are stripped, so a contributor debugging a failed CI step in those files sees opaque shell with no breadcrumb to the upstream bug.
  • No link to the upstream actions/download-artifact issue in the in-code comment. Future contributors cannot verify whether the bug has been fixed upstream and whether the workaround can be removed, increasing indefinite maintenance drag.
  • Step name uses British spelling Normalise -- minor consistency nit if surrounding step names use American English.

Supply Chain Security Expert

Required:

  • setup-cli@v0.71.2 removed from actions-lock.json but remains in active use in agentics-maintenance.yml at .github/aw/actions-lock.json. Breaking the lock-registry invariant means agentics-maintenance.yml now runs an action outside the approved set. Fix: restore the entry or document the exemption.

Nits:

  • group_id and ARTIFACT_PREFIX flow into a shell path without sanitization against traversal sequences. group_id is extracted from EXPECTED_MATRIX via jq and interpolated directly into group_dir without stripping ../ or validating it is a safe single path segment. Add a guard: [[ "$group_id" =~ ^[a-zA-Z0-9_-]+$ ]] || { echo '::error::unsafe group_id'; exit 1; } before constructing group_dir.

Auth Expert

Inactive -- PR modifies only CI/CD workflow YAML files (.github/workflows/shared/apm.md, pr-review-panel.lock.yml, triage-panel.lock.yml) and actions-lock.json; no auth-sensitive files (auth.py, token_manager.py, github_downloader.py, etc.) were touched.

OSS Growth Hacker

Required: None.

Nits:

  • CHANGELOG entry missing for this silent false-failure fix. Contributors who hit the "missing APM bundles" error and gave up will not know it is fixed unless it appears in release notes. A one-liner in CHANGELOG under a Fixed section converts frustrated drive-bys into returning contributors.

Side-channel note (forwarded to CEO): This bug created a silent false failure for any single-group consumer of the panel/triage workflows -- exactly the path a first-time contributor walks when opening a PR. A contributor who sees "missing APM bundles" on their first PR will assume they broke something and disengage. Worth a sentence in the next release post under reliability wins.

Verdict computed deterministically: 1 required finding across 5 active panelists. APPROVE iff N == 0. Push a new commit to clear this verdict label automatically.

Generated by PR Review Panel · ● 1.1M ·

@github-actions github-actions Bot added panel-rejected Apm-review-panel verdict: REJECT. Removed automatically on next push. and removed panel-review Trigger the apm-review-panel gh-aw workflow labels Apr 29, 2026
…changelog

- Restore github/gh-aw-actions/setup-cli@v0.71.2 entry in actions-lock.json
  (still referenced by .github/workflows/agentics-maintenance.yml; the prior
  recompile dropped it because pr-review-panel and triage-panel don't use it).
- Sanitise group_id with strict allowlist (^[A-Za-z0-9_-]+$) before
  interpolating into the shell path, as defence-in-depth even though apm-prep
  produces a sanitised id today.
- Guard jq '.group | length' against null with '.group // []'.
- Link upstream actions/download-artifact source in the comment so future
  maintainers can verify the workaround is still needed.
- CHANGELOG entry under [Unreleased] -> Fixed referencing #1051.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Resolved panel review findings in f231989:

Required (Supply Chain Security):

  • Restored github/gh-aw-actions/setup-cli@v0.71.2 entry in .github/aw/actions-lock.json (still referenced by agentics-maintenance.yml; prior recompile dropped it because pr-review-panel/triage-panel don't import it).

Adopted nits:

  • Sanitised group_id with strict allowlist (^[A-Za-z0-9_-]+$) before interpolation into the shell path -- defence-in-depth even though apm-prep already produces a sanitised id (Supply Chain Security).
  • Guarded jq '.group | length' against null with .group // [] (Python Architect).
  • Added upstream actions/download-artifact source link in the comment so future maintainers can verify the workaround is still needed (DevX UX).
  • CHANGELOG entry under [Unreleased] -> Fixed (OSS Growth Hacker).

Set aside (with rationale):

  • -prune over ! -path (Python Architect): functionally equivalent here; ! -path is the more readable form for a single exclusion.
  • echo '[normalise] ...' trace lines (CLI Logging Expert): step is silent on success by design (matches surrounding apm.md steps); failures already surface via set -euo pipefail and the explicit ::error:: line for the security guard.
  • Lock-file comments preservation (DevX UX): inherent gh-aw limitation, out of scope for this PR.

Recompiled both lock files. Pushing should auto-clear panel-rejected.

@github-actions github-actions Bot removed the panel-rejected Apm-review-panel verdict: REJECT. Removed automatically on next push. label Apr 29, 2026
@danielmeppiel danielmeppiel merged commit 65b7dd4 into main Apr 29, 2026
9 checks passed
@danielmeppiel danielmeppiel deleted the fix/shared-apm-single-artifact-flatten branch April 29, 2026 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants