Skip to content

fix(compile): stabilize __BUILD_ID__ in all compiled outputs#959

Open
edenfunf wants to merge 2 commits intomicrosoft:mainfrom
edenfunf:fix/compile-build-id-placeholder
Open

fix(compile): stabilize __BUILD_ID__ in all compiled outputs#959
edenfunf wants to merge 2 commits intomicrosoft:mainfrom
edenfunf:fix/compile-build-id-placeholder

Conversation

@edenfunf
Copy link
Copy Markdown
Contributor

@edenfunf edenfunf commented Apr 26, 2026

Description

Closes #957.

apm compile writes literal <!-- Build ID: __BUILD_ID__ --> to disk on every target except --single-agents. The placeholder substitution that constants.py promises ("a deterministic Build ID (content hash) is substituted post-generation") lives only in compile/cli.py:506-524 and is gated by if config.strategy == "distributed" and not single_agents -- only the legacy single-file path reaches it.

PR #468 (which closed #467) added cross-platform sort determinism but never extended substitution to distributed / claude / gemini formatters; its diff only touches sorted(...) calls. The verification claim "Both now produce <!-- Build ID: e316dee8ef7f -->" matches the --single-agents path only -- the default apm compile and every multi-target setup still write __BUILD_ID__ literally on main.

Affected paths producing literal __BUILD_ID__ on main (666925f):

Path Caller Output
apm compile (default distributed) agents_compiler._write_distributed_file AGENTS.md (root + every distributed sub-dir)
apm compile -t claude agents_compiler._compile_claude_md (inline) CLAUDE.md
apm compile -t gemini agents_compiler._compile_gemini_md (inline) GEMINI.md
apm compile --watch (single-file) agents_compiler._write_output_file AGENTS.md
compile_agents_md() (public API) agents_compiler._write_output_file AGENTS.md

Why a chokepoint, not a per-formatter patch

PR #468's structural pattern was "fix the same concern in 4 sibling files". The straightforward way to fix this issue would be to add a stabilize_build_id() call at each of the five write sites -- and inherit the same fragility. Any future target re-exposes the bug.

This PR introduces a single chokepoint:

src/apm_cli/compilation/output_writer.py::CompiledOutputWriter

All five compiled-output write sites route through writer.write(path, content). The writer:

  1. Calls stabilize_build_id(content) (extracted from compile/cli.py, algorithm preserved verbatim).
  2. Defensively asserts no BUILD_ID_PLACEHOLDER survives stabilization. A future code path that bypasses or breaks stabilization raises RuntimeError rather than silently emitting __BUILD_ID__.
  3. Creates parent dirs and writes atomically (replace-on-rename, so a crash mid-write cannot corrupt a pre-existing target).

Direct Path.write_text / open(...).write on compiled output is a contract violation by design -- adding a new target without using the writer fails loudly.

Implementation

File Change
src/apm_cli/compilation/build_id.py (new, 41 lines) stabilize_build_id(content) -- pure, idempotent helper; algorithm preserved from compile/cli.py:506-524.
src/apm_cli/compilation/output_writer.py (new, 49 lines) CompiledOutputWriter chokepoint with stabilize -> assert -> atomic-write contract. Error contract documented in module docstring (OSError propagates to callers; RuntimeError is intentionally NOT caught by callers' except OSError blocks).
src/apm_cli/utils/atomic_io.py (new, 39 lines) atomic_write_text(path, data) -- single canonical implementation. The pre-existing _atomic_write in commands/_helpers.py was a near-duplicate; consolidating here avoids reintroducing the same sibling-duplication anti-pattern this PR is fixing.
src/apm_cli/commands/_helpers.py _atomic_write reduced to a one-line alias of atomic_write_text. Backward-compatible with existing tests and any external caller.
src/apm_cli/commands/compile/cli.py (-22 lines) Removed inline hash logic (splitlines / hashlib.sha256 / line replacement); single-file path now calls CompiledOutputWriter().write(...).
src/apm_cli/compilation/agents_compiler.py (+11 / -14) All four compiled-output write sites (_write_distributed_file, claude inline, gemini inline, _write_output_file) route through the writer.

Type of change

  • Bug fix
  • New feature
  • Documentation
  • Maintenance / refactor

Testing

  • Tested locally
  • All existing tests pass -- 5511 unit tests, 1 skipped, 1 warning. Compile-related integration tests: 24 passed, 1 skipped. (8 pre-existing integration failures on main are unrelated to this change -- same set fails before and after.)
  • Added tests for new functionality

New test coverage

  • tests/unit/compilation/test_build_id.py (9 cases): replace placeholder / idempotent / deterministic / no-placeholder pass-through / different content yields different hash / hash excludes placeholder line itself / preserves trailing newline / empty content / only-placeholder line.
  • tests/unit/compilation/test_output_writer.py (6 cases): stabilizes before write / creates parent dirs / UTF-8 encoding / passes through clean content / raises on unresolvable placeholder (defense-in-depth) / atomic semantics -- pre-existing target untouched on os.replace failure.

Updated tests

  • tests/unit/compilation/test_agents_compiler_coverage.py::test_write_output_file_oserror_adds_error -- writer now auto-creates parent dirs, so the missing-parent failure mode is replaced with a monkeypatch os.replace raise to exercise the same OSError codepath.
  • tests/unit/test_command_helpers.py::test_cleans_up_temp_file_on_write_error -- leftover-file glob made prefix-agnostic so a future temp-file prefix change does not silent-pass.

End-to-end verification

On a fixture with apm.yml, src/main.py, and two instruction files (one with applyTo: "**/*.py", one with applyTo: "tests/**/*.py"):

Target Output Build ID
apm compile AGENTS.md (root) + src/AGENTS.md (distributed sub-dir) d659e6183d03 / a91eb45e9bbf
apm compile --single-agents AGENTS.md 3df8bca8558a
apm compile -t claude CLAUDE.md 46a595669d48
apm compile -t gemini GEMINI.md b25ecb1630a5
compile_agents_md() (public API) AGENTS.md 1cb5828b4dd3

Verified:

  • Zero __BUILD_ID__ literals remain in any output.
  • Re-running the same compile reproduces the same hash (deterministic).
  • Modifying an .instructions.md changes only the affected target's hash (drift detection works as documented in [BUG] apm compile produces different Build IDs on macOS vs Linux #467's user story).
  • Reverting the modification reproduces the original hash byte-for-byte (byte-stable rebuild).
  • git stash of this PR's changes reproduces the original bug -- all five paths write __BUILD_ID__ literally -- confirming the fix is necessary and sufficient.

apm compile wrote literal '<!-- Build ID: __BUILD_ID__ -->' to disk on
every target except the legacy --single-agents path. The substitution
that constants.py promises lived only in compile/cli.py:506-524 and was
gated by `if config.strategy == "distributed" and not single_agents`.
PR microsoft#468 (which closed microsoft#467) added cross-platform sort determinism but
never extended substitution to the distributed/claude/gemini paths;
its verification ran on --single-agents only.

Affected paths producing literal __BUILD_ID__ on main:
- apm compile (default distributed): AGENTS.md + every sub-dir
- apm compile -t claude: CLAUDE.md
- apm compile -t gemini: GEMINI.md
- apm compile --watch (single-file): AGENTS.md
- compile_agents_md() public API: AGENTS.md

Rather than repeat PR microsoft#468's sibling-by-sibling pattern (4 files, all
needing the same call site), this introduces a single chokepoint:

  compilation/output_writer.py::CompiledOutputWriter

All five compiled-output write sites route through writer.write(...).
The writer extracts stabilize_build_id() (preserved verbatim from
compile/cli.py), asserts no placeholder survives stabilization
(defense-in-depth: future bypass raises RuntimeError instead of
silently emitting __BUILD_ID__), creates parent dirs, and writes
atomically. Direct Path.write_text / open(...).write on compiled
output is a contract violation by design.

The atomic write primitive was duplicated in commands/_helpers.
Consolidated to utils/atomic_io.atomic_write_text; commands/_helpers
keeps a one-line alias for backward compatibility with existing tests.

Closes microsoft#957
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants