fix(compile): stabilize __BUILD_ID__ in all compiled outputs#959
Open
edenfunf wants to merge 2 commits intomicrosoft:mainfrom
Open
fix(compile): stabilize __BUILD_ID__ in all compiled outputs#959edenfunf wants to merge 2 commits intomicrosoft:mainfrom
edenfunf wants to merge 2 commits intomicrosoft:mainfrom
Conversation
apm compile wrote literal '<!-- Build ID: __BUILD_ID__ -->' to disk on every target except the legacy --single-agents path. The substitution that constants.py promises lived only in compile/cli.py:506-524 and was gated by `if config.strategy == "distributed" and not single_agents`. PR microsoft#468 (which closed microsoft#467) added cross-platform sort determinism but never extended substitution to the distributed/claude/gemini paths; its verification ran on --single-agents only. Affected paths producing literal __BUILD_ID__ on main: - apm compile (default distributed): AGENTS.md + every sub-dir - apm compile -t claude: CLAUDE.md - apm compile -t gemini: GEMINI.md - apm compile --watch (single-file): AGENTS.md - compile_agents_md() public API: AGENTS.md Rather than repeat PR microsoft#468's sibling-by-sibling pattern (4 files, all needing the same call site), this introduces a single chokepoint: compilation/output_writer.py::CompiledOutputWriter All five compiled-output write sites route through writer.write(...). The writer extracts stabilize_build_id() (preserved verbatim from compile/cli.py), asserts no placeholder survives stabilization (defense-in-depth: future bypass raises RuntimeError instead of silently emitting __BUILD_ID__), creates parent dirs, and writes atomically. Direct Path.write_text / open(...).write on compiled output is a contract violation by design. The atomic write primitive was duplicated in commands/_helpers. Consolidated to utils/atomic_io.atomic_write_text; commands/_helpers keeps a one-line alias for backward compatibility with existing tests. Closes microsoft#957
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Closes #957.
apm compilewrites literal<!-- Build ID: __BUILD_ID__ -->to disk on every target except--single-agents. The placeholder substitution thatconstants.pypromises ("a deterministic Build ID (content hash) is substituted post-generation") lives only incompile/cli.py:506-524and is gated byif config.strategy == "distributed" and not single_agents-- only the legacy single-file path reaches it.PR #468 (which closed #467) added cross-platform sort determinism but never extended substitution to distributed / claude / gemini formatters; its diff only touches
sorted(...)calls. The verification claim "Both now produce<!-- Build ID: e316dee8ef7f -->" matches the--single-agentspath only -- the defaultapm compileand every multi-target setup still write__BUILD_ID__literally onmain.Affected paths producing literal
__BUILD_ID__onmain(666925f):apm compile(default distributed)agents_compiler._write_distributed_fileAGENTS.md(root + every distributed sub-dir)apm compile -t claudeagents_compiler._compile_claude_md(inline)CLAUDE.mdapm compile -t geminiagents_compiler._compile_gemini_md(inline)GEMINI.mdapm compile --watch(single-file)agents_compiler._write_output_fileAGENTS.mdcompile_agents_md()(public API)agents_compiler._write_output_fileAGENTS.mdWhy a chokepoint, not a per-formatter patch
PR #468's structural pattern was "fix the same concern in 4 sibling files". The straightforward way to fix this issue would be to add a
stabilize_build_id()call at each of the five write sites -- and inherit the same fragility. Any future target re-exposes the bug.This PR introduces a single chokepoint:
All five compiled-output write sites route through
writer.write(path, content). The writer:stabilize_build_id(content)(extracted fromcompile/cli.py, algorithm preserved verbatim).BUILD_ID_PLACEHOLDERsurvives stabilization. A future code path that bypasses or breaks stabilization raisesRuntimeErrorrather than silently emitting__BUILD_ID__.Direct
Path.write_text/open(...).writeon compiled output is a contract violation by design -- adding a new target without using the writer fails loudly.Implementation
src/apm_cli/compilation/build_id.py(new, 41 lines)stabilize_build_id(content)-- pure, idempotent helper; algorithm preserved fromcompile/cli.py:506-524.src/apm_cli/compilation/output_writer.py(new, 49 lines)CompiledOutputWriterchokepoint with stabilize -> assert -> atomic-write contract. Error contract documented in module docstring (OSErrorpropagates to callers;RuntimeErroris intentionally NOT caught by callers'except OSErrorblocks).src/apm_cli/utils/atomic_io.py(new, 39 lines)atomic_write_text(path, data)-- single canonical implementation. The pre-existing_atomic_writeincommands/_helpers.pywas a near-duplicate; consolidating here avoids reintroducing the same sibling-duplication anti-pattern this PR is fixing.src/apm_cli/commands/_helpers.py_atomic_writereduced to a one-line alias ofatomic_write_text. Backward-compatible with existing tests and any external caller.src/apm_cli/commands/compile/cli.py(-22 lines)splitlines/hashlib.sha256/ line replacement); single-file path now callsCompiledOutputWriter().write(...).src/apm_cli/compilation/agents_compiler.py(+11 / -14)_write_distributed_file, claude inline, gemini inline,_write_output_file) route through the writer.Type of change
Testing
mainare unrelated to this change -- same set fails before and after.)New test coverage
tests/unit/compilation/test_build_id.py(9 cases): replace placeholder / idempotent / deterministic / no-placeholder pass-through / different content yields different hash / hash excludes placeholder line itself / preserves trailing newline / empty content / only-placeholder line.tests/unit/compilation/test_output_writer.py(6 cases): stabilizes before write / creates parent dirs / UTF-8 encoding / passes through clean content / raises on unresolvable placeholder (defense-in-depth) / atomic semantics -- pre-existing target untouched onos.replacefailure.Updated tests
tests/unit/compilation/test_agents_compiler_coverage.py::test_write_output_file_oserror_adds_error-- writer now auto-creates parent dirs, so the missing-parent failure mode is replaced with amonkeypatch os.replaceraise to exercise the sameOSErrorcodepath.tests/unit/test_command_helpers.py::test_cleans_up_temp_file_on_write_error-- leftover-file glob made prefix-agnostic so a future temp-file prefix change does not silent-pass.End-to-end verification
On a fixture with
apm.yml,src/main.py, and two instruction files (one withapplyTo: "**/*.py", one withapplyTo: "tests/**/*.py"):apm compileAGENTS.md(root) +src/AGENTS.md(distributed sub-dir)d659e6183d03/a91eb45e9bbfapm compile --single-agentsAGENTS.md3df8bca8558aapm compile -t claudeCLAUDE.md46a595669d48apm compile -t geminiGEMINI.mdb25ecb1630a5compile_agents_md()(public API)AGENTS.md1cb5828b4dd3Verified:
__BUILD_ID__literals remain in any output..instructions.mdchanges only the affected target's hash (drift detection works as documented in [BUG] apm compile produces different Build IDs on macOS vs Linux #467's user story).git stashof this PR's changes reproduces the original bug -- all five paths write__BUILD_ID__literally -- confirming the fix is necessary and sufficient.