Skip to content

feat(install, compile): --root DIR to redirect writes while sources resolve from $PWD#928

Draft
srid wants to merge 5 commits intomicrosoft:mainfrom
juspay:feat/install-compile-root-flag
Draft

feat(install, compile): --root DIR to redirect writes while sources resolve from $PWD#928
srid wants to merge 5 commits intomicrosoft:mainfrom
juspay:feat/install-compile-root-flag

Conversation

@srid
Copy link
Copy Markdown
Contributor

@srid srid commented Apr 25, 2026

Closes #888.

Downstream consumer validating this PR end-to-end: juspay/kolu#716 — collapses a ~60-line rsync + mkdir + cp scratch-staging workaround in their CI pipeline to a four-line apm install --root + apm compile --root sequence.

Why

apm install and apm compile always write relative to $PWD, which forces any scratch-directory workflow to physically stage every input file (manifest, .apm/, lockfile, project tree) before cd-ing in. The juspay/kolu CI pipeline hit this in a sharp way (#684): running apm install on the live worktree was briefly destroying .claude/ while a Claude Code session had an open file handle there. The workaround exists purely because there was no --target-style flag.

This PR adds that flag: apm install --root DIR and apm compile --root DIR. Writes go under DIR; sources (apm.yml, .apm/, local-path packages, the project tree used for distributed compile placement scoring) continue reading from $PWD. Semantics deliberately mirror pip install --target and npm install --prefix.

With both flags in place the kolu workaround collapses to:

[[ ! -e apm.lock.yaml ]] || cp apm.lock.yaml "$scratch/apm.lock.yaml"
apm install --root "$scratch"
apm compile --root "$scratch" --target codex,opencode

How

The implementation picks a chdir + source-root pin model over refactoring the long tail of Path.cwd() / os.getcwd() call sites (notably the MCP adapters in apm_cli.adapters.client.*, the OpenCode/Cursor/VSCode adapters, and mcp_integrator.py). A new context manager apm_cli.install.root_redirect.install_root_redirect does two things:

  1. os.chdir(root) so every site that hardcodes Path.cwd() auto-resolves to the deploy root.
  2. set_source_root_override($PWD) (process-global) so helpers that need the user's source tree (get_source_root, get_manifest_path) still point at the original working directory.

Both are restored in a finally so the context doesn't leak across invocations — important for test runners, REPL sessions, and embedded callers. compile_root_redirect is a one-line alias for the same implementation.

get_manifest_path is decoupled from get_apm_dir: the manifest is a source, so it follows get_source_root, while the lockfile / apm_modules/ / deploy dirs follow get_deploy_root. That distinction is the design crux — the issue body explicitly called out "sources continue resolving from $PWD" as the intended semantics.

Install side

  • InstallContext gains source_root (defaults to project_root for back-compat via __post_init__).
  • install/pipeline.py computes source_root = get_source_root(scope) alongside project_root = get_deploy_root(scope).
  • Four places that previously conflated read-from-$PWD with write-to-$PWD now split: _project_has_root_primitives (scans source root for .apm/), integrate_local_content's synthetic _local package (install_path = source_root), the dep resolver's apm.yml lookup, and local-path package resolution in phases/resolve.py + sources.py.

The resolver fix is its own commit because the tree-shaped symptom ("running apm install --root silently resolved zero direct dependencies") is worth a standalone reviewable change.

Compile side

AgentsCompiler and DistributedAgentsCompiler gain an optional source_dir parameter (defaults to base_dir for back-compat). base_dir keeps driving placement targets and write paths; source_dir drives primitive discovery, ContextOptimizer, and constitution lookup. DistributedAgentsCompiler._source_to_base translates placement-map keys from source-rooted to base-rooted so the AGENTS.md writes land at the deploy root even though the placement decisions were scored against the source tree. template_builder.build_conditional_sections picks up a matching source_dir parameter so <!-- Source: ... --> display paths render relative to $PWD — without this, scratch-compiled output diverges from in-place output by embedding absolute scratch paths.

What's included / what's not

Source resolution for apm.yml is tied to get_source_root, which currently means the original CWD. Adding a separate --manifest PATH flag for reading the manifest from an arbitrary location (called out as future work in the issue body) is deliberately not in this PR — the current design accommodates it cleanly by having get_manifest_path consult an additional override, but that's a future increment.

The compiler's source_dir is CLI-aware by necessity because distributed-compile placement scoring scans the project tree; a pre-resolved path from the caller would be fine except that ContextOptimizer is constructed inside DistributedAgentsCompiler.__init__ and already took a directory. Threading two paths through is the minimal change.

Verification

End-to-end validated against juspay/kolu#716: apm audit --ci passes 7/7 and every AGENTS.md output (root + packages/ + packages/client/src/ + packages/tests/features/ + packages/server/src/) is byte-identical between an apm install --root "$scratch" && apm compile --root "$scratch" run and the in-place equivalent.

Draft status

Drafted so the design can be sanity-checked before it leaves draft. In particular, feedback welcome on:

  • The chdir-vs-refactor tradeoff. The Path.cwd() long tail is real and this PR sidesteps it; the alternative is a sweep through the adapters to use scope helpers, which is a larger and probably-separate change. Is that acceptable?
  • The source_dir leak into the compiler classes. Principled ("the compiler already knew about roots") or a smell ("CLI-level concept leaking into a pure library")? I picked principled-enough-for-now.
  • Process-global override vs plumbed parameter. A global is ugly but the chdir is already global. Open to being talked out of it.

🤖 Generated with Claude Code

srid added 4 commits April 24, 2026 21:06
`apm install --root <dir>` writes apm_modules/, apm.lock.yaml, and
runtime deployment dirs (.claude/, .codex/, .agents/, .opencode/)
under <dir> while sources (apm.yml, .apm/, local-path packages)
continue resolving from $PWD.

Mirrors `pip install --target` and `npm install --prefix`. Useful
for scratch-dir verification (microsoft#684), bootstrap
scripts, and fixture generation -- closes microsoft#888.

Implementation
- core/scope.py: process-global deploy-root override
  (set_deploy_root_override) plus a separate get_source_root() that
  always resolves to $PWD.  get_manifest_path is decoupled from
  get_apm_dir so the manifest stays in $PWD even when writes
  redirect.
- install/context.py: InstallContext gains source_root, defaulting
  to project_root for back-compat.
- install/pipeline.py: passes source_root into the context.
  _project_has_root_primitives runs against source_root.
- install/services.py: integrate_local_content takes an optional
  source_root for the synthetic _local package's install_path.
- install/phases/{resolve,integrate}.py: thread source_root through
  to local-package resolution and local-content integration.
- commands/install.py: --root option, mutually exclusive with
  --global; sets the override at entry, clears it in finally so
  no global state leaks across invocations.
Replace the deploy-root override with a chdir-based redirect so every
existing site that hardcodes Path.cwd() / os.getcwd() (notably the
MCP adapters in apm_cli.adapters.client.*) automatically resolves
to the deploy root.  Sources keep reading from the original $PWD
via set_source_root_override.

Why chdir
- The previous override only covered scope helpers
  (get_deploy_root/get_apm_dir).  MCP adapters bypass those helpers
  and write directly to Path.cwd() / opencode.json, .vscode/mcp.json,
  .cursor/mcp.json, etc.  Refactoring the long tail of cwd/getcwd
  call-sites is more invasive than the chdir trick and would block
  this feature on a wider cleanup.

Implementation
- new module: apm_cli.install.root_redirect.install_root_redirect
  context manager (chdir + source-root pin, restored on exit).
- core/scope.py: replace deploy override with source-root override;
  get_manifest_path now derives from get_source_root.
- commands/install.py: bracket the handler body with the context
  manager (enter at top, __exit__ in finally).  The Click option
  block is compressed onto one line per option to recover budget.
- tests/unit/install/test_architecture_invariants.py: bump LOC budget
  1700 -> 1725 with rationale (mirrors the prior PR pattern); the
  pending --mcp extraction recovers this budget.
The dependency resolver loads ``project_root / "apm.yml"`` for the
root manifest.  Before this fix it received ``ctx.apm_dir`` -- which
under ``apm install --root`` points at the (typically empty) deploy
directory, causing the resolver to silently return zero direct deps.

Pass ``ctx.source_root`` instead so the manifest resolves from $PWD
even when writes redirect.  Falls back to ``ctx.apm_dir`` when no
override is active so the default path stays unchanged.
Sources continue resolving from $PWD; AGENTS.md / CLAUDE.md outputs
land under DIR.  Pairs with `apm install --root` for scratch-dir
verification (microsoft#888) -- the install + compile combo
needs no rsync, cd-gymnastics, or symlinks.

Implementation
- AgentsCompiler / DistributedAgentsCompiler take an optional
  source_dir parameter (defaults to base_dir for back-compat).
  base_dir continues to drive write paths and placement targets;
  source_dir is used for primitive discovery, project-tree scoring
  (ContextOptimizer), and constitution lookup.
- DistributedAgentsCompiler._source_to_base translates the placement
  map keys from source-dir-rooted to base-dir-rooted so writes land
  at the deploy root.
- template_builder.build_conditional_sections takes an optional
  source_dir to compute display-relative paths in `<!-- Source: -->`
  comments.  Without this, scratch-compiled output renders absolute
  source paths and diverges from in-place compile output.
- distributed_compiler's per-instruction source attribution renders
  paths relative to source_dir (was self.base_dir).
- commands/compile/cli.py: --root option, brackets the handler with
  compile_root_redirect (alias for install_root_redirect; identical
  chdir + source-root pin).  Source-root reads (apm.yml existence,
  .apm/ scan, find_constitution, target detection, AgentsCompiler)
  go through get_source_root.
- install/root_redirect.py: re-exports the helper as
  compile_root_redirect; the two commands share one implementation.
@srid
Copy link
Copy Markdown
Contributor Author

srid commented Apr 25, 2026

Pre-merge structural review (Hickey + Lowy)

I ran two structural-review passes on this branch — one Hickey-style ("Simple Made Easy", looking for complecting and missed deduplication) and one Lowy-style (volatility-based decomposition, after Parnas '72). Background on what each pass checks: kolu.dev/blog/hickey-lowy.

Posting both before maintainer review since the two passes disagree on the central design choice and I'd rather hear your opinion than commit to one direction blindly.

Where they converge

Correctness bug under --root (Hickey, must-fix). AgentsCompiler relativises primitive/instruction file paths against self.base_dir at three sites — agents_compiler.py:695, agents_compiler.py:707, agents_compiler.py:872. When --root is active, those source files live under source_dir and base_dir is the deploy root; the two don't share a common parent, so portable_relpath either produces a confusing ../../… path or falls back to absolute. DistributedAgentsCompiler:594 already correctly uses self.source_dir — the asymmetry is the smoking gun. Will fix regardless of which design direction you prefer.

Missing cross-reference docstring on get_lockfile_dir (Hickey, must-fix). core/scope.py:154 is the boundary where a future maintainer will pick the wrong root. get_manifest_path has the right docstring; get_lockfile_dir doesn't. Easy fix.

Where they disagree

The headline: chdir + process-global override vs. just-the-override-and-refactor-cwd-callsites.

  • Hickey considers the braid "deliberate, justified" (the docstring explains it; chdir handles the long tail of Path.cwd() callsites in MCP adapters; the override handles scope helpers — two mechanical parts of one coherent strategy). Optional finding only.

  • Lowy considers it the must-fix problem: "two separate mechanisms for what should be one concept", non-hermetic, and creates a coupling debt where future MCP-adapter additions silently won't see _SOURCE_ROOT_OVERRIDE. Recommends eliminating chdir() and routing every Path.cwd() callsite through get_source_root() or an explicit parameter — a sweep through apm_cli.adapters.client.* and mcp_integrator.py.

This is the tradeoff I called out in the PR description's "draft status" section; the two reviewers come down on opposite sides. I lean Hickey's way (smaller diff, contained surface) but Lowy's argument that the long tail is the volatility worth encapsulating is real. Maintainer call: which?

Lowy's other architectural pushes

  • InstallContext.source_root Optional default is backwards (should-fix). Resolve at the CLI boundary, make the field non-optional, drop the ctx.source_root or ctx.project_root fallback patterns in phases/resolve.py and pipeline.py. I think this is right and would do it.

  • Compiler source_dir leaks CLI concept into domain (should-fix). The compiler should receive pre-resolved paths and not know about --root. The _source_to_base() helper is "a guard against the feature, not a design". Counterpoint: ContextOptimizer is constructed inside DistributedAgentsCompiler.__init__ and already takes a directory, so threading a second one is the minimal change. Genuinely uncertain — open to either direction.

Hickey's other style pushes

  • Manual __enter__/__exit__ instead of with (should-fix). The handlers hold a _root_redirect reference and unwind in finally. with would be cleaner — install_root_redirect is already @contextmanager. Trivial to fix; the original avoidance was about not re-indenting the existing large try body, which is a weak reason.

  • Convention enforced by naming alone, not types (should-fix). The "use source_root to read, project_root to write" rule is convention; a phase can read the wrong field and the bug is silent in the no---root case. Suggested fix: convention note at the top of install/pipeline.py. Cheaper than a full type-level fix.

Optional / defer

  • compile_root_redirect = install_root_redirect alias should carry a comment explaining the intentional sharing and split conditions (Lowy).
  • A future --manifest <path> flag would re-thread cleanly if source_root becomes non-Optional (Lowy) — argument for landing the should-fix above.

Asking

The MUST-FIX correctness bug + docstring will get fixed regardless. For the rest, especially the chdir-vs-refactor call, I'd rather have your direction before I rewrite. Happy to push either way; the kolu downstream consumer (juspay/kolu#716) is in draft pending this PR's resolution and isn't blocking.

🤖 Reviews generated with structural-analysis subagents via Claude Code (background: kolu.dev/blog/hickey-lowy).

AgentsCompiler.validate_primitives and the verbose-output helper at
the foot of agents_compiler.py rendered primitive / instruction file
paths via `portable_relpath(file_path, self.base_dir)`.  Under
`apm compile --root`, source files live under `source_dir` while
`base_dir` is the deploy root -- the two don't share a common parent,
so the relpath either returned `../../...` chains or the absolute
path.  DistributedAgentsCompiler already does this right at
distributed_compiler.py:594; this closes the asymmetry.

Also documents the source-vs-deploy routing convention on
`get_lockfile_dir` so a future maintainer adding a new metadata
helper picks the right root without tracing callers.

Found by a Hickey-style structural review of the --root branch:
microsoft#928 (comment)
@srid
Copy link
Copy Markdown
Contributor Author

srid commented Apr 25, 2026

Pushed fix for the two MUST-FIX items (commit 19c20e1):

  • AgentsCompiler relpath fixvalidate_primitives + verbose-output helper now use self.source_dir for primitive/instruction paths, matching DistributedAgentsCompiler:594. Verified end-to-end: apm install --root "$scratch" && apm compile --root "$scratch" --target codex,opencode against the juspay/kolu worktree produces AGENTS.md + packages/AGENTS.md that are byte-identical to an in-place compile. 617 unit tests across tests/unit/{compilation,commands,install} all pass.

  • get_lockfile_dir docstring — now parallels get_manifest_path's, spelling out the source-vs-deploy routing convention so future metadata helpers pick the right root deliberately.

The should-fix / chdir-vs-refactor items still await your call. Draft stays draft until then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

enhancement: apm install --root <dir> to target a directory other than $PWD

1 participant