
Prototype repository-driven OCI package resolution#900

Draft
vlsi wants to merge 3 commits into microsoft:main from vlsi:feat/repository-driven-oci-resolution

Conversation


@vlsi vlsi commented Apr 24, 2026

Prototype repository-driven package resolution for APM dependencies.

This changes dependency handling so apm.yml can declare logical package requirements like owner/repo#version without embedding the transport type in the dependency itself. Resolution is performed through configured repositories, with initial support for Git and OCI backends.

How it works

  • OCI repositories are expected to contain a *.tar.gz artifact with raw APM package sources.
  • Dependency declarations in apm.yml do not depend on the repository that stores the dependency.
    This means the same package declaration can be resolved from different repositories based on client configuration.
  • Users configure repositories in a separate file, so the same apm.yml can be used unchanged in open source and in corporate environments.
  • The resolver tries configured repositories in order and records the resolved source in the lockfile.
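As a sketch, a user-side repositories file and a matching manifest entry might look like this. The field names (name, type, base, priority) follow the RepositoryDefinition model in this PR; the concrete values and the exact schema are illustrative, not normative:

```yaml
# ~/.apm/repositories.yml -- values are illustrative
repositories:
  - name: github
    type: git
    base: https://github.com
    priority: 100
  - name: ghcr
    type: oci
    base: ghcr.io/apm
    priority: 90
```

```yaml
# apm.yml -- the same declaration resolves through either backend
dependencies:
  apm:
    - microsoft/apm-standards#v1.2.0
```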

Notes

This is still a prototype.

Current limitations:

  • repository configuration is file-based
  • OCI fetch currently uses oras
  • OCI publish flow and media-type enforcement are not implemented yet
  • resolution remains intentionally simple and tries configured repositories in order

Type of change

  • Bug fix
  • New feature
  • Documentation
  • Maintenance / refactor

Testing

  • Tested locally
  • All existing tests pass
  • Added tests for new functionality (if applicable)

Copilot AI review requested due to automatic review settings April 24, 2026 11:21
@vlsi vlsi marked this pull request as draft April 24, 2026 11:22
Contributor

Copilot AI left a comment


Pull request overview

This PR prototypes a transport-agnostic dependency model where apm.yml can declare logical requirements (for example owner/repo#version) and have them resolved via a client-side repository configuration, with initial Git and OCI backends and lockfile annotations of the resolved source.

Changes:

  • Add PackageRequirement and parse manifest dependency entries into logical requirements when applicable.
  • Introduce repository configuration (~/.apm/repositories.yml) plus an ArtifactResolver that tries configured repositories (git/oci) in priority order.
  • Add an OCI fetch path using oras pull and basic tarball extraction/validation, plus lockfile fields to record resolved repository/source metadata.
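The OCI fetch path described above shells out to oras. A minimal sketch of such a wrapper follows; the exact flags and error handling in the PR are not shown here, so treat the command shape as an assumption:

```python
import subprocess


def pull_oci_artifact(reference: str, out_dir: str, run=subprocess.run):
    """Pull an OCI artifact into out_dir via the oras CLI (illustrative)."""
    # `-o` selects the output directory for pulled files in oras.
    cmd = ["oras", "pull", reference, "-o", out_dir]
    proc = run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        # Surface the CLI's stderr so registry/auth failures stay debuggable.
        raise RuntimeError(f"oras pull failed for {reference}: {proc.stderr.strip()}")
    return proc
```

Injecting run keeps the subprocess boundary testable without a live registry.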

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 9 comments.

File Description
tests/unit/test_repository_resolution.py Adds unit coverage for parsing requirements, default repositories, resolver annotation, and OCI tarball extraction.
src/apm_cli/repositories/resolver.py Implements repository-driven artifact resolution and fetch for git and oci backends.
src/apm_cli/repositories/config.py Adds repository config model, default repositories, and YAML loader for user config.
src/apm_cli/models/dependency/requirement.py Introduces the PackageRequirement model with parsing and install-path computation.
src/apm_cli/models/dependency/init.py Re-exports PackageRequirement alongside existing dependency models.
src/apm_cli/models/apm_package.py Updates manifest parsing to emit PackageRequirement and returns mixed APM dependency types.
src/apm_cli/deps/oci_registry.py Adds OCI pull and tar.gz extraction into an install directory, then validates as an APM package.
src/apm_cli/deps/lockfile.py Extends lockfile schema to store resolved repository/source_type/OCI metadata and alters unique-key behavior for OCI.
src/apm_cli/commands/uninstall/engine.py Extends uninstall parsing to accept PackageRequirement and adjusts some cleanup logic.
src/apm_cli/commands/install.py Integrates ArtifactResolver into install flow and adapts --only parsing and download logic for requirements.

Comment on lines +53 to +61
name, version = value, None
if "#" in value:
name, version = value.rsplit("#", 1)
version = version.strip() or None
name = name.strip().strip("/")
if not name or "/" not in name:
raise ValueError("Package requirement must be in 'owner/repo' form")
validate_path_segments(name, context="package name")
return cls(name=name, version=version)

Copilot AI Apr 24, 2026


PackageRequirement.parse() accepts strings where the first segment looks like a hostname (contains a '.'), e.g. "gitlab.com/group/repo". Other call sites (install --only parsing, uninstall parsing) attempt PackageRequirement.parse() before DependencyReference.parse(), so host-qualified git dependencies can be misclassified as requirements and end up with incorrect install/uninstall paths. Consider rejecting host-qualified inputs here (e.g., if '.' in first segment) to keep parsing consistent across the CLI.

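A minimal sketch of the guard this comment proposes, assuming the "'.' in the first path segment" heuristic (function name hypothetical):

```python
def looks_host_qualified(value: str) -> bool:
    """True when the first path segment looks like a hostname (contains a dot)."""
    first_segment = value.split("/", 1)[0]
    return "." in first_segment
```

PackageRequirement.parse() could raise ValueError when this returns True, so the fallback to DependencyReference.parse() handles host-qualified refs consistently.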
Comment on lines 195 to +200
for dep_str in updated_data.get("dependencies", {}).get("apm", []) or []:
try:
ref = _parse_dependency_entry(dep_str)
remaining_deps.add(ref.get_unique_key())
except (ValueError, TypeError, AttributeError, KeyError):
remaining_deps.add(dep_str)
remaining_deps.add(str(dep_str))

Copilot AI Apr 24, 2026


Uninstall orphan detection mixes lockfile keys (LockedDependency.get_unique_key()) with manifest keys (ref.get_unique_key()). For OCI deps, the lockfile key becomes "oci::", but PackageRequirement.get_unique_key() returns just "owner/repo". This can cause OCI dependencies to be considered orphans even when still present in apm.yml, leading to incorrect deletions. Use a consistent identity (e.g., compare by dep.repo_url + virtual_path/local_path) or translate manifest requirements into the same lockfile key format before subtracting.

Comment on lines +126 to +130
dep_ref = _parse_dependency_entry(package)
if not hasattr(dep_ref, "get_install_path"):
logger.progress(f"Removed {package} from manifest (OCI dependency)")
removed += 1
continue

Copilot AI Apr 24, 2026


The hasattr(dep_ref, "get_install_path") branch is effectively dead: both DependencyReference and PackageRequirement implement get_install_path(), so this path (and the "OCI dependency" message) is unlikely to be hit and may confuse future maintenance. Prefer explicit isinstance checks if a special case is needed, otherwise remove the branch.

Comment on lines +29 to +31
def repositories_config_path() -> Path:
"""Return the repository config path."""
return Path.home() / ".apm" / "repositories.yml"

Copilot AI Apr 24, 2026


This change introduces a new user-facing dependency form (logical requirements) and a new repositories config file (~/.apm/repositories.yml) with git/oci backends, but no documentation updates are included. Please update the Starlight docs (docs/src/content/docs/) and the apm-guide skill docs (packages/apm-guide/.apm/skills/apm-usage/, especially dependencies.md) to describe the new syntax, resolution order, and config file format.

Comment thread src/apm_cli/deps/lockfile.py Outdated
Comment on lines +49 to +53
"""Returns unique key for this dependency."""
if self.source_type == "oci":
registry = self.oci_registry or "default"
repository = self.oci_repository or self.repo_url
return f"oci:{registry}:{repository}"

Copilot AI Apr 24, 2026


LockedDependency.get_unique_key() now returns an "oci::" key for OCI deps. Several existing code paths look up lockfile entries by dep_ref.get_unique_key()/repo_url (e.g., prune/orphan detection uses "owner/repo" keys), so OCI entries will no longer be found and downstream cleanup/lookup logic can silently misbehave. Consider keeping lockfile keys stable (e.g., always key by repo_url/virtual/local identity) and storing OCI transport details in fields, or add an explicit secondary index/lookup that maps repo_url -> lockfile entry for OCI deps.

Suggested change
"""Returns unique key for this dependency."""
if self.source_type == "oci":
registry = self.oci_registry or "default"
repository = self.oci_repository or self.repo_url
return f"oci:{registry}:{repository}"
"""Return the stable lockfile lookup key for this dependency.
The key must remain aligned with dependency reference identity so that
lockfile lookups, prune/orphan detection, and reconciliation logic can
consistently find the same entry across dependency sources.
OCI transport details are stored in dedicated fields on this model and
must not change the canonical lockfile key shape.
"""

Comment on lines +98 to +100
assert req.resolved_source_type == "git"
assert req.resolved_repository == "github"
assert req.resolved_ref.endswith("microsoft/apm-standards.git#v1.2.0")

Copilot AI Apr 24, 2026


The test asserts on a URL-like string using a substring check (endswith). Our test suite convention (and CodeQL) requires URL assertions to parse the URL with urllib.parse.urlparse and compare specific components (e.g., hostname/path) rather than substring matching, otherwise CI can fail with py/incomplete-url-substring-sanitization.

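A component-based assertion in the style the convention requires might look like this (URL taken from the assertion above):

```python
from urllib.parse import urlparse

# Compare individual URL components instead of substring-matching the whole URL.
parsed = urlparse("https://github.com/microsoft/apm-standards.git#v1.2.0")
assert parsed.hostname == "github.com"
assert parsed.path == "/microsoft/apm-standards.git"
assert parsed.fragment == "v1.2.0"
```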
Comment on lines +123 to +136
with tarfile.open(archive_path, "r:gz") as tar:
for member in tar.getmembers():
if member.name.startswith("/") or ".." in Path(member.name).parts:
raise RuntimeError(
f"Refusing to extract unsafe archive entry: {member.name}"
)
if member.issym() or member.islnk():
raise RuntimeError(
f"Refusing to extract symlink/hardlink from OCI archive: {member.name}"
)
if sys.version_info >= (3, 12):
tar.extractall(target_path, filter="data")
else:
tar.extractall(target_path) # noqa: S202

Copilot AI Apr 24, 2026


Tar archive traversal protection misses backslash-based paths (e.g. "..\..\x"): Path(member.name).parts will treat backslashes as normal characters on POSIX, so the '..' check won't fire. Normalize member.name by replacing '\' with '/' (or use PurePosixPath) before validating parts, and ensure the final extracted path is contained within target_path (e.g., with ensure_path_within) before extracting.

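A sketch of the normalization this comment suggests, using PurePosixPath so backslash-separated traversal is caught on POSIX too (helper name hypothetical):

```python
from pathlib import PurePosixPath


def is_safe_member_name(name: str) -> bool:
    """Reject absolute paths and any '..' component, regardless of separator."""
    # Normalize backslashes first so "..\\..\\x" cannot slip past the parts check.
    normalized = name.replace("\\", "/")
    parts = PurePosixPath(normalized).parts
    return not normalized.startswith("/") and ".." not in parts
```

Pairing this with a final containment check on the resolved extraction path (the ensure_path_within approach mentioned above) covers both validation layers.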
Comment thread src/apm_cli/commands/install.py Outdated
Comment on lines +1401 to +1409
try:
ref = DependencyReference.parse(p)
ref = PackageRequirement.parse(p)
only_identities.add(ref.get_identity())
except Exception:
only_identities.add(p)
try:
ref = DependencyReference.parse(p)
only_identities.add(ref.get_identity())
except Exception:
only_identities.add(p)

Copilot AI Apr 24, 2026


--only parsing tries PackageRequirement.parse() first. PackageRequirement.parse() currently accepts host-qualified strings like "gitlab.com/group/repo", so those will be treated as logical requirements instead of DependencyReference, changing identity matching and potentially selecting the wrong package set. Use the same heuristic as manifest parsing (skip requirements when the first path segment contains a '.') or tighten PackageRequirement.parse() to reject host-qualified inputs so the fallback to DependencyReference.parse() works.

Comment thread src/apm_cli/commands/install.py Outdated
Comment on lines +1146 to +1167
def _matches_requested_package(dep_ref, only_packages) -> bool:
"""Return True when *dep_ref* matches one of the requested package tokens."""
if not only_packages:
return True

tokens = {
dep_ref.get_unique_key(),
dep_ref.get_display_name(),
str(dep_ref),
}
package = getattr(dep_ref, "package", None)
if package:
tokens.add(package)
tokens.add(package.split("/")[-1])
repo_url = getattr(dep_ref, "repo_url", None)
if repo_url:
tokens.add(repo_url)
tokens.add(repo_url.split("/")[-1])

return any(pkg in tokens for pkg in only_packages)



Copilot AI Apr 24, 2026


_matches_requested_package() is newly introduced but appears unused within install.py (no call sites found). If it's not needed yet, removing it avoids dead code; if it is intended for --only filtering, consider wiring it into the dependency selection logic and adding a focused unit test around the matching behavior.

Suggested change
def _matches_requested_package(dep_ref, only_packages) -> bool:
"""Return True when *dep_ref* matches one of the requested package tokens."""
if not only_packages:
return True
tokens = {
dep_ref.get_unique_key(),
dep_ref.get_display_name(),
str(dep_ref),
}
package = getattr(dep_ref, "package", None)
if package:
tokens.add(package)
tokens.add(package.split("/")[-1])
repo_url = getattr(dep_ref, "repo_url", None)
if repo_url:
tokens.add(repo_url)
tokens.add(repo_url.split("/")[-1])
return any(pkg in tokens for pkg in only_packages)

# Conflicts:
#	src/apm_cli/commands/install.py
#	src/apm_cli/deps/lockfile.py
#	src/apm_cli/models/apm_package.py
#	src/apm_cli/models/dependency/__init__.py
@vlsi vlsi force-pushed the feat/repository-driven-oci-resolution branch from cb9db25 to 4d37555 Compare April 24, 2026 11:48
@danielmeppiel danielmeppiel added panel-review Trigger the apm-review-panel gh-aw workflow and removed panel-review Trigger the apm-review-panel gh-aw workflow labels Apr 24, 2026
@github-actions

Python Architect -- PR #900 Review

Architectural Verdict

This PR introduces a well-factored new resolution axis (repository-driven OCI + git) with clean value objects (PackageRequirement, RepositoryDefinition, ResolvedArtifact) and a proper Chain of Responsibility resolver. However, the PR has two blocking issues: (1) the _should_parse_as_requirement() heuristic silently steals three-segment shorthand refs (Azure DevOps org/project/repo) and virtual package paths (user/repo/collections/test) away from DependencyReference, introducing a backwards-incompatible parsing regression for existing users; and (2) the lockfile from_dependency_ref fast-path maps semantically wrong fields into OCI columns (oci_registry receives a logical repository name like "ghcr" instead of a registry hostname, oci_repository receives the logical name instead of the OCI path). Both must be fixed before merge.


1. OO / class diagram (mermaid)

classDiagram
    direction LR

    class DependencyReference {
        <<ValueObject>>
        +repo_url str
        +host str
        +reference str
        +is_virtual bool
        +virtual_path str
        +parse(raw) DependencyReference
        +get_install_path(dir) Path
        +get_unique_key() str
    }

    class PackageRequirement {
        <<ValueObject>>
        +name str
        +version str
        +repository str
        +alias str
        +resolved_source_type str
        +resolved_repository str
        +resolved_ref str
        +resolved_digest str
        +dependency_kind str
        +repo_url property
        +reference property
        +parse(raw) PackageRequirement
        +get_install_path(dir) Path
        +get_unique_key() str
    }

    class RepositoryDefinition {
        <<ValueObject / frozen>>
        +name str
        +type str
        +base str
        +priority int
    }

    class ResolvedArtifact {
        <<ValueObject / frozen>>
        +repository_name str
        +source_type str
        +locator str
        +host str
    }

    class ArtifactResolver {
        <<Chain of Responsibility>>
        +git_downloader GitHubPackageDownloader
        +oci_client OCIRegistryClient
        +fetch_requirement(req, path)
        -_resolve_artifact(req, repo) ResolvedArtifact
        -_fetch_artifact(req, artifact, path)
    }

    class OCIRegistryClient {
        <<Adapter>>
        +pull_package(ref, path, req) PackageInfo
        -_pull_artifact(ref, dir)
        -_locate_package_archive(dir) Path
        -_extract_package_archive(arc, path) Path
    }

    class GitHubPackageDownloader {
        +download_package(dep_ref, path) PackageInfo
    }

    class LockedDependency {
        <<ValueObject>>
        +repo_url str
        +source_type str
        +repository_name str
        +oci_registry str
        +oci_repository str
        +oci_tag str
        +oci_digest str
        +from_dependency_ref(dep_ref, ...) LockedDependency
        +to_dependency_ref() DependencyReference
    }

    class APMPackage {
        +dependencies dict
        +from_apm_yml(path) APMPackage
        +get_apm_dependencies() list
    }

    ArtifactResolver *-- GitHubPackageDownloader : delegates git
    ArtifactResolver *-- OCIRegistryClient : delegates oci
    ArtifactResolver ..> RepositoryDefinition : iterates by priority
    ArtifactResolver ..> ResolvedArtifact : produces
    ArtifactResolver ..> PackageRequirement : reads and mutates
    OCIRegistryClient ..> PackageInfo : returns
    APMPackage ..> PackageRequirement : parses into
    APMPackage ..> DependencyReference : parses into
    LockedDependency ..> PackageRequirement : consumes via from_dependency_ref
    LockedDependency ..> DependencyReference : consumes and produces

    note for ArtifactResolver "Chain of Responsibility:\nrepo1(priority 100) -> repo2(90) -> repo3(80)\nFirst successful fetch wins."
    note for PackageRequirement "Mutable: ArtifactResolver sets resolved_* fields in-place"
    note for OCIRegistryClient "Adapter: wraps subprocess oras CLI"

    class PackageRequirement:::touched
    class RepositoryDefinition:::touched
    class ResolvedArtifact:::touched
    class ArtifactResolver:::touched
    class OCIRegistryClient:::touched
    class LockedDependency:::touched
    class APMPackage:::touched
    classDef touched fill:#fff3b0,stroke:#d47600

2. Execution flow diagram (mermaid)

flowchart TD
    A["User: apm install<br/>(manifest has 'microsoft/apm-standards#v1.2.0')"]
    B["APMPackage.from_apm_yml()<br/>src/apm_cli/models/apm_package.py:131"]
    C{"_should_parse_as_requirement()<br/>src/apm_cli/models/apm_package.py:317"}
    D["PackageRequirement.parse(raw)<br/>src/apm_cli/models/dependency/requirement.py:48"]
    E["DependencyReference.parse(raw)<br/>src/apm_cli/models/dependency/reference.py:915"]
    F["--- WIRING GAP ---<br/>install pipeline does NOT call ArtifactResolver<br/>src/apm_cli/install/ has no PackageRequirement handling"]
    G["ArtifactResolver.fetch_requirement(req, path)<br/>src/apm_cli/repositories/resolver.py:36"]
    H["load_repositories()<br/>[I/O] reads ~/.apm/repositories.yml<br/>src/apm_cli/repositories/config.py:34"]
    I{"repository.type?"}
    J["_resolve_artifact() builds git locator<br/>e.g. https://github.com/microsoft/apm-standards.git#v1.2.0"]
    K["_resolve_artifact() builds OCI locator<br/>e.g. ghcr.io/apm/acme/security-pack:1.0.0"]
    L["[NET] git_downloader.download_package(dep_ref, path)<br/>src/apm_cli/deps/github_downloader.py:179"]
    M["[EXEC][NET] oci_client._pull_artifact(ref, dir)<br/>subprocess: oras pull<br/>src/apm_cli/deps/oci_registry.py:91"]
    N["[FS] _extract_package_archive(arc, path)<br/>tarfile.extractall<br/>src/apm_cli/deps/oci_registry.py:121"]
    O["validate_apm_package(root)<br/>src/apm_cli/models/validation.py:251"]
    P["requirement.resolved_source_type = 'git' or 'oci'<br/>(in-place mutation)<br/>src/apm_cli/repositories/resolver.py:98-109"]
    Q["LockedDependency.from_dependency_ref(req, ...)<br/>fast-path: dependency_kind == 'package_requirement'<br/>src/apm_cli/deps/lockfile.py:208"]
    R["[FS] lockfile.save()<br/>writes apm.lock.yaml"]

    A --> B
    B --> C
    C -- "no '://', no 'git@', no '//',<br/>no '.' in first segment" --> D
    C -- "otherwise" --> E
    D --> F
    E --> F
    F -. "prototype: not wired yet" .-> G
    G --> H
    H --> I
    I -- "git" --> J
    I -- "oci" --> K
    J --> L
    K --> M
    M --> N
    N --> O
    L --> P
    O --> P
    P --> Q
    Q --> R

    style F fill:#ffcccc,stroke:#cc0000

3. Design patterns

Design patterns

  • Used in this PR: Chain of Responsibility -- ArtifactResolver.fetch_requirement() iterates RepositoryDefinition list sorted by priority (resolver.py:39-47), first successful fetch wins, annotated in the class diagram.
  • Used in this PR: Adapter -- OCIRegistryClient wraps the external oras subprocess behind a Python method interface (oci_registry.py:34), annotated in the class diagram.
  • Used in this PR: Dataclass-as-value-object -- RepositoryDefinition (frozen=True), ResolvedArtifact (frozen=True), OCIPullResult (frozen=True) are immutable value objects, annotated in the class diagram.
  • Pragmatic suggestion: Protocol or ABC -- PackageRequirement and DependencyReference are structurally compatible (both have repo_url, reference, get_install_path, get_unique_key) but share no formal contract. A DependencySpec Protocol with those four members would let the lockfile, uninstall engine, and install pipeline accept Union[DependencyReference, PackageRequirement] without getattr/dependency_kind string-tag dispatch, eliminating the duck-typing fragility throughout lockfile.py:208-224 and engine.py:167.
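The suggested DependencySpec contract could be sketched like this; both the protocol and the fake implementation are illustrative, not code from the PR:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Protocol


class DependencySpec(Protocol):
    """Hypothetical shared contract for DependencyReference and PackageRequirement."""

    repo_url: str
    reference: Optional[str]

    def get_install_path(self, base_dir: Path) -> Path: ...
    def get_unique_key(self) -> str: ...


def lock_key(dep: DependencySpec) -> str:
    # Callers dispatch through the shared contract instead of getattr checks
    # or dependency_kind string tags.
    return dep.get_unique_key()


@dataclass
class FakeRequirement:
    repo_url: str
    reference: Optional[str] = None

    def get_install_path(self, base_dir: Path) -> Path:
        return base_dir / self.repo_url

    def get_unique_key(self) -> str:
        return self.repo_url
```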

4. Specific Issues

BLOCKING

# File:Line Severity Description
1 src/apm_cli/models/apm_package.py:317-336 (_should_parse_as_requirement) Critical / regression The heuristic only checks for "." in the first path segment to distinguish host-qualified refs. This means all bare 3+ segment shorthand refs are now silently parsed as PackageRequirement instead of DependencyReference. Specifically: (a) Azure DevOps shorthand org/project/repo -- currently parsed by DependencyReference which extracts ado_organization, ado_project, and builds the ADO URL -- will now become a PackageRequirement with name="org/project/repo" and no ADO metadata; (b) Virtual package paths like user/repo/collections/test or user/repo/subdir -- currently parsed as is_virtual=True with virtual_path set -- will become a flat PackageRequirement. The // guard only checks value.startswith("//"), not "//" in value, so user/repo//path/to/file also passes through. Fix: the heuristic must count path segments and reject 3+ segments, check for "//" in value (not just startswith), and consult is_supported_git_host() for the default host's ADO check.
2 src/apm_cli/deps/lockfile.py:220-222 (from_dependency_ref fast-path) High / semantic error oci_registry is set to dep_ref.resolved_repository which is the logical repository name (e.g. "ghcr") -- not the OCI registry hostname (ghcr.io). oci_repository is set to dep_ref.repo_url which is the logical package name ("acme/security-pack") -- not the full OCI repository path (ghcr.io/apm/acme/security-pack). The lockfile comment on line 29 says oci_registry is "Registry alias used in apm.yml/config" but the comment on line 30 says oci_repository is "OCI repository path, e.g. acme/security-pack". The test on line 64-67 of test_repository_resolution.py asserts round_trip.oci_repository == "acme/security-pack" and round_trip.oci_registry == "ghcr" -- meaning the test is wrong along with the code (it tests the bug, not the intent). Either the lockfile field semantics must be precisely defined (document that oci_registry is a logical alias, not a hostname), or the mapping must be fixed to store the actual registry hostname from the ResolvedArtifact.locator.
3 src/apm_cli/repositories/ (directory) High / import failure risk No __init__.py exists. The import from ..repositories.config import ... in resolver.py:12 works today only because the project uses namespace packages or --find-links / sys.path manipulation. If the project ever sets packages = find: with where = src in pyproject.toml (standard setuptools), or if any tool scans for explicit packages, this directory will be invisible. Add __init__.py with public exports.

NON-BLOCKING (should fix before GA, acceptable in prototype)

# File:Line Severity Description
4 src/apm_cli/repositories/resolver.py:98 (_fetch_artifact mutates PackageRequirement) Medium / design smell PackageRequirement is a non-frozen dataclass. _fetch_artifact mutates it in-place (sets resolved_source_type, resolved_repository, resolved_ref, resolved_host, resolved_digest). This couples resolution side-effects to the model object and would be unsafe in any parallel resolution scenario. The ResolvedArtifact frozen dataclass is already produced -- it should carry all resolved metadata downstream instead of backpatching the requirement. Alternatively, freeze PackageRequirement and return a ResolvedPackage composite.
5 src/apm_cli/deps/oci_registry.py:98-100 Medium RuntimeError is raised for missing oras CLI. This should be a dedicated MissingToolError or at minimum a descriptive subclass, so callers can distinguish "tool not installed" from "pull failed" without string-matching the message. The existing codebase uses domain-specific errors (e.g. PolicyViolationError, PathTraversalError).
6 src/apm_cli/deps/oci_registry.py:79 Medium / incorrect typing ref_type=GitReferenceType.BRANCH is hardcoded for OCI artifacts. OCI artifacts have no git reference type -- this is a semantic lie in the ResolvedReference. Either add an OCI variant to the enum or use None.
7 src/apm_cli/deps/lockfile.py:250-264 (to_dependency_ref) Medium / incomplete round-trip to_dependency_ref() always returns a DependencyReference even for OCI-sourced locked deps (source_type == "oci"). This means any code path that calls locked.to_dependency_ref().get_install_path() on an OCI dep will compute a git-style install path from repo_url (which is the logical name), and the ADO/virtual/host logic in DependencyReference will misinterpret it. The lockfile's get_installed_package_paths() (line 436) calls this unconditionally.
8 src/apm_cli/commands/uninstall/engine.py:78-79 Low / defensive but wrong if not hasattr(dep_ref, "get_install_path"): continue -- PackageRequirement does have get_install_path, so this guard is dead code today. But it reveals the real concern: the uninstall engine does not know how to locate OCI packages on disk. If the intent is to skip OCI deps, an explicit isinstance or source_type check would be clearer and safer than duck-typing on method existence.
9 src/apm_cli/commands/uninstall/engine.py:167 Low getattr(ref, "repo_url", getattr(ref, "package", str(pkg))) -- PackageRequirement has no package attribute. The inner fallback getattr(ref, "package", ...) is dead and misleading. Should be getattr(ref, "repo_url", str(pkg)).
10 src/apm_cli/repositories/resolver.py:9 Low / naming GitHubPackageDownloader is now used for any git host (GitHub, GitLab, etc.) via repository config. The class name implies GitHub-only. Not blocking, but a rename to GitPackageDownloader (or an alias) would prevent confusion as this grows.
11 src/apm_cli/repositories/config.py:42 Low / silent failure load_repositories() catches all exceptions from YAML parsing and silently falls back to defaults. A malformed ~/.apm/repositories.yml will be silently ignored. At minimum, log a warning so users know their config is not being used.
12 Install pipeline wiring gap Info / prototype scope ArtifactResolver is defined but never called from the install flow (src/apm_cli/install/ has zero references to PackageRequirement or ArtifactResolver). The manifest parser produces PackageRequirement objects, but the install phases only handle DependencyReference. This means apm install on a manifest with logical requirements will fail at the download step. Acceptable for a prototype PR if documented, but the wiring must happen before any user can exercise this path.

Generated by PR Review Panel for issue #900

@github-actions

CLI Logging / Output UX Review

Reviewed from the perspective of: Does the new code give humans (default) and agents (--verbose) the right signals at the right time?

Overall the prototype is well-structured, but its output story has significant gaps. The three new library modules (oci_registry.py, resolver.py, config.py) operate in complete silence -- no CommandLogger integration, no verbose callbacks, no warnings on partial failures. This is at odds with the established APM pattern where every network call, every fallback decision, and every skipped item is observable.


[F1] MEDIUM -- OCI pull is a black hole for output

src/apm_cli/deps/oci_registry.py:88-104

_pull_artifact() runs subprocess.run(["oras", "pull", ...]) with capture_output=True but:

  • On success: nothing is logged. The user sees no progress indicator for what could be a multi-second network download.
  • On failure: proc.stderr is stuffed into a RuntimeError string, but never surfaced through CommandLogger. In --verbose mode, an agent debugging auth or registry issues cannot see what ORAS actually said.
  • The class accepts no logger or verbose_callback parameter.

Compare with the established pattern in GitHubPackageDownloader.download_package() which accepts progress_task_id, progress_obj, and verbose_callback parameters for exactly this purpose. OCI pulls deserve the same treatment.

Suggest: Accept an optional verbose_callback (or a CommandLogger reference). Log the ORAS command args at verbose level before the call, and stderr/stdout at verbose level after. On failure, still raise, but emit a logger.verbose_detail() with the raw ORAS output first.


[F2] MEDIUM -- Resolver swallows per-repository failures silently

src/apm_cli/repositories/resolver.py:42-47

try:
    artifact = self._resolve_artifact(requirement, repository)
    return self._fetch_artifact(requirement, artifact, target_path)
except Exception as exc:
    last_error = exc
    continue
When a requirement is tried against multiple repositories, each failed attempt is silently discarded. Only the last error survives in the final RuntimeError. This violates the progressive-disclosure rule -- in --verbose mode, the user (or agent) should see:

  [i] Trying repository 'github' for owner/repo...
  [!] github: connection refused
  [i] Trying repository 'ghcr' for owner/repo...

Without this, debugging "why did resolution fail?" requires guesswork.

Suggest: Accept an optional logger: CommandLogger in the constructor. Log each attempt at verbose_detail level, and each per-repo failure at warning level.
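A sketch of the attempt-level visibility suggested here; fetch and warn stand in for the real fetch path and a CommandLogger method:

```python
def fetch_with_visibility(repositories, fetch, warn):
    """Try each repository in order, reporting every failed attempt."""
    errors = []
    for repo in repositories:
        warn(f"Trying repository '{repo}'...")
        try:
            return fetch(repo)
        except Exception as exc:
            # Report the per-repository failure instead of discarding it.
            warn(f"{repo}: {exc}")
            errors.append(f"{repo}: {exc}")
    raise RuntimeError("resolution failed in all repositories: " + "; ".join(errors))
```

The aggregated error string also improves the non-verbose failure message, since all attempts survive into the final exception.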


[F3] HIGH -- Silent config fallback hides broken user config

src/apm_cli/repositories/config.py:40-43

try:
    raw = yaml.safe_load(path.read_text(encoding="utf-8")) or {}
except Exception:
    return list(DEFAULT_REPOSITORIES)

If a user writes ~/.apm/repositories.yml with a YAML syntax error, or invalid entries (lines 51-62 also silently skip malformed entries), they get the default GitHub/GitLab/GHCR repositories with zero indication that their config was ignored. This is a "So What?" failure -- the user can act on it (fix their YAML), but never learns they need to.

Similarly, lines 46-47 silently fall back if repositories key is not a list, and lines 57-62 silently skip entries with missing/invalid fields. All of these are user errors that deserve a warning.

Suggest: This is a pure-data loader, so a full CommandLogger dependency is heavy. Instead, consider either:

  • Returning a tuple (repos, warnings: list[str]) so the caller can route warnings through its logger, or
  • Accepting an optional warn_fn: Callable[[str], None] parameter (the plugin_parser.py dual-channel pattern at src/apm_cli/deps/plugin_parser.py:28-42 is a good precedent).
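A sketch of the warn_fn variant, operating on the already-parsed YAML document (the function name and field handling are illustrative, modeled on the config shape described above):

```python
def parse_repositories(raw, defaults, warn_fn=lambda msg: None):
    # Dual-channel sketch: return the parsed repositories, and route every
    # malformed-config problem through warn_fn instead of dropping it silently.
    # `raw` is the already-parsed YAML document (the caller handles parse errors).
    if not isinstance(raw, dict) or not isinstance(raw.get("repositories"), list):
        warn_fn("repositories.yml: 'repositories' must be a list; using defaults")
        return list(defaults)
    repos = []
    for i, entry in enumerate(raw["repositories"]):
        if not isinstance(entry, dict) or not all(
            isinstance(entry.get(k), str) and entry.get(k)
            for k in ("name", "type", "base")
        ):
            warn_fn(f"repositories.yml: entry {i} is malformed; skipping")
            continue
        repos.append(entry)
    return repos or list(defaults)
```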

[F4] LOW -- "Removed from manifest (OCI dependency)" is misleading

src/apm_cli/commands/uninstall/engine.py:128

if not hasattr(dep_ref, "get_install_path"):
    logger.progress(f"Removed {package} from manifest (OCI dependency)")
    removed += 1
    continue

Two issues:

  1. The message claims an action that hasn't happened here. At this point in the flow, _remove_packages_from_disk() is only handling disk cleanup. The actual manifest (apm.yml) write happens earlier in cli.py:112-116. What this code really means is "no disk path to clean up for this OCI dep -- skipping disk removal." The current message misleads the user into thinking the manifest was just modified.

  2. Symbol mismatch. progress() defaults to symbol="info" ([i]), but the message describes a completed action. Nearby line 146 has the same pattern, so this is pre-existing, but the new OCI path is a good place to fix it.

Suggest: Reword to something like logger.progress(f"Skipped disk cleanup for {package} (OCI-only, no local files)", symbol="check") or, if the intent is to confirm removal, use logger.success() to match the green/completed semantic that download_complete uses in the install flow.


[F5] INFO -- No install-pipeline integration means no progress story yet

src/apm_cli/repositories/resolver.py (general)

ArtifactResolver is defined but not wired into the install command (cli.py, install phases). This means there is no place today where OCI resolution progress would be shown to the user. That's fine for a prototype, but worth flagging: when this gets wired in, it will need InstallLogger integration (at minimum download_start / download_complete / download_failed calls around the OCI fetch path, matching the existing git download flow).

This is not a blocker -- just a reminder that the logging contract is incomplete until the install integration lands.


Summary

| ID | Severity | File | Issue |
|----|----------|------|-------|
| F1 | MEDIUM | oci_registry.py:88-104 | OCI pull has no progress/verbose output |
| F2 | MEDIUM | resolver.py:42-47 | Per-repo failures silently swallowed |
| F3 | HIGH | config.py:40-47, 51-62 | Broken user config silently ignored |
| F4 | LOW | engine.py:128 | Misleading "Removed from manifest" message |
| F5 | INFO | resolver.py (general) | Install pipeline not wired -- logging contract incomplete |

F3 is the highest-priority fix because it violates a core principle: if the user wrote a config file and it is being ignored, they must be told. The rest are prototype gaps that should be resolved before the feature exits prototype status.

Generated by PR Review Panel for issue #900

@github-actions

Supply Chain Security -- Threat Model Assessment

PR #900 introduces a repository-driven resolution layer with OCI transport support. This review evaluates the new attack surfaces against APM's documented security model (docs/src/content/docs/enterprise/security.md) and lockfile specification (docs/src/content/docs/reference/lockfile-spec.md).

Verdict: HOLD -- 2 critical, 3 high, 3 medium findings must be resolved before merge.


1. Tarball Extraction Safety (tarslip, symlink, hardlink)

Severity: HIGH
File: src/apm_cli/deps/oci_registry.py:120-136

The pre-validation loop at lines 124-132 checks for absolute paths, .. components, and symlink/hardlink entries. This is good. However, there is a TOCTOU (time-of-check-time-of-use) gap on Python < 3.12:

# Lines 124-136
for member in tar.getmembers():       # CHECK
    if member.name.startswith("/") or ".." in Path(member.name).parts:
        raise RuntimeError(...)
    if member.issym() or member.islnk():
        raise RuntimeError(...)
if sys.version_info >= (3, 12):
    tar.extractall(target_path, filter="data")  # SAFE
else:
    tar.extractall(target_path)  # USE -- no filter, no members= restriction

Issue A -- Missing members= parameter. On Python < 3.12, tar.extractall() is called without passing the validated members list. A crafted tarball could contain a member whose name property returns a safe value on first read (getmembers()) but whose underlying header stores a different path. This is an unlikely but documented class of tar attacks. The fix is to pass members=tar.getmembers() (the already-validated list) to extractall() on both branches, or better, build a filtered list:

safe_members = []
for member in tar.getmembers():
    # ... validate ...
    safe_members.append(member)
tar.extractall(target_path, members=safe_members)

Issue B -- ad-hoc .. check instead of using path_security. The check ".." in Path(member.name).parts is functionally correct for this case, but APM's security contract requires path validation to go through validate_path_segments (see src/apm_cli/utils/path_security.py). Using the ad-hoc check here creates a second code path that must be maintained in sync. It also does not use ensure_path_within to verify the resolved extraction target stays inside target_path.

Issue C -- No size/count limits. A zip-bomb-style tarball with many entries or enormous files would exhaust disk. Consider adding max-entry-count and max-total-bytes guards.

Recommendation:

  1. Pass validated safe_members to extractall() on both code paths.
  2. After extraction, call ensure_path_within(extracted_file, target_path) on each extracted file to catch any bypass.
  3. Use validate_path_segments(member.name, context="archive entry") instead of the ad-hoc check.
  4. Add size/count guards (e.g., 10000 files, 500 MB total).
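Putting the four recommendations together, a self-contained sketch (the limits are illustrative, and a real implementation should route the path checks through the project's path_security helpers rather than the inline checks shown here):

```python
import sys
import tarfile
from pathlib import Path

MAX_ENTRIES = 10_000                 # illustrative guard values
MAX_TOTAL_BYTES = 500 * 1024 * 1024

def safe_extract(tar_path, target_path):
    # Validate every member once, then extract exactly that validated list,
    # bounding entry count and total declared size along the way.
    target = Path(target_path).resolve()
    with tarfile.open(tar_path) as tar:
        safe_members, total = [], 0
        for member in tar.getmembers():
            if member.name.startswith("/") or ".." in Path(member.name).parts:
                raise RuntimeError(f"Unsafe path in archive: {member.name}")
            if member.issym() or member.islnk():
                raise RuntimeError(f"Link entry in archive: {member.name}")
            total += member.size
            safe_members.append(member)
            if len(safe_members) > MAX_ENTRIES or total > MAX_TOTAL_BYTES:
                raise RuntimeError("Archive exceeds size/count limits")
        if sys.version_info >= (3, 12):
            tar.extractall(target, members=safe_members, filter="data")
        else:
            tar.extractall(target, members=safe_members)
    # Belt and braces: verify every extracted path stayed inside the target.
    for p in target.rglob("*"):
        if not p.resolve().is_relative_to(target):
            raise RuntimeError(f"Extraction escaped target: {p}")
```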

2. Locator Injection via name / version / base

Severity: MEDIUM
File: src/apm_cli/repositories/resolver.py:62-84

Locators are built by string interpolation:

# git:
locator = f"{repository.base.rstrip('/')}/{requirement.name}.git"
# oci:
locator = f"{repository.base.rstrip('/')}/{requirement.name}:{requirement.version}"

2a -- name is validated. PackageRequirement.parse() calls validate_path_segments(name, context="package name") at line 60 of requirement.py. This rejects .. and . segments. The from_dict() path also validates at line 69. Good.

2b -- version is NOT validated for path-safety characters. The version field is only checked for "non-empty string" (requirement.py:72-75). It is then interpolated directly into an OCI reference at resolver.py:79:

locator = f".../{requirement.name}:{requirement.version}"

An OCI tag is syntactically constrained by the OCI spec to [a-zA-Z0-9_.-]{1,128}, but APM does not enforce this. A malicious apm.yml could set version: "latest --config /etc/passwd" or include shell metacharacters. While subprocess.run() with a list (not shell=True) prevents shell injection (see finding 7), the unsanitized version string flows into the oras CLI as a positional argument and could be interpreted by oras itself as flags if it starts with --.

Recommendation: Add an OCI-tag-format validator on the version field when the resolved source type is OCI. At minimum, reject whitespace and -- prefixes. Ideally enforce ^[a-zA-Z0-9_.-]{1,128}$.
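A minimal validator sketch; the pattern below follows the commonly cited OCI distribution-spec tag grammar (a leading [a-zA-Z0-9_], then up to 127 of [a-zA-Z0-9._-]), which is slightly stricter than the regex above because it also rejects a leading . or -:

```python
import re

# First character: alphanumeric or underscore; then up to 127 of [A-Za-z0-9._-].
_OCI_TAG_RE = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9._-]{0,127}")

def validate_oci_tag(version):
    # Reject anything that is not a syntactically valid OCI tag, including
    # whitespace and flag-like values such as "--config".
    if not _OCI_TAG_RE.fullmatch(version):
        raise ValueError(f"Invalid OCI tag: {version!r}")
    return version
```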

2c -- base comes from user config and is not validated (see finding 3).


3. User-Controlled Repository Config as Attack Surface

Severity: MEDIUM
File: src/apm_cli/repositories/config.py:34-77

~/.apm/repositories.yml is loaded with yaml.safe_load (safe against deserialization attacks). The type field is restricted to ("git", "oci"). However:

3a -- No URL scheme validation on base. The base field is accepted as any non-empty string (line 62-63). A user (or a process writing to ~/.apm/) could set:

  • base: "(redacted)" -- causes oras pull (redacted) or git clone (redacted)
  • base: "(evil.example.com/redacted)" -- unencrypted transport; this bypasses APM's --allow-insecure gate entirely, since that gate only applies to DependencyReference, not to PackageRequirement resolved through repositories
  • base: "(redacted)" or other exotic schemes

3b -- Config file is user-scoped (~/.apm/), not project-scoped. This means it cannot be audited in version control. An attacker with write access to the user's home directory (e.g., a compromised VS Code extension) could redirect all APM installs to a hostile OCI registry. This is particularly concerning because the config is silently loaded with no user-visible indication of which repositories are active.

3c -- Priority ordering is user-controlled. A hostile config could set a malicious OCI registry to priority 999, overriding the built-in GitHub source. Combined with the fallback behavior in resolver.py:36-47 (try each repo in priority order, continue on failure), this makes dependency confusion trivial if the config is compromised.

Recommendation:

  1. Validate base against an allowlist of schemes: https:// only by default; (redacted) only with an explicit insecure flag.
  2. Log a visible warning when ~/.apm/repositories.yml is loaded, showing the active repositories and their priorities.
  3. Consider adding apm config repositories list to let users inspect the active config.
  4. Document this attack surface in enterprise/security.md.
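A sketch of recommendation 1, using a scheme allowlist (the function name and flag are illustrative):

```python
from urllib.parse import urlparse

def validate_git_base(base, allow_insecure=False):
    # HTTPS-only by default for git-type repository bases; plain HTTP requires
    # an explicit opt-in, and every other scheme is rejected outright.
    scheme = urlparse(base).scheme
    if scheme == "https":
        return base
    if scheme == "http" and allow_insecure:
        return base
    raise ValueError(f"Refusing git repository base with scheme {scheme!r}: {base}")
```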

4. OCI Auth Gap and Credential Management

Severity: CRITICAL
File: src/apm_cli/deps/oci_registry.py:88-104, src/apm_cli/repositories/resolver.py:104-111

The OCI path completely bypasses APM's AuthResolver (src/apm_cli/core/auth.py). The oras CLI is invoked bare, with no credential injection:

proc = subprocess.run(
    ["oras", "pull", resolved_reference, "--output", str(output_dir)],
    capture_output=True, text=True, check=False,
)

This means:

  • No GITHUB_APM_PAT / GITHUB_TOKEN integration. Users cannot authenticate to private OCI registries through APM's credential management.
  • oras uses its own credential store (Docker config.json, credential helpers), which APM has no visibility into or control over. This violates the security model's guarantee that "Tokens are never embedded in URLs" and "Scoped to their git host" since oras may send credentials to any registry.
  • No per-host scoping. If oras has a wildcard credential helper, a malicious base URL could exfiltrate credentials.
  • No --plain-http protection. oras defaults to HTTPS but the base URL could force HTTP.

This is the most significant security gap in this PR. The security model (enterprise/security.md) states: "APM authenticates to git hosts using personal access tokens (PATs) read from environment variables." The OCI path silently violates this contract.

Recommendation:

  1. Either integrate with AuthResolver to inject OCI registry credentials (e.g., oras pull --username ... --password ... or via ORAS_CONFIG), or...
  2. Document clearly in enterprise/security.md that OCI transport uses external credential management, and what the implications are.
  3. At minimum, refuse (redacted) base URLs for OCI repos unless --allow-insecure is passed.
  4. Consider adding --insecure flag awareness to the oras invocation when the base is HTTP.

5. Content Integrity for OCI Packages

Severity: CRITICAL
File: src/apm_cli/deps/oci_registry.py:26-86, src/apm_cli/deps/lockfile.py:208-224

5a -- No content_hash computation for OCI packages. For git packages, compute_package_hash() is called in all three DependencySource.acquire() paths (sources.py:130, 254, 404). The OCI path in OCIRegistryClient.pull_package() returns a PackageInfo but never computes or attaches a content_hash. The lockfile entry for OCI dependencies will have content_hash: null.

This means subsequent installs cannot verify cached OCI packages against the lockfile. A tampered cache will be silently accepted. This directly violates the lockfile spec section 4.4: "On subsequent installs, cached packages are verified against this hash."

5b -- oci_digest is never populated. The OCIPullResult.resolved_digest field exists (line 31) but is always None. The code at resolver.py:109 tries getattr(package_info, "resolved_digest", None) but package_info is a PackageInfo, not an OCIPullResult. The digest is lost.

Even if it were populated, there is no code path that verifies oci_digest on subsequent installs. The field is stored in the lockfile but never checked.

5c -- No resolved_commit equivalent. Git packages are pinned to a 40-character commit SHA. OCI packages have no equivalent immutable identifier recorded. The resolved_ref for OCI is the mutable tag reference (ghcr.io/apm/acme/pack:1.0.0). A registry operator could replace the artifact behind this tag.

The lockfile spec's reproducibility guarantee -- "the same lock file yields the same deployed files on every machine, every time" -- is not met for OCI packages.

Recommendation:

  1. Compute content_hash after OCI extraction, same as the git path.
  2. Capture the OCI digest from oras output (oras prints it) and store it in the lockfile.
  3. On re-install, verify both content_hash (on cached packages) and oci_digest (on fresh pulls) against the lockfile.
  4. Update the lockfile spec to document OCI integrity semantics.
  5. Consider requiring oci_digest to be present (fail-closed) once it has been recorded for a dependency.
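As an illustration of recommendation 1, a deterministic tree hash could look like this (a stand-in for the real compute_package_hash, whose exact scheme is not shown in this PR):

```python
import hashlib
from pathlib import Path

def tree_content_hash(root):
    # Walk files in sorted order and hash relative path + contents, so the
    # result is stable across machines for identical extracted trees.
    h = hashlib.sha256()
    root = Path(root)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(path.relative_to(root).as_posix().encode())
        h.update(b"\0")
        h.update(path.read_bytes())
    return "sha256:" + h.hexdigest()
```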

6. Dependency Confusion via Multi-Backend Resolution

Severity: HIGH
File: src/apm_cli/repositories/config.py:22-26, src/apm_cli/repositories/resolver.py:36-56

The default repositories are:

DEFAULT_REPOSITORIES = [
    RepositoryDefinition(name="github", type="git", base="https://github.com", priority=100),
    RepositoryDefinition(name="gitlab", type="git", base="(gitlab.com/redacted)", priority=90),
    RepositoryDefinition(name="ghcr", type="oci", base="ghcr.io/apm", priority=80),
]

The resolver tries each repository in priority order and falls back on failure (resolver.py:39-47):

for repository in load_repositories():
    try:
        artifact = self._resolve_artifact(requirement, repository)
        return self._fetch_artifact(requirement, artifact, target_path)
    except Exception as exc:
        last_error = exc
        continue  # try next repo

6a -- Silent fallback enables dependency confusion. If microsoft/apm-standards does not exist on GitHub (perhaps due to a network error or transient 404), the resolver falls back to GitLab, then to ghcr.io. An attacker who publishes ghcr.io/apm/microsoft/apm-standards:v1.0.0 on the public GHCR would get their package installed whenever GitHub is unavailable. The user sees no warning that the package came from a lower-priority source.

6b -- The ghcr default namespace ghcr.io/apm is not organization-controlled. Unless ghcr.io/apm is a GitHub organization/user controlled by the APM project, anyone could publish packages under this path on the public GHCR. This is a direct dependency confusion vector.

6c -- No repository pinning in the lockfile. Once a package is resolved, the repository_name is recorded in the lockfile (lockfile.py:213), but on subsequent installs, there is no code that enforces "this package MUST come from the same repository." A re-resolution could silently switch backends.

Recommendation:

  1. When a package is locked to a specific repository_name, enforce that re-resolution uses the same repository. Fail if it is unavailable rather than falling back.
  2. Log a clear warning when a package is resolved from a repository other than the highest-priority one.
  3. Do NOT fall back to a different repository on transient errors -- fail closed. Reserve fallback for explicit --fallback opt-in.
  4. Verify that ghcr.io/apm is a controlled namespace, or remove it from defaults until it is.
  5. Consider adding a --repository flag to apm install for explicit pinning.
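Recommendation 1 in sketch form (the names are illustrative; the real lockfile and resolver types differ):

```python
def enforce_locked_repository(locked_repo_name, available, resolve_from):
    # Fail-closed re-resolution: once a package is locked to a repository,
    # only that repository may serve it. No silent fallback to other sources.
    if locked_repo_name not in available:
        raise RuntimeError(
            f"Locked repository '{locked_repo_name}' is unavailable; "
            "refusing to fall back to a different source"
        )
    return resolve_from(locked_repo_name)
```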

7. Subprocess Invocation Safety

Severity: LOW
File: src/apm_cli/deps/oci_registry.py:91-96

proc = subprocess.run(
    ["oras", "pull", resolved_reference, "--output", str(output_dir)],
    capture_output=True, text=True, check=False,
)

7a -- List-form subprocess is safe against shell injection. Using subprocess.run() with a list (not shell=True) prevents shell metacharacter interpretation. This is correct.

7b -- Argument injection via resolved_reference. Since resolved_reference is a single positional argument, a value like --config /etc/passwd would be interpreted by oras as a reference string, not as flags, because it follows the pull subcommand and an explicit positional. However, oras argument parsing may vary by version. Safer to place all options first and terminate option parsing with -- immediately before the reference (note that any options placed after -- would themselves be treated as positionals):

["oras", "pull", "--output", str(output_dir), "--", resolved_reference]

7c -- No timeout. The subprocess has no timeout. A malicious or misconfigured registry could hang the connection indefinitely, causing apm install to never complete. Add timeout=300 (or a configurable value).

7d -- Error output could leak registry URLs or paths. The stderr is included in the RuntimeError at line 104:

raise RuntimeError(f"Failed to pull OCI artifact {resolved_reference}: {stderr}")

The stderr from oras could contain credential-related error messages (e.g., "401 Unauthorized for user X"). Review what oras emits and consider sanitizing.

Recommendation:

  1. Terminate option parsing with -- immediately before the reference argument, keeping options such as --output before the separator.
  2. Add timeout=300 to the subprocess.run() call.
  3. Sanitize stderr before including in error messages.
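Combining recommendations 1 and 2 into one sketch (the oras_bin parameter exists purely for illustration, and whether oras honors the POSIX -- separator should be verified against the oras CLI):

```python
import subprocess

def run_oras_pull(resolved_reference, output_dir, timeout=300, oras_bin="oras"):
    # Hardened invocation sketch: options first, then "--" to stop flag parsing
    # (assuming oras follows the common Go-CLI convention), plus a timeout so a
    # hung registry cannot stall the install forever.
    args = [oras_bin, "pull", "--output", str(output_dir), "--", resolved_reference]
    try:
        proc = subprocess.run(args, capture_output=True, text=True, check=False, timeout=timeout)
    except subprocess.TimeoutExpired:
        raise RuntimeError(f"Timed out pulling OCI artifact {resolved_reference}") from None
    if proc.returncode != 0:
        # Keep raw stderr out of the user-facing message; it may echo auth details.
        raise RuntimeError(f"Failed to pull OCI artifact {resolved_reference}")
    return proc
```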

8. Security Model Documentation Drift

Severity: HIGH
File: docs/src/content/docs/enterprise/security.md

The security model currently states:

"APM resolves dependencies directly from git repositories. There is no intermediary registry, proxy, or mirror."

And:

"Dependencies are full git URLs"

And in the attack surface table:

"No registry exists"

This PR introduces OCI registry support, which directly contradicts all three statements. The security model document MUST be updated in the same PR to reflect:

  • OCI registries as a new transport mechanism
  • The different integrity model for OCI (no commit SHA pinning)
  • The credential management delegation to oras
  • The multi-backend resolution and its dependency confusion implications

Per the rule in the supply-chain security expert charter: "If a code change weakens or contradicts any guarantee in enterprise/security.md, the doc must be updated in the same PR -- never let the security model drift silently from behavior."


Summary Table

| # | Finding | Severity | File:Line | Status |
|---|---------|----------|-----------|--------|
| 1 | Tar extraction TOCTOU + missing members= | HIGH | oci_registry.py:133-136 | Must fix |
| 2 | version field not validated for OCI tag format | MEDIUM | requirement.py:72-75, resolver.py:79 | Must fix |
| 3 | No URL scheme validation on repository base | MEDIUM | config.py:62-63 | Must fix |
| 4 | OCI auth bypasses AuthResolver entirely | CRITICAL | oci_registry.py:91-96 | Must fix or document |
| 5 | No content_hash or digest verification for OCI | CRITICAL | oci_registry.py:42-86, lockfile.py:208-224 | Must fix |
| 6 | Silent fallback enables dependency confusion | HIGH | resolver.py:39-47, config.py:22-26 | Must fix |
| 7 | subprocess missing timeout and -- separator | LOW | oci_registry.py:91-96 | Should fix |
| 8 | Security model docs contradict new behavior | HIGH | enterprise/security.md | Must fix |

Blocking items for merge: Findings 1, 4, 5, 6, and 8 represent security regressions against APM's documented guarantees. Finding 5 (no content integrity for OCI) is particularly concerning because it silently breaks the lockfile reproducibility contract that enterprise users depend on.

I recognize this is labeled as a prototype. If the intent is to merge behind a feature flag or --experimental gate, the severity of these findings is reduced, but findings 1 (tarslip), 5 (integrity), and 8 (doc drift) remain blocking even for experimental code that touches the filesystem.

Generated by PR Review Panel for issue #900

@github-actions

Auth Expert Review -- PR #900: OCI Package Resolution

Scope: Authentication architecture, credential flow, token leakage, host classification.


Finding 1 -- MEDIUM: subprocess.run inherits full process environment, leaking APM tokens to oras

File: src/apm_cli/deps/oci_registry.py:91-96

The subprocess.run(["oras", "pull", ...]) call has no env= parameter. Python's default behavior is to inherit the parent process environment, which includes GITHUB_APM_PAT, GITHUB_TOKEN, GH_TOKEN, ADO_APM_PAT, and any other secrets the user has set.

Unlike the git transport path -- which carefully constructs a sanitized env dict via _setup_git_environment() and _build_noninteractive_git_env() (see github_downloader.py:218-254) -- the OCI path hands every environment variable to the oras subprocess. While oras itself does not consume GITHUB_TOKEN by default for auth, this pattern is fragile: future oras versions or plugins could read these variables. More importantly, it sets a precedent that contradicts the project's defense-in-depth approach.

Impact: No immediate credential theft, but violates the principle that subprocess credential exposure must be explicit and minimal. If oras or a credential helper ever reads GITHUB_TOKEN (as many tools do), tokens scoped for GitHub would be sent to arbitrary OCI registries configured in repositories.yml.

Recommendation: Build a sanitized env dict that strips all APM token variables before passing to oras. The git path's pattern in github_downloader.py:2088-2091 is the model:

# sanitize: strip APM token variables before handing the environment to oras
_SENSITIVE_ENV_VARS = {"GITHUB_APM_PAT", "GITHUB_TOKEN", "GH_TOKEN", "ADO_APM_PAT"}
env = {k: v for k, v in os.environ.items()
       if k not in _SENSITIVE_ENV_VARS}
proc = subprocess.run([...], env=env, ...)

Finding 2 -- HIGH: No mechanism to authenticate to private OCI registries via APM's token chain

File: src/apm_cli/deps/oci_registry.py (entire class), src/apm_cli/repositories/resolver.py:104-110

The OCIRegistryClient class contains zero references to env, token, auth, or credential (confirmed via grep). The ArtifactResolver does not import or reference AuthResolver. For the git path, credentials flow through GitHubPackageDownloader -> AuthResolver.resolve_for_dep() -> _build_repo_url() with explicit token injection. The OCI path has no equivalent.

User experience for private ghcr.io packages:

  1. User sets GITHUB_APM_PAT (the documented APM credential).
  2. apm install resolves a dependency to ghcr.io/apm/acme/private-pack:1.0.0.
  3. oras pull runs with no --username/--password flags and no explicit credential injection.
  4. If the user has NOT separately run oras login ghcr.io, the pull fails with a 401 or 403.
  5. The error message from line 103 is: "Failed to pull OCI artifact ghcr.io/...: <oras stderr>" -- no mention of GITHUB_APM_PAT, no build_error_context(), no actionable guidance.

This breaks the single-credential promise: APM documents one token (GITHUB_APM_PAT) for all GitHub-hosted assets, but OCI assets silently require a separate oras login step.

This is a feature gap, not a security regression -- no previously working auth path is broken. However, it is the highest-priority gap because it makes OCI unusable for the most common private registry scenario (ghcr.io authenticated via GitHub PAT).

Recommendation: OCIRegistryClient should accept an optional AuthResolver and, for ghcr.io hosts, resolve credentials via AuthResolver.resolve("ghcr.io", org=...) and pass them to oras via --username + --password flags or the ORAS_REGISTRY_CONFIG env var pointing to a temp Docker config.json. For ADO-backed OCI registries, use ADO_APM_PAT as the password with Basic auth.
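One possible shape for the temp-config route (the Docker config.json layout is standard; the oras flag that consumes it, e.g. --registry-config, is an assumption to verify against the oras CLI):

```python
import base64
import json
import os
import tempfile

def write_registry_config(registry, username, password):
    # Materialize a Docker-style config.json that oras could consume via a
    # registry-config option (flag name assumed). Caller must delete the file
    # after the pull so credentials do not linger on disk.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    config = {"auths": {registry: {"auth": token}}}
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as fh:
        json.dump(config, fh)
    os.chmod(path, 0o600)  # restrict to owner: the file holds a credential
    return path
```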


Finding 3 -- MEDIUM: OCI ResolvedArtifact omits host, bypassing host classification

File: src/apm_cli/repositories/resolver.py:74-84

The git path extracts host from the locator (line 66) and populates ResolvedArtifact.host. The OCI path does not:

```python
# git: host is extracted
host = locator.split("://", 1)[1].split("/", 1)[0] if "://" in locator else None

# oci: host is never set (defaults to None)
return ResolvedArtifact(..., source_type="oci", locator=locator)  # no host=
```

Downstream, requirement.resolved_host is only set for git artifacts (line 101), not OCI (lines 104-110). This means:

  • OCI hosts are invisible to APM's auth chain and host classification.
  • No HostInfo is ever created for OCI targets, so the classify_host() logic (github vs ghe_cloud vs ado vs generic) is never applied.
  • When auth integration is added later, this missing host field will need to be retroactively populated.

Recommendation: Parse the host from OCI locators (e.g., ghcr.io from ghcr.io/apm/foo:1.0.0) and populate both ResolvedArtifact.host and requirement.resolved_host for OCI artifacts.
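Host extraction itself is a one-liner sketch (function name is illustrative):

```python
def oci_host(locator):
    # The registry host is everything before the first "/" in an OCI
    # reference; a port, if present, stays attached to the host.
    return locator.split("/", 1)[0]
```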

Finding 4 -- LOW: No URL scheme validation on repositories.yml base field

File: src/apm_cli/repositories/config.py:49-72

The config parser validates name, type, and base for non-empty strings, but does not validate the URL scheme. For type="git", the base is expected to be (redacted). For type="oci", the base is a bare registry host/path (e.g., ghcr.io/apm). There is no check that:

  • Git base URLs use HTTPS (not HTTP, which would transmit tokens in cleartext).
  • OCI base values don't accidentally include https:// (which would produce malformed OCI references).

The existing git path has a separate is_insecure flag system (see _build_repo_url line 774), but the repository config layer does not enforce HTTPS-by-default for user-supplied base URLs.

Recommendation: Validate that git-type base URLs start with https:// (or (redacted)). Reject (redacted) unless an explicit insecure: true flag is set. Validate that OCI base values do NOT include a scheme prefix.

Finding 5 -- LOW: Error messages for OCI failures lack auth guidance

File: src/apm_cli/deps/oci_registry.py:102-104

When oras pull fails, the error message is:

```
Failed to pull OCI artifact {resolved_reference}: {stderr}
```
Compare this to the git path, which calls AuthResolver.build_error_context() (used in 12+ places in github_downloader.py) to produce actionable messages like "set GITHUB_APM_PAT" or "run az login". The OCI path has no equivalent. Users who hit a 401 from ghcr.io will see raw oras stderr with no APM-specific remediation advice.

Recommendation: When oras exits non-zero and stderr contains auth-related keywords (401, 403, unauthorized, forbidden), wrap the error with OCI-specific auth guidance that references GITHUB_APM_PAT and oras login.
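A sketch of the keyword-based wrapping (the hint list and remediation wording are illustrative):

```python
_AUTH_HINTS = ("401", "403", "unauthorized", "forbidden", "denied")

def wrap_oci_error(resolved_reference, stderr):
    # Append APM-specific remediation advice when the oras stderr looks
    # auth-related; otherwise keep the plain failure message.
    msg = f"Failed to pull OCI artifact {resolved_reference}"
    if any(h in stderr.lower() for h in _AUTH_HINTS):
        msg += (
            ": authentication to the registry failed. "
            "Check GITHUB_APM_PAT or run 'oras login <registry>'."
        )
    return msg
```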


Finding 6 -- LOW: GitHubPackageDownloader naming is misleading for arbitrary git hosts

File: src/apm_cli/repositories/resolver.py:9, config.py:24

The default repositories include GitLab (base="(gitlab.com/redacted)"), and the git path in ArtifactResolver feeds arbitrary base URLs to GitHubPackageDownloader. The class already handles generic hosts correctly (line 781-782 of github_downloader.py: "Generic hosts: plain HTTPS, let git credential helpers handle auth"). The naming is confusing but not a functional issue. This is purely cosmetic for now and should not block the PR.


Summary

| # | Severity | Finding |
|---|----------|---------|
| 1 | MEDIUM | subprocess.run leaks full env (incl. APM tokens) to oras |
| 2 | HIGH | No credential forwarding to oras for private OCI registries |
| 3 | MEDIUM | OCI artifacts have no host field, invisible to host classification |
| 4 | LOW | No URL scheme validation on repositories.yml base |
| 5 | LOW | OCI error messages lack build_error_context() auth guidance |
| 6 | LOW | GitHubPackageDownloader naming misleading for multi-host use |

Verdict: Finding 2 is the critical feature gap -- private ghcr.io will silently fail. Finding 1 is the auth-safety concern that should be fixed before merge (env sanitization is a small change). Findings 3-5 are reasonable to track as follow-up issues if this is explicitly a prototype/RFC PR. Finding 6 is cosmetic.

This is NOT a security regression -- no existing credentials are sent where they should not go. But the env inheritance in Finding 1 creates a latent leakage vector that should be closed proactively, and Finding 2 means the OCI path is effectively public-only until auth integration is added.

Generated by PR Review Panel for issue #900

@github-actions

OSS Growth Assessment -- PR #900: Repository-Driven OCI Package Resolution

Reviewer: OSS Growth Hacker
Verdict: Ship with revisions (see below). High strategic value, but the prototype surfaces need guardrails to avoid conversion-funnel damage.


1. Growth/Adoption Implications of Shipping This Prototype

The upside is real. OCI registry support is the single most requested capability for enterprise adoption of any package manager. It answers the question every platform engineering team asks in the first 10 minutes of evaluation: "Can I host packages in our own registry?" Before this PR, the answer was "git repos only." After this PR, the answer is "yes, OCI registries too -- Harbor, Artifactory, GHCR, whatever you run." That is a conversion unlock for the enterprise segment.

The repositories.yml abstraction is architecturally sound and mirrors what made cargo's registry model successful: package identity stays stable while transport is configurable per-environment. This is the kind of design that compounds -- it makes future features (mirroring, caching, multi-region) possible without manifest changes. Good for the "portable by manifest" pillar.

However, the prototype ships documentation for features that do not work end-to-end. The ArtifactResolver class exists but is never called from cli.py, install.py, or any install phase. A user who reads the dependencies guide, adds a name: / repository: / version: block to their apm.yml, and runs apm install will hit a failure that is not explained anywhere. This is a silent conversion killer -- the docs promise something the tool cannot deliver, and the user blames themselves before they blame the tool.

Specific concern: The quickstart example apm install microsoft/apm-sample-package#v1.0.0 still works and remains backward compatible: _should_parse_as_requirement returns True, but because PackageRequirement.repo_url returns the name, existing code treats the #v1.0.0 pin as a GitHub shorthand and the install pipeline handles it through the git path. But any user who follows the new dependency guide examples will hit a wall. That guide-to-failure gap is the growth risk.

Recommendation: Either (a) gate the OCI/logical-requirement docs behind an :::caution admonition that says "Requires apm experimental enable oci-resolution" (leveraging the existing experimental feature-flag system from v0.9.1), or (b) wire ArtifactResolver into the install pipeline in this PR so the docs are truthful. Option (a) is faster and protects the funnel; option (b) is cleaner but enlarges the PR scope.


2. Disambiguation / Mental Model Change Risk

The parsing change is the most growth-sensitive part of this PR. Before this PR, microsoft/apm-sample-package#v1.0.0 was parsed as a DependencyReference (direct GitHub shorthand). After this PR, _should_parse_as_requirement() classifies it as a PackageRequirement (logical requirement resolved through repositories).

For the default case, this is invisible -- the PackageRequirement has a .repo_url compatibility alias that returns self.name, so downstream code that reads dep.repo_url still gets microsoft/apm-sample-package. The default github repository at priority 100 means it resolves to the same GitHub URL. Users will not notice.

But the mental model shift matters for docs and messaging. The term "logical requirement" is new jargon. The dependencies guide now distinguishes "logical requirement strings" from "direct source strings" -- and the distinguishing heuristic (does the first segment contain a dot?) is not intuitive. acme/security-pack is logical; gitlab.com/acme/repo is direct. A user who sees both forms in the same apm.yml will wonder why they look different and which one they should use.

Recommendation: The docs should lead with the simple case first: "Most packages are just owner/repo." Then introduce the object form as "when you need to pin to a specific registry." The current docs lead with the advanced taxonomy, which is correct for a reference but wrong for a guide. Reorder the dependencies guide so the 80% case (bare owner/repo string) comes first and the 20% case (explicit name:/repository:/version: objects) comes second with a "For enterprise / multi-registry setups" header.


3. External Contributor Momentum vs. Incomplete Features

This is the highest-signal external contribution APM has received. Vladimir is building enterprise-grade infrastructure, not filing typo fixes. Shipping this with appropriate scoping (experimental flag, clear prototype language) sends the message that APM takes external contributors seriously and merges real work.

The risk of NOT merging (or stalling) is higher than the risk of merging incomplete. A motivated external contributor who sees their PR languish will not submit a second one. The OSS contributor funnel is: drive-by issue -> small PR -> substantial PR -> repeat contributor. Vladimir is at stage 3. Protect that.

However, "prototype" language in shipped docs is a growth anti-pattern. Five instances of "prototype" and "consume-only" in the docs undermine confidence. Users do not read caveats -- they read the first example, try it, and form an opinion.

Recommendation: Merge the implementation and tests. For the docs, replace "prototype" with "preview" (same honesty, less tentativeness) and add explicit scope markers:

  • "OCI package resolution is in preview. It supports consuming packages from OCI registries. Publishing support is planned."
  • Remove "in this prototype" from every sentence -- it reads as "we are not sure this works."

4. Side-Channel to CEO: Strategic Implications of OCI Support

For the CEO's consideration:

  • Enterprise unlock. OCI support is the bridge from "developer tool" to "platform tool." Every enterprise with a container registry (which is all of them) can now evaluate APM without asking "where do we host packages?" This removes the single biggest objection in enterprise evaluation conversations.

  • Competitive positioning. No other AI agent package manager has OCI registry support. This is a "first mover" claim worth making in release comms -- but only when it works end-to-end. Premature claims erode trust.

  • Story angle for release. When this ships fully wired: "APM now resolves packages from your existing container registry -- Harbor, Artifactory, GHCR, ACR. Your AI agent packages live alongside your container images." That is a one-liner that enterprise platform teams will forward internally.

  • External contributor signal. Vladimir's contribution validates that the architecture is extensible enough for external developers to add major features. This is worth calling out in release notes: "Community-contributed OCI support" reinforces the OSS positioning.

  • Risk if shipped as-is: Docs describe features that apm install cannot execute. A platform engineer who evaluates APM by reading the docs and trying OCI resolution will conclude "this does not work" and not return. Gate behind experimental flag or wire into install pipeline before shipping.


5. Growth Strategy Update

Yes, WIP/growth-strategy.md update is warranted. Key insight to add:

2026-XX-XX -- OCI registry support (PR #900, external contributor)
First external contribution of enterprise-grade infrastructure. OCI/container registry package resolution enables the "use your existing registry" pitch for enterprise adoption. When wired end-to-end, this unlocks: (1) "host packages in your own registry" as a top-of-funnel claim for enterprise, (2) release narrative: "your AI agent packages live next to your container images," (3) GitLab/GHCR as first-class backends broadens appeal beyond GitHub-only shops. Watch for: contributor retention (Vladimir's next PR matters more than this one), docs-to-reality gap (do not ship docs that promise features the CLI cannot deliver), and "preview" vs "prototype" language calibration across all surfaces.


Summary of Actionable Items

| Item | Priority | Owner |
| --- | --- | --- |
| Gate OCI docs behind experimental flag OR wire ArtifactResolver into install | P0 | Author + maintainer |
| Replace "prototype" with "preview" in all 5 doc locations | P1 | Author |
| Reorder dependencies guide: simple case first, advanced taxonomy second | P1 | Author |
| Add oras install instructions to docs (or note that APM will prompt) | P1 | Author |
| Prepare release narrative for when OCI is fully wired | P2 | Growth / CEO |
| Acknowledge Vladimir's contribution in release notes | P2 | Maintainer |

Bottom line: This PR has high strategic value and should be merged, but the docs-to-implementation gap is a conversion risk that must be closed before it ships to users. The fastest path is gating the OCI-specific docs behind the existing experimental feature-flag system.

Generated by PR Review Panel for issue #900 · ● 29.6M ·

@github-actions

DevX UX Review -- PR #900: Repository-Driven OCI Package Resolution

Assessment Summary

This PR introduces a significant new concept (logical package requirements + repository-driven resolution + OCI transport) into a tool whose UX contract is "install works like npm". The architectural direction is sound -- separating package identity from transport is what every mature package manager does. However, the current prototype has several user-facing gaps that range from a silent breaking regression to missing discoverability surfaces. I'm focusing strictly on UX; auth and supply-chain concerns are deferred to those specialist reviewers.


HIGH severity

H1. Virtual-path references silently break (regression)

File: src/apm_cli/models/apm_package.py, _should_parse_as_requirement() (lines 321-336)

The disambiguation heuristic checks whether the first path segment contains a dot. If it doesn't, the string is routed to PackageRequirement.parse() instead of DependencyReference.parse().

This breaks every GitHub shorthand that contains a virtual path (subdirectory, file, or collection):

github/awesome-copilot/skills/review-and-refactor
ComposioHQ/awesome-claude-skills/brand-guidelines

Before this PR: DependencyReference.parse() correctly splits these into host=github.com, repo_url=github/awesome-copilot, virtual_path=skills/review-and-refactor.

After this PR: PackageRequirement.parse() treats the entire string as name=github/awesome-copilot/skills/review-and-refactor. The resolver then builds https://github.com/github/awesome-copilot/skills/review-and-refactor.git -- a URL that does not exist.

This is a silent regression that breaks the quick-start example (apm install github/awesome-copilot/skills/review-and-refactor appears in cli-commands.md line 175 and quick-start.md line 134) and every existing user who has virtual-path dependencies. The heuristic needs a rule: if the string has 3+ path segments, it is likely a virtual-path reference, not a logical requirement (logical requirements are owner/repo or owner/repo#version -- exactly 2 segments before #).

Suggested fix direction: Count path segments before the # fragment. If there are exactly 2 segments (e.g. microsoft/apm-standards), treat as logical requirement. If 3+, fall through to DependencyReference.parse().


H2. Install pipeline is not wired -- user hits a wall

Files: src/apm_cli/install/sources.py, src/apm_cli/install/phases/resolve.py

The ArtifactResolver and OCIRegistryClient are never imported or invoked by the install pipeline. The factory at sources.py:580 (_make_source()) will receive a PackageRequirement and immediately fail because:

  1. dep_ref.is_local raises AttributeError (no such attr on PackageRequirement)
  2. dep_ref.is_virtual raises AttributeError
  3. ctx.downloader.download_package() calls DependencyReference.parse() on the requirement, which will fail or produce incorrect results

This means apm install with a logical requirement in apm.yml will crash with an unhandled exception -- the worst UX failure mode. Even as a prototype, users who see the docs and try the new syntax will get a Python traceback, violating the "failure mode is the product" principle.

Recommendation: Either (a) wire up the install pipeline so it detects PackageRequirement and routes through ArtifactResolver, or (b) add a clear, early-exit error: "Logical package requirements (e.g. 'acme/security-pack') are not yet supported by 'apm install'. Use explicit git URLs for now." The second option is acceptable for a prototype, but the error must fire before any download attempt.


MEDIUM severity

M1. cli-commands.md not updated -- PR is incomplete by project rule

The canonical CLI reference (docs/src/content/docs/reference/cli-commands.md) was not touched in this PR. Per the project's own rule: "If a CLI change is not reflected in cli-commands.md in the same PR, that change is incomplete by definition."

Specifically missing:

  • No mention of ~/.apm/repositories.yml anywhere in the CLI reference
  • No mention of logical requirements vs. direct URLs in apm install argument description
  • No mention of OCI or oras dependency
  • The apm install PACKAGES description still says "Accepts shorthand (owner/repo)" -- this now needs to distinguish shorthand from logical requirement

M2. Config file errors are silently swallowed

File: src/apm_cli/repositories/config.py lines 42-43

```python
except Exception:
    return list(DEFAULT_REPOSITORIES)
```

If ~/.apm/repositories.yml has a YAML syntax error, the user gets zero feedback -- APM silently falls back to defaults. Compare to pip.conf: pip warns loudly when config is malformed. A user who spends time setting up a corporate OCI registry, makes a typo, and then watches APM try GitHub instead will be confused.

At minimum, emit a warning: "Warning: Could not parse ~/.apm/repositories.yml: {error}. Using default repositories."
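A sketch of the non-silent fallback (the DEFAULT_REPOSITORIES shape and file location follow this review's reading of the PR and should be treated as assumptions; PyYAML is assumed available since APM already parses apm.yml):

```python
import sys
from pathlib import Path

import yaml  # PyYAML

DEFAULT_REPOSITORIES = [
    {"name": "github", "type": "git", "url": "https://github.com", "priority": 100},
]


def load_repositories(path=Path.home() / ".apm" / "repositories.yml"):
    """Fall back to defaults on any problem, but never silently on a bad file."""
    try:
        data = yaml.safe_load(path.read_text())
        return data or list(DEFAULT_REPOSITORIES)
    except FileNotFoundError:
        return list(DEFAULT_REPOSITORIES)  # no config file is the normal case
    except Exception as exc:
        # A malformed file is user error worth surfacing, not a silent default.
        print(
            f"Warning: Could not parse {path}: {exc}. Using default repositories.",
            file=sys.stderr,
        )
        return list(DEFAULT_REPOSITORIES)
```

Note the distinction: a missing file is expected and stays quiet; a present-but-broken file warns.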

M3. No command to inspect or validate repository configuration

There is no apm config list-repositories, apm repos, or equivalent. Users cannot:

  • See which repositories are active and in what priority order
  • Validate their repositories.yml before running install
  • See which repository resolved a given package after install

Compare: npm config list, pip config list, cargo config get registries. All let you confirm "the tool sees what I configured."

Recommendation: At minimum, surface repository info in apm config output. Ideally, add apm config repos or show resolved-repository in apm deps list / apm deps tree output.
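One possible shape for such an inspection surface (the command name apm config repos is the suggestion above; this formatting helper is purely illustrative):

```python
def format_repositories(repos):
    """Render active repositories in priority order, highest first."""
    header = f"{'NAME':<10}{'TYPE':<6}{'PRIORITY':<10}BASE"
    rows = [
        f"{r['name']:<10}{r['type']:<6}{r.get('priority', 0):<10}{r.get('base', '')}"
        for r in sorted(repos, key=lambda r: -r.get("priority", 0))
    ]
    return "\n".join([header] + rows)
```

Printing this from a config subcommand would let users confirm "the tool sees what I configured" before running install.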

M4. oras missing error lacks install guidance

File: src/apm_cli/deps/oci_registry.py line 99

```python
raise RuntimeError(
    "OCI support requires the 'oras' CLI to be installed and available on PATH."
)
```

This names what failed but not how to fix it. Every good CLI error includes a next action. Compare:

```
OCI support requires the 'oras' CLI.
  Install: brew install oras  (macOS)  |  (oras.land/redacted)
```

The error also surfaces as a raw RuntimeError string -- it will likely appear as a Python traceback in the default path since there's no catch in the install pipeline for this. Users should never see RuntimeError: in output.
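The friendlier check can be sketched with a simple preflight helper (require_oras is a hypothetical name; the install hint mirrors the example above):

```python
import shutil


def require_oras() -> str:
    """Locate the 'oras' binary or fail with an actionable message."""
    path = shutil.which("oras")
    if path is None:
        # SystemExit keeps the message clean -- no traceback in CLI output.
        raise SystemExit(
            "OCI support requires the 'oras' CLI.\n"
            "  Install: brew install oras  (macOS), or see the ORAS install docs."
        )
    return path
```

Calling this once at the top of the OCI code path gives the user the fix before any registry operation is attempted.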

M5. Resolution failure messages are opaque

File: src/apm_cli/repositories/resolver.py, lines 49-50

```python
raise RuntimeError(
    f"Failed to resolve {requirement} from configured repositories: {last_error}"
)
```

When `last_error` is a nested exception (e.g. a `git clone` failure), this chains error messages into an unreadable blob. It also doesn't tell the user **which repositories were tried** or in what order. Good UX:

```
Failed to resolve acme/security-pack#1.2.0
  Tried: github (git, https://github.com) -- 404 Not Found
  Tried: gitlab (git, (gitlab.com/redacted)) -- authentication required
  Tried: ghcr (oci, ghcr.io/apm) -- oras: not found

  Check ~/.apm/repositories.yml or pin repository: in apm.yml
```

M6. Breaking semantic change not communicated to existing users

Previously microsoft/apm-standards was unambiguously "GitHub shorthand." Now it's "logical requirement resolved through repository priority (GitHub first by default)." The PR description says this is "semantically equivalent in the common case" -- but it's architecturally different and will behave differently when:

  1. A user has ~/.apm/repositories.yml with a higher-priority OCI repo
  2. Network conditions make one repo reachable but not another

There is no migration warning, no changelog entry visible to users, and no --legacy-resolve escape hatch. When existing users upgrade and something resolves differently, there's no way for them to understand why.


LOW severity

L1. ghcr default OCI base uses fixed ghcr.io/apm namespace

File: src/apm_cli/repositories/config.py line 25

The built-in default ghcr repository resolves packages under ghcr.io/apm/. This means acme/security-pack becomes ghcr.io/apm/acme/security-pack:version. Is ghcr.io/apm an official Microsoft-controlled namespace? If not, this is a supply-chain concern (deferred to that reviewer), but from a UX perspective it is confusing: users will wonder "whose registry am I pulling from?"

L2. ref_type=GitReferenceType.BRANCH for OCI packages

File: src/apm_cli/deps/oci_registry.py line 79

OCI artifacts use GitReferenceType.BRANCH in the ResolvedReference. This is semantically wrong -- OCI tags are not git branches. It leaks git-centric abstractions into a non-git transport and may confuse apm deps list/tree output that shows version provenance.

L3. Missing __init__.py in repositories/ package

Dir: src/apm_cli/repositories/

The directory has config.py and resolver.py but no __init__.py. This works because of implicit namespace packages, but it diverges from the rest of the codebase and may cause issues with some tooling.

L4. Uninstall engine hasattr guard is brittle

File: src/apm_cli/commands/uninstall/engine.py line 127

```python
if not hasattr(dep_ref, "get_install_path"):
    logger.progress(f"Removed {package} from manifest (OCI dependency)")
```

But PackageRequirement does have get_install_path (defined at line 108 of requirement.py). So this guard never fires for OCI PackageRequirement objects -- the message is dead code. The duck-typing approach here is fragile; prefer isinstance checks against PackageRequirement.
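A sketch of the isinstance-based guard (the surrounding names are simplified stand-ins, not the PR's actual code):

```python
class PackageRequirement:
    """Stand-in: the real class DOES define get_install_path, so a hasattr
    check on that method cannot distinguish it from git references."""

    def get_install_path(self) -> str:
        return "apm_modules/example"


def removal_message(dep_ref, package: str) -> str:
    # Explicit type check instead of duck typing on get_install_path.
    if isinstance(dep_ref, PackageRequirement):
        return f"Removed {package} from manifest (OCI dependency)"
    return f"Removed {package} from manifest"
```

Unlike the hasattr guard, this branch actually fires for OCI requirements, so the "(OCI dependency)" message is no longer dead code.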

L5. Docs show repositories.yml in ~/.apm/ but don't mention apm init creating it

The quick-start funnel is apm init -> apm install -> apm run. There's no step that creates or scaffolds ~/.apm/repositories.yml. For the common case (GitHub defaults) this is fine, but for enterprise users who need OCI, the docs should include a one-liner to bootstrap the config. Compare: npm config set registry (corp.example.com/redacted).

L6. Duplicate examples in skill resource dependencies.md

File: packages/apm-guide/.apm/skills/apm-usage/dependencies.md

The diff adds microsoft/apm-sample-package#v1.0.0 and acme/security-pack#1.2.0 as new "logical requirement" examples, but then the existing block immediately below still shows microsoft/apm-sample-package three ways. The skill resource now has near-duplicate examples that will confuse agent tools consuming this context.


Summary of recommendations

| # | Severity | Action |
| --- | --- | --- |
| H1 | HIGH | Fix _should_parse_as_requirement to exclude 3+ segment paths (virtual packages) |
| H2 | HIGH | Wire install pipeline OR add clear early error for PackageRequirement |
| M1 | MEDIUM | Update cli-commands.md in this PR |
| M2 | MEDIUM | Warn on malformed repositories.yml instead of silent fallback |
| M3 | MEDIUM | Add way to inspect active repositories (apm config or new subcommand) |
| M4 | MEDIUM | Add install instructions to oras missing error |
| M5 | MEDIUM | Show tried repositories and per-repo errors on resolution failure |
| M6 | MEDIUM | Add changelog/migration note about semantic change |
| L1-L6 | LOW | Address as polish |

Generated by PR Review Panel for issue #900

@vlsi
Author

vlsi commented Apr 25, 2026

@microsoft-github-policy-service agree
