[Test Improver] test: add unit tests for policy outcome routing (0% -> ~95%)#943
Conversation
Adds 56 dedicated unit tests for apm_cli.policy.outcome_routing covering the full 9-outcome routing table with all branching conditions. Security properties verified: - hash_mismatch is always fail-closed (PolicyViolationError raised) regardless of fetch_failure_default or logger presence - absent / no_git_remote / empty are always fail-open even when fetch_failure_default=block (they are not network-fetch failures) - malformed / cache_miss_fetch_fail / garbage_response respect the project-level fetch_failure_default knob - cached_stale respects the cached policy's own fetch_failure knob - disabled short-circuits without logging - raise_blocking_errors=False (dry-run path) bypasses all raises - None logger is tolerated across all outcomes Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
APM Review Panel VerdictDisposition: APPROVE Per-persona findingsPython Architect: PR adds a single test module ( OO / class diagram classDiagram
direction LR
class route_discovery_outcome {
<<Pure Function>>
+route(fetch_result, logger, fetch_failure_default, raise_blocking_errors) ApmPolicy
}
class PolicyFetchResult {
<<ValueObject>>
+outcome str
+source str
+policy ApmPolicy
+error str
+fetch_error str
+cached bool
+cache_age_seconds int
}
class ApmPolicy {
<<ValueObject>>
+enforcement str
+fetch_failure str
}
class PolicyViolationError {
<<Exception>>
+policy_source str
}
class _FETCH_FAILURE_OUTCOMES {
<<Constant>>
}
class _NON_FOUND_LOGGED_OUTCOMES {
<<Constant>>
}
class test_outcome_routing {
<<TestModule>>
+TestModuleConstants
+TestDisabledOutcome
+TestHashMismatchOutcome
+TestAlwaysFailOpenOutcomes
+TestFetchFailureOutcomes
+TestCachedStaleOutcome
+TestFoundOutcome
+TestUnknownOutcome
+TestErrorFieldRouting
}
route_discovery_outcome ..> PolicyFetchResult : reads
route_discovery_outcome ..> ApmPolicy : returns
route_discovery_outcome ..> PolicyViolationError : raises
route_discovery_outcome ..> _FETCH_FAILURE_OUTCOMES : checks
route_discovery_outcome ..> _NON_FOUND_LOGGED_OUTCOMES : checks
test_outcome_routing ..> route_discovery_outcome : exercises
test_outcome_routing ..> PolicyFetchResult : constructs
test_outcome_routing ..> ApmPolicy : constructs
test_outcome_routing ..> PolicyViolationError : asserts on
class test_outcome_routing:::touched
classDef touched fill:#fff3b0,stroke:#d47600
Execution flow diagram flowchart TD
A["pytest\ntests/unit/policy/test_outcome_routing.py"] --> B["route_discovery_outcome\noutcome_routing.py"]
B --> C{outcome?}
C -->|disabled| D["return None -- no logging"]
C -->|hash_mismatch| E["logger.policy_discovery_miss\noutcome, source, error"]
E --> F{raise_blocking_errors?}
F -->|True| G["raise PolicyViolationError\nhash mismatch"]
F -->|False| H["return None -- dry-run"]
C -->|"absent / no_git_remote / empty"| I["logger.policy_discovery_miss\noutcome, source"]
I --> J["return None -- always fail-open"]
C -->|"malformed / cache_miss_fetch_fail / garbage_response"| K["logger.policy_discovery_miss\noutcome, source, error"]
K --> L{fetch_failure_default?}
L -->|warn| M["return None"]
L -->|block| N{raise_blocking_errors?}
N -->|True| O["raise PolicyViolationError"]
N -->|False| P["return None -- dry-run"]
C -->|cached_stale| Q["logger.policy_resolved\n+ logger.policy_discovery_miss"]
Q --> R{policy.fetch_failure?}
R -->|block| S["raise PolicyViolationError\nrefresh failed"]
R -->|warn| T["return policy"]
C -->|found| U["logger.policy_resolved\nsource, cached, enforcement, age_seconds"]
U --> V["return policy"]
C -->|unknown| W["return None -- defensive fallback"]
Design patterns
The test class-per-outcome structure mirrors the routing table in CLI Logging Expert: No production logging code added. Tests correctly exercise the logger interface via DevX UX Expert: No CLI surface changes. The tests validate that Supply Chain Security Expert: Five critical security properties are explicitly verified and deserve to be called out:
Non-blocking observation: No new security surface introduced. Auth Expert: Not activated -- PR changes only OSS Growth Hacker: Automated test improvement (0% to ~95%) on a security-critical module. The test file's structured docstrings and outcome-per-class organization make it a reference implementation for contributors writing future policy tests -- good for the contributor funnel. No direct conversion surface (README, quickstart, templates) affected by this PR. Side-channel to CEO: The Daily Test Improver agentic workflow autonomously generating 56 security-validated tests is itself a story angle ("APM maintains its own security test coverage with AI agents") that reinforces the AI-native development positioning. Flag for a future release narrative or CHANGELOG entry. CEO arbitrationSpecialists agree unanimously: this is a well-executed, security-relevant test coverage addition for a critical routing module. The Supply Chain Security Expert correctly identifies the five explicit security property verifications as the most valuable part of this PR. The Python Architect finds no structural issues; the test shape mirrors the production code's routing table -- the right organizational choice. No specialist raised a blocking concern, there are no trade-offs to arbitrate, and no breaking changes are introduced. The Growth Hacker's side-channel on the agentic self-improvement narrative is noted and will be considered for future release communications. Disposition: APPROVE. Required actions before mergeNone. Optional follow-ups
|
…ivation (#948) The label-name guard previously lived inside the agent prompt as Step 0 ('exit 0 if label != panel-review'). This had two failure modes: 1. EVERY label add on EVERY PR triggered the workflow, spent ~50s on pre_activation + activation + apm bundle restore + agent container spin-up, then asked the LLM to please exit early. Observed real cost: PR #943 was labelled 'automation'/'testing' (never 'panel-review') and the agent ran for 5 min 12 s before stopping. 2. The 'exit 0' instruction was a prompt-level request to the LLM, not a deterministic gate. An LLM that decides the PR diff is interesting can ignore the instruction and proceed. Fix: move the guard to gh-aw's 'on.steps:' (pre-activation step). When the triggering label is not 'panel-review' (and the event is not workflow_dispatch), the step exits 1 -> pre_activation job fails -> all downstream jobs (activation, apm, agent) skip. Total cost when filtered out: one ubuntu-slim runner for ~10s, no LLM, no bundle restore. gh-aw does not expose 'names:' on pull_request_target (verified at compile time), so 'on.steps:' is the cheapest available gate. The prompt's Step 0 is removed; a short note documents that filtering now happens at the workflow level. Lock file regenerated via 'gh aw compile pr-review-panel'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🤖 Test Improver here - I'm an automated AI assistant focused on improving tests for this repository.
Goal and Rationale
apm_cli.policy.outcome_routingis the single source of truth for all 9-outcome policy-discovery routing decisions across the install pipeline. It was extracted from two duplicated implementations in PR #832, but had no dedicated test file — only a few smoke tests intest_pr_832_findings.py.This module controls critical security behavior: whether a failed policy fetch causes an install to be blocked or silently allowed. The routing logic is non-trivial and the security properties need to be explicitly verified.
Approach
Added
tests/unit/policy/test_outcome_routing.pywith 56 unit tests covering the complete routing table:disabledhash_mismatchabsentno_git_remoteemptymalformedfetch_failure_defaultcache_miss_fetch_failfetch_failure_defaultgarbage_responsefetch_failure_defaultcached_stalepolicy.fetch_failurefoundSecurity properties explicitly verified:
hash_mismatchalways raisesPolicyViolationErrorregardless offetch_failure_default(even"warn")absent/no_git_remote/emptyare always fail-open even whenfetch_failure_default="block"— they indicate "no org policy", not a network errorraise_blocking_errors=False(dry-run path) suppresses all raises across the boardNonelogger is tolerated for non-CLI callers in every code path_FETCH_FAILURE_OUTCOMESis a strict subset of_NON_FOUND_LOGGED_OUTCOMES(invariant test)Coverage Impact
src/apm_cli/policy/outcome_routing.pyTrade-offs
PolicyFetchResultandApmPolicyare used directly (not mocked) for clarityTest Status
Reproducibility