Add agent capability tiers, rule units, and sub-agent delegation#81
oshorefueled wants to merge 44 commits into main
Conversation
- add provider-scoped capability-tier env fields and upward-only resolution
- update global and example config templates for tier-specific models
- cover parser and resolver behavior with focused provider tests
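The upward-only resolution named in this commit can be sketched as follows. This is an illustrative reconstruction, not the PR's actual code: `resolveCapabilityModel` and the short tier names are assumptions, although the review comments later in this thread do reference a `TIER_SEARCH_ORDER[requested]` lookup of this shape.

```typescript
// Upward-only tier resolution: a requested tier may fall back to a
// higher-capability model, never a lower one.
type CapabilityTier = 'low' | 'mid' | 'high';

// For each requested tier, the tiers to try, requested tier first.
const TIER_SEARCH_ORDER: Record<CapabilityTier, CapabilityTier[]> = {
  low: ['low', 'mid', 'high'],
  mid: ['mid', 'high'],
  high: ['high'],
};

function resolveCapabilityModel(
  requested: CapabilityTier,
  tierModels: Partial<Record<CapabilityTier, string>>,
  defaultModel: string,
): string {
  for (const tier of TIER_SEARCH_ORDER[requested]) {
    const candidate = tierModels[tier]?.trim();
    if (candidate) return candidate; // first non-blank tier value wins
  }
  return defaultModel; // no tier configured: fall back to the provider default
}
```

For example, requesting `mid` when only a high-tier model is configured resolves upward to that model, while requesting `high` with only a mid-tier model configured falls through to the default rather than downgrading.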
- add a capability-aware provider bundle for agent-mode model routing
- split top-level loop and lint execution across resolved capability tiers
- return final agent-loop text and cover the new routing in focused tests
- replace lint input with explicit rules[] rule calls
- add subordinate agent tool schema with capability-tier selection
- register the new tool descriptions and cover the contracts in tests
- extract shared token estimation and group matched rules under a budget
- render matched rule units explicitly in the agent system prompt
- precompute rule units before the agent loop and cover the flow in tests
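The budget-based grouping described in this commit can be sketched roughly like this. The `RuleMatch`/`RuleUnit` shapes and the characters-per-token heuristic are assumptions for illustration, not the PR's actual `buildMatchedRuleUnits` implementation.

```typescript
interface RuleMatch { ruleSource: string; body: string; }
interface RuleUnit { rules: RuleMatch[]; estimatedTokens: number; }

// Rough token estimate: ~4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function buildMatchedRuleUnits(matches: RuleMatch[], tokenBudget: number): RuleUnit[] {
  const units: RuleUnit[] = [];
  let current: RuleUnit = { rules: [], estimatedTokens: 0 };
  for (const match of matches) {
    const cost = estimateTokens(match.body);
    // Start a new unit when adding this rule would exceed the budget,
    // unless the unit is empty (an oversized rule still gets its own unit).
    if (current.rules.length > 0 && current.estimatedTokens + cost > tokenBudget) {
      units.push(current);
      current = { rules: [], estimatedTokens: 0 };
    }
    current.rules.push(match);
    current.estimatedTokens += cost;
  }
  if (current.rules.length > 0) units.push(current);
  return units;
}
```

Under this sketch an empty match list yields no units, matches that fit together share one unit, and a single rule larger than the budget still produces one unit rather than being dropped.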
- add a bundled structured-output schema with per-finding rule attribution
- run one provider request for each multi-rule lint call instead of per-rule loops
- cover bundled routing, prompt isolation, and severity preservation in tests
Note: Reviews paused
It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings. Use the following commands to manage reviews:
📝 Walkthrough

Introduces provider capability-tier resolution and upward-only fallback, refactors agent execution to group matched rules into token-budgeted "matched rule units", adds sub-agent delegation with bounded read-only tools, and extracts CLI agent/file evaluation and result-routing into focused modules; updates schemas, provider factory, and tests to match the new flows.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI
    participant CapabilityResolver as CapabilityResolver
    participant DefaultProvider as DefaultProvider
    participant TierProvider as TierProvider
    participant AgentExecutor as AgentExecutor
    participant ToolHandlers as ToolHandlers
    participant SubAgent as SubAgent
    CLI->>CapabilityResolver: createCapabilityProviderResolver(env)
    CLI->>AgentExecutor: runAgentExecutor(scanPaths, systemDirective, capabilityResolver)
    AgentExecutor->>ToolHandlers: createToolHandlers(defaultProvider, resolveCapabilityProvider, ...)
    ToolHandlers->>TierProvider: resolveCapabilityProvider(requestedTier)
    ToolHandlers->>TierProvider: runAgentToolLoop(lint/agent)
    ToolHandlers->>SubAgent: runSubAgent(task, tools, provider=tierProvider)
    SubAgent->>TierProvider: runAgentToolLoop(sub-agent systemPrompt)
    SubAgent-->>ToolHandlers: compactResult / usage
    TierProvider-->>ToolHandlers: lint findings / usage
    ToolHandlers-->>AgentExecutor: append findings / nested usage
    AgentExecutor-->>CLI: aggregated findings, usage, operational flags
```
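The delegation step in the flow above can be illustrated with a minimal tool whitelist. `selectSubAgentTools` and the handler shape are hypothetical, but the four read-only tool names come from the design documents reviewed later in this thread.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

// The sub-agent is only handed read-only tools; write-capable tools are
// never copied into its tool set.
const READ_ONLY_TOOLS = ['read_file', 'search_files', 'list_directory', 'search_content'] as const;

function selectSubAgentTools(allTools: Record<string, ToolHandler>): Record<string, ToolHandler> {
  const selected: Record<string, ToolHandler> = {};
  for (const name of READ_ONLY_TOOLS) {
    if (allTools[name]) selected[name] = allTools[name]; // whitelist, not blacklist
  }
  return selected;
}
```

A whitelist is the safer shape here: a new write-capable tool added to the main registry stays out of the sub-agent's reach by default.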
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
🧪 Generate unit tests (beta)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/config/global-config.ts (1)
13-86: ⚠️ Potential issue | 🟡 Minor: Keep the generated config template aligned with `.env.example`.
These two copy-paste templates have already drifted: this file includes the OpenAI tier block while `.env.example` still starts at Azure, and the Anthropic default model values differ between the two files. Shipping both as-is will give users conflicting keys/defaults. Please sync them in this PR, or better, generate both from shared provider metadata.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/config/global-config.ts` around lines 13 - 86, DEFAULT_GLOBAL_CONFIG_TEMPLATE diverges from .env.example (provider order and Anthropic model defaults); update the DEFAULT_GLOBAL_CONFIG_TEMPLATE string so it matches .env.example exactly (remove or reorder the OpenAI tier block so the file starts at Azure if .env.example does, and make ANTHROPIC_MODEL / ANTHROPIC_HIGH_CAPABILITY_MODEL / ANTHROPIC_MID_CAPABILITY_MODEL / ANTHROPIC_LOW_CAPABILITY_MODEL values identical to .env.example). Preferably refactor to generate DEFAULT_GLOBAL_CONFIG_TEMPLATE from the shared provider metadata source (or a single canonical constants object) rather than copy-pasting, and ensure the template string and the canonical source use the same symbols (e.g., DEFAULT_GLOBAL_CONFIG_TEMPLATE, ANTHROPIC_MODEL) so future drift is prevented.
🧹 Nitpick comments (7)
src/providers/model-capability.ts (2)
1-2: Decouple capability resolution from `provider-factory`.
Importing `ProviderType` from the SDK-heavy factory makes this config-only module transitively pull provider transport code into runtime consumers. Moving `ProviderType` into a small shared types/constants module would keep this resolver lightweight and avoid that dependency leak. As per coding guidelines, "Define shared domain constants, enums, or types for core runtime concepts; avoid magic strings in shared runtime code."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/providers/model-capability.ts` around lines 1 - 2, The module model-capability.ts currently imports ProviderType from provider-factory which drags in heavy SDK/transport code; extract ProviderType into a lightweight shared types/constants module (e.g., create a new types file exporting the ProviderType enum/const) and update model-capability.ts to import ProviderType from that new shared module instead of './provider-factory'; ensure the new shared module only contains the type/enum (no runtime/provider logic) and adjust any other files that import ProviderType to use the new module.
89-90: Use a custom config/provider error here too.
Line 90 is another new raw `Error` path in `src/providers`. Align it with the repo's error hierarchy so unsupported-provider states surface consistently. As per coding guidelines, "Prefer the repository's custom error hierarchy over native `Error`; catch blocks use `unknown` type and extend existing custom error types before introducing raw exceptions."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/providers/model-capability.ts` around lines 89 - 90, Replace the raw throw new Error in the default branch with the repository's custom unsupported-provider error type (e.g., UnsupportedProviderError or the appropriate subclass of your existing custom error hierarchy) so unsupported LLM_PROVIDER values surface consistently; import the custom error at the top of src/providers/model-capability.ts, construct it with a descriptive message that includes the actual (envConfig as { LLM_PROVIDER?: string }).LLM_PROVIDER value, and ensure any upstream catch sites that currently handle unknown are compatible with the custom error type.
src/providers/provider-factory.ts (1)
28-43: Use the repo error hierarchy for unsupported-provider failures.
Line 41 adds a new raw `Error` path for configuration failures. Please throw the existing provider/config error type here so setup failures stay classifiable. As per coding guidelines, "Prefer the repository's custom error hierarchy over native `Error`; catch blocks use `unknown` type and extend existing custom error types before introducing raw exceptions."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/providers/provider-factory.ts` around lines 28 - 43, The switch in resolveDefaultModelIdentifier currently throws a raw Error for unknown provider; replace that with the repository's custom provider/config error (e.g., UnsupportedProviderError or ProviderConfigError) and include the LLM_PROVIDER value in the error payload/message so failures remain classifiable; update the top of the file to import the custom error type, and construct/throw it instead of new Error(`Unsupported provider type: ...`) inside resolveDefaultModelIdentifier while preserving the original message content and context.
docs/plans/2026-04-03-agent-rule-units-and-subordinate-agent.md (1)
521-524: Minor style: adverb repetition.
Line 523 uses "only" twice: "returns compact final text only - the sub-agent only receives". Consider rephrasing for clarity.
✏️ Suggested fix
```diff
-- the sub-agent runs synchronously and returns compact final text only
-- the sub-agent only receives `read_file`, `search_files`, `list_directory`, `search_content`
+- the sub-agent runs synchronously and returns compact final text
+- the sub-agent receives only `read_file`, `search_files`, `list_directory`, `search_content`
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/2026-04-03-agent-rule-units-and-subordinate-agent.md` around lines 521 - 524, The phrasing repeats the adverb "only" in the two bullet points; change the wording so it's not redundant—for example, rewrite "the sub-agent runs synchronously and returns compact final text only" to "the sub-agent runs synchronously and returns compact final text; it receives only `read_file`, `search_files`, `list_directory`, and `search_content`" so the two concepts (return behavior and permitted actions) are clear and the second "only" applies directly to the list of allowed operations.
tests/agent/rule-units.test.ts (1)
58-91: Consider adding edge case tests for boundary conditions.
The token budget splitting test is well-implemented, but consider adding tests for edge cases:
- Empty `fileRuleMatches` array (should return an empty array)
- Single match with budget smaller than its estimated tokens
- All matches fit under budget (no splitting needed)
These would improve confidence in the boundary handling logic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/agent/rule-units.test.ts` around lines 58 - 91, Add three new tests for buildMatchedRuleUnits covering boundary cases: (1) empty fileRuleMatches array should return an empty array, (2) a single match whose prompt estimated tokens exceed the provided budget should still produce one unit containing that single rule (with estimatedTokens present and numeric), and (3) multiple matches whose combined estimated tokens are <= budget should produce a single unit containing all rules. Use the existing test helpers (makePrompt / PromptFile) to construct prompts and pass the tokenBudget argument to buildMatchedRuleUnits to validate these behaviors.
docs/superpowers/specs/2026-04-02-agent-rule-units-design.md (1)
281: Minor: Consider conciseness improvement.
"all of the delegated context" could be shortened to "the delegated context" without losing meaning.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/superpowers/specs/2026-04-02-agent-rule-units-design.md` at line 281, Update the sentence in the spec that reads "receive the result, and continue without carrying all of the delegated context forward in the main conversation" to the more concise "receive the result, and continue without carrying the delegated context forward in the main conversation" by removing the word "all"; locate the sentence in the paragraph that begins "The sub-agent `agent` tool is not limited to exploration" and replace the phrase to keep meaning while reducing redundancy.
src/agent/executor.ts (1)
536-538: Use a repo error type for the missing-provider guard.
This new entry-point check throws raw `Error`, which makes provider-resolution failures harder to classify consistently with the rest of the agent path. As per coding guidelines, "Prefer the repository's custom error hierarchy over native `Error`; catch blocks use `unknown` type and extend existing custom error types before introducing raw exceptions."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/executor.ts` around lines 536 - 538, Replace the raw throw in the runAgentExecutor missing-provider guard with the repository's custom error type (e.g., ProviderResolutionError or AgentError): import the appropriate error class from the project's errors module, then throw new ProviderResolutionError('runAgentExecutor requires at least one provider.') (or wrap additional context/metadata as the custom error supports) instead of throw new Error(...); update the import list to include the chosen error type and ensure no other code expects a raw Error for this check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.env.example:
- Around line 49-52: Remove the extra empty line before the search provider
section in .env.example so the comment "# Search Provider Configuration"
immediately follows the last variable entry (e.g., after the commented
"BEDROCK_TEMPERATURE" line); this eliminates the dotenv-linter warning by
ensuring there are no consecutive blank lines between environment blocks.
In `@docs/plans/2026-04-03-agent-rule-units-and-subordinate-agent.md`:
- Around line 1-13: The document jumps from the top-level heading "# Agent Rule
Units and Sub-Agent Implementation Plan" directly to "### Task 1", breaking
heading hierarchy; update the task section(s) so they use a second-level heading
(change "### Task 1" to "## Task 1" or insert an H2 before the task list) and
ensure subsequent task headings consistently use H2 (e.g., "## Task 2", etc.) so
the document follows a correct H1→H2→H3 structure and improves navigation and
accessibility.
In `@src/agent/executor.ts`:
- Around line 148-160: The function buildEffectiveRuleBody currently lets
reviewInstruction replace prompt.body which causes tool evaluations to run on
non-canonical text; change it so prompt.body is always included first
(preserving the canonical rule text from PromptFile), then append the
reviewInstruction (e.g., as a supplemental or override note) only if present,
and finally append the context block if provided; update the implementation in
buildEffectiveRuleBody to concatenate prompt.body + optional reviewInstruction
(with clear separation) + optional context so ruleSource attribution remains
accurate.
- Around line 683-695: The loop is recording findings from result.data.findings
without ensuring finding.ruleSource is one of the requested rules
(promptBySource), which lets unrelated/typo sources (e.g., "C") be recorded and
then throw via buildUnknownRuleSourceError after some findings were appended.
Before calling resolvePromptBySource and appendInlineFinding, check whether
finding.ruleSource exists in the promptBySource (or validSources) set; if it is
missing, skip this finding (or collect/report it inline) instead of throwing to
avoid aborting the entire batch after partial appends. Update the loop around
resolvePromptBySource, appendInlineFinding, and buildUnknownRuleSourceError to
only call appendInlineFinding for sources present in promptBySource and handle
unknown sources by continuing rather than throwing.
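The skip-instead-of-throw behavior can be sketched as a partition step run before anything is appended; `partitionFindings` and the `Finding` shape are illustrative, not the executor's actual types:

```typescript
interface Finding { ruleSource: string; message: string; }

// Separate findings whose ruleSource is known from strays (e.g. a typo
// like "C") so one bad source no longer aborts a partially-applied batch.
function partitionFindings(
  findings: Finding[],
  promptBySource: Map<string, unknown>,
): { known: Finding[]; unmatched: Finding[] } {
  const known: Finding[] = [];
  const unmatched: Finding[] = [];
  for (const finding of findings) {
    (promptBySource.has(finding.ruleSource) ? known : unmatched).push(finding);
  }
  return { known, unmatched }; // append `known`; report `unmatched` instead of throwing
}
```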
In `@tests/env-parser.test.ts`:
- Around line 233-266: The test case "parses provider-scoped capability-tier
model fields for Gemini and Bedrock" is incorrectly nested inside the "OpenAI
Configuration" describe; move that entire it(...) block out of the "OpenAI
Configuration" describe and place it in its own describe (e.g.,
describe('Provider-scoped capability-tier models', ...)) or alongside other
provider tests so it runs independently; keep the test body unchanged (it
references parseEnvironment and ProviderType.Gemini /
ProviderType.AmazonBedrock) and adjust surrounding braces/indentation to close
the OpenAI describe before adding the new describe.
---
Outside diff comments:
In `@src/config/global-config.ts`:
- Around line 13-86: DEFAULT_GLOBAL_CONFIG_TEMPLATE diverges from .env.example
(provider order and Anthropic model defaults); update the
DEFAULT_GLOBAL_CONFIG_TEMPLATE string so it matches .env.example exactly (remove
or reorder the OpenAI tier block so the file starts at Azure if .env.example
does, and make ANTHROPIC_MODEL / ANTHROPIC_HIGH_CAPABILITY_MODEL /
ANTHROPIC_MID_CAPABILITY_MODEL / ANTHROPIC_LOW_CAPABILITY_MODEL values identical
to .env.example). Preferably refactor to generate DEFAULT_GLOBAL_CONFIG_TEMPLATE
from the shared provider metadata source (or a single canonical constants
object) rather than copy-pasting, and ensure the template string and the
canonical source use the same symbols (e.g., DEFAULT_GLOBAL_CONFIG_TEMPLATE,
ANTHROPIC_MODEL) so future drift is prevented.
---
Nitpick comments:
In `@docs/plans/2026-04-03-agent-rule-units-and-subordinate-agent.md`:
- Around line 521-524: The phrasing repeats the adverb "only" in the two bullet
points; change the wording so it's not redundant—for example, rewrite "the
sub-agent runs synchronously and returns compact final text only" to "the
sub-agent runs synchronously and returns compact final text; it receives only
`read_file`, `search_files`, `list_directory`, and `search_content`" so the two
concepts (return behavior and permitted actions) are clear and the second "only"
applies directly to the list of allowed operations.
In `@docs/superpowers/specs/2026-04-02-agent-rule-units-design.md`:
- Line 281: Update the sentence in the spec that reads "receive the result, and
continue without carrying all of the delegated context forward in the main
conversation" to the more concise "receive the result, and continue without
carrying the delegated context forward in the main conversation" by removing the
word "all"; locate the sentence in the paragraph that begins "The sub-agent
`agent` tool is not limited to exploration" and replace the phrase to keep
meaning while reducing redundancy.
In `@src/agent/executor.ts`:
- Around line 536-538: Replace the raw throw in the runAgentExecutor
missing-provider guard with the repository's custom error type (e.g.,
ProviderResolutionError or AgentError): import the appropriate error class from
the project's errors module, then throw new
ProviderResolutionError('runAgentExecutor requires at least one provider.') (or
wrap additional context/metadata as the custom error supports) instead of throw
new Error(...); update the import list to include the chosen error type and
ensure no other code expects a raw Error for this check.
In `@src/providers/model-capability.ts`:
- Around line 1-2: The module model-capability.ts currently imports ProviderType
from provider-factory which drags in heavy SDK/transport code; extract
ProviderType into a lightweight shared types/constants module (e.g., create a
new types file exporting the ProviderType enum/const) and update
model-capability.ts to import ProviderType from that new shared module instead
of './provider-factory'; ensure the new shared module only contains the
type/enum (no runtime/provider logic) and adjust any other files that import
ProviderType to use the new module.
- Around line 89-90: Replace the raw throw new Error in the default branch with
the repository's custom unsupported-provider error type (e.g.,
UnsupportedProviderError or the appropriate subclass of your existing custom
error hierarchy) so unsupported LLM_PROVIDER values surface consistently; import
the custom error at the top of src/providers/model-capability.ts, construct it
with a descriptive message that includes the actual (envConfig as {
LLM_PROVIDER?: string }).LLM_PROVIDER value, and ensure any upstream catch sites
that currently handle unknown are compatible with the custom error type.
In `@src/providers/provider-factory.ts`:
- Around line 28-43: The switch in resolveDefaultModelIdentifier currently
throws a raw Error for unknown provider; replace that with the repository's
custom provider/config error (e.g., UnsupportedProviderError or
ProviderConfigError) and include the LLM_PROVIDER value in the error
payload/message so failures remain classifiable; update the top of the file to
import the custom error type, and construct/throw it instead of new
Error(`Unsupported provider type: ...`) inside resolveDefaultModelIdentifier
while preserving the original message content and context.
In `@tests/agent/rule-units.test.ts`:
- Around line 58-91: Add three new tests for buildMatchedRuleUnits covering
boundary cases: (1) empty fileRuleMatches array should return an empty array,
(2) a single match whose prompt estimated tokens exceed the provided budget
should still produce one unit containing that single rule (with estimatedTokens
present and numeric), and (3) multiple matches whose combined estimated tokens
are <= budget should produce a single unit containing all rules. Use the
existing test helpers (makePrompt / PromptFile) to construct prompts and pass
the tokenBudget argument to buildMatchedRuleUnits to validate these behaviors.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b3ae15f8-2b25-445f-ba19-dd5d08d43a64
📒 Files selected for processing (35)
- .env.example
- CONFIGURATION.md
- README.md
- docs/logs/2026-04-03-agent-rule-units-and-subordinate-agent.log.md
- docs/plans/2026-04-03-agent-rule-units-and-subordinate-agent.md
- docs/superpowers/specs/2026-04-02-agent-rule-units-design.md
- src/agent/executor.ts
- src/agent/prompt-builder.ts
- src/agent/rule-units.ts
- src/agent/sub-agent.ts
- src/agent/tools-registry.ts
- src/agent/types.ts
- src/boundaries/user-instruction-loader.ts
- src/cli/commands.ts
- src/cli/orchestrator.ts
- src/cli/types.ts
- src/config/global-config.ts
- src/prompts/schema.ts
- src/providers/capability-provider-bundle.ts
- src/providers/llm-provider.ts
- src/providers/model-capability.ts
- src/providers/provider-factory.ts
- src/providers/vercel-ai-provider.ts
- src/schemas/env-schemas.ts
- src/utils/token-estimate.ts
- tests/agent/agent-executor.test.ts
- tests/agent/prompt-builder.test.ts
- tests/agent/rule-units.test.ts
- tests/agent/sub-agent.test.ts
- tests/agent/types-contract.test.ts
- tests/env-parser.test.ts
- tests/global-config.test.ts
- tests/orchestrator-agent-output.test.ts
- tests/provider-factory.test.ts
- tests/providers/vercel-ai-provider-agent-loop.test.ts
Agent Capability Tiers, Rule Units, and Sub-Agents
1. Capability Changes
After merge, you can:
2. Boundary Changes
3. High-Level Flow
4. Invariants
5. Failure Modes and Signals
6. Verification
Checks run:
Quick manual verification:
7. Rollback
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/agent/executor.ts`:
- Around line 535-538: Replace the native Error thrown when no provider is
configured with the repository's ConfigError: inside runAgentExecutor where
defaultProvider is determined (the block checking if (!defaultProvider)), change
throw new Error(...) to throw new ConfigError('runAgentExecutor requires at
least one provider.'); and add the necessary import for ConfigError at the top
of the file so the symbol resolves.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5854367c-2165-4cdd-92e6-a0ff615a5689
📒 Files selected for processing (4)
- .env.example
- artifacts/2026-04-03-agent-capability-tiers-rule-units-and-sub-agents.md
- src/agent/executor.ts
- tests/agent/agent-executor.test.ts
✅ Files skipped from review due to trivial changes (1)
- artifacts/2026-04-03-agent-capability-tiers-rule-units-and-sub-agents.md
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/agent/agent-executor.test.ts
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/providers/model-capability.ts`:
- Around line 15-27: The resolver resolveCapabilityValue currently returns
defaultValue without validating it; change resolveCapabilityValue to treat
empty/whitespace strings as missing and fail fast by throwing a descriptive
error when no non-blank value is found across TIER_SEARCH_ORDER[requested] and
the provided defaultValue is blank, so callers (e.g., capability-provider-bundle
and provider-factory) receive a config-time exception rather than a propagated
invalid identifier.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7788eb18-c713-441f-acc5-01702ff467fc
📒 Files selected for processing (5)
- .env.example
- artifacts/2026-04-03-agent-capability-tiers-rule-units-and-sub-agents.md
- src/agent/executor.ts
- src/providers/model-capability.ts
- src/providers/provider-factory.ts
✅ Files skipped from review due to trivial changes (2)
- artifacts/2026-04-03-agent-capability-tiers-rule-units-and-sub-agents.md
- src/agent/executor.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- .env.example
- src/providers/provider-factory.ts
- trim capability-tier identifiers before resolving provider models
- throw ConfigError when no non-blank default or tier value exists
- cover the config-time failure path with a focused provider test
- route uncapped agent-mode work through the default provider and let lint opt into capability tiers per call
- replace bundled lint naming with merged terminology and share the base finding schema across merged responses
- rename the provider bundle to a resolver, shorten capability tier names, and remove the tracked artifact and redundant sub-agent write guard
- Keep orchestrator as a thin composition root for mode dispatch, aggregation, formatter selection, and final reporting
- Extract per-file evaluation, prompt-result processing, violation reporting, and agent-mode flow into dedicated modules
- Preserve runtime behavior while reducing cross-cutting coupling and making future changes easier to test and review
- Rename evaluation flow modules to plain responsibility-based names (agent-evaluation, file-evaluation, result-routing, issue-output)
- Update orchestrator and internal imports to match the new naming convention without changing runtime behavior
Actionable comments posted: 7
🧹 Nitpick comments (2)
src/agent/sub-agent.ts (1)
42-48: Missing `usage` in error response path.
When an error is thrown, any partial `usage` from the provider is lost. If the provider tracks token usage before failing, this data should be preserved for aggregation.
♻️ Proposed fix to preserve usage on error
```diff
 } catch (error) {
   const message = error instanceof Error ? error.message : String(error);
   return {
     ok: false,
     error: message,
+    // Note: usage is not available from thrown errors; if needed,
+    // the provider would need to include it in a custom error type
   };
 }
```
This is a minor consideration since standard errors don't carry usage data, but worth noting if future provider implementations need to surface partial usage on failure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/sub-agent.ts` around lines 42 - 48, The catch block in sub-agent.ts currently returns only { ok: false, error: message } and drops any provider token usage; change the error path in the catch inside the function handling provider calls (the catch that builds the returned object) to include usage when present by extracting usage from the thrown error or from the local usage variable (e.g., check (error as any).usage || usage) and return it as a usage field alongside ok:false and error so partial token usage is preserved for aggregation.
src/agent/types.ts (1)
35-47: Consider using `TRIMMED_NON_BLANK_STRING` for consistency.
The `TOP_LEVEL_REFERENCE_SCHEMA` and `TOP_LEVEL_REPORT_INPUT_SCHEMA` use `z.string().min(1)` without trimming, while the new schemas use `TRIMMED_NON_BLANK_STRING`. This could allow whitespace-only strings like `" "` to pass validation.
♻️ Proposed fix for consistency
```diff
 export const TOP_LEVEL_REFERENCE_SCHEMA = z.object({
-  file: z.string().min(1),
+  file: TRIMMED_NON_BLANK_STRING,
   startLine: z.number().int().positive(),
   endLine: z.number().int().positive(),
 });

 export const TOP_LEVEL_REPORT_INPUT_SCHEMA = z.object({
   kind: z.literal('top-level'),
-  ruleSource: z.string().min(1),
-  message: z.string().min(1),
+  ruleSource: TRIMMED_NON_BLANK_STRING,
+  message: TRIMMED_NON_BLANK_STRING,
   suggestion: z.string().optional(),
   references: z.array(TOP_LEVEL_REFERENCE_SCHEMA).optional(),
 });
```
Verify each finding against the current code and only fix it if needed. In `@src/agent/types.ts` around lines 35 - 47, Replace plain z.string().min(1) usages with the existing TRIMMED_NON_BLANK_STRING to prevent whitespace-only values: update TOP_LEVEL_REFERENCE_SCHEMA's file to use TRIMMED_NON_BLANK_STRING, update TOP_LEVEL_REPORT_INPUT_SCHEMA's ruleSource and message to use TRIMMED_NON_BLANK_STRING, and make suggestion TRIMMED_NON_BLANK_STRING.optional() if you want to trim/validate non-blank optional suggestions; keep references as-is. Ensure you import or reference the TRIMMED_NON_BLANK_STRING constant where these schemas are defined.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cli/agent-evaluation.ts`:
- Around line 96-107: In buildAgentRuleScores (used by evaluateFilesInAgentMode)
don't treat matchedFiles.size === 0 as a perfect pass; instead either omit that
rule from results or mark it explicitly as "not evaluated" (e.g., set score to 0
and scoreText to 'N/A') so disabled/out-of-scope prompts aren't shown as
successful; update the same logic where it's repeated (around the other block at
lines ~221-225) and reference normalizeRuleSource, findingsByRule, and
filesByRuleSource when locating the conditional to change.
In `@src/cli/file-evaluation.ts`:
- Around line 248-250: The map key currently uses (p.meta.id || p.filename)
which allows collisions across packs; change the key to the pack-qualified rule
name used elsewhere so entries are unique per pack. Specifically, build the key
by prefixing the rule id or filename with the pack identifier (for example using
p.meta.pack or p.pack if present) and use that value when calling
allScores.set(ruleName, promptResult.scoreEntries) so summaries are namespaced
by pack rather than overwritten by duplicate ids.
- Around line 71-79: Replace the generic Error thrown when frontmatter is
missing criteria with the repository's config error type: catch the condition
inside the evaluatorType !== baseEvaluatorType branch and throw ConfigError (or
the repo's equivalent) instead of new Error, include the promptFile.filename in
the message, and ensure ConfigError is imported where file-evaluation.ts uses
evaluatorType, baseEvaluatorType, meta and promptFile.filename so
handleUnknownError() can classify the validation failure properly.
In `@src/cli/issue-output.ts`:
- Around line 22-31: The buildRuleName function can emit leading dots when
packName is an empty string; update buildRuleName (which takes packName, ruleId,
criterionId) to drop empty or falsy segments before joining so empty packName
(or empty criterionId) doesn't produce malformed names; e.g., filter the parts
array to remove ''/falsy values before returning parts.join('.') so only
non-empty segments are joined.
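The fix reduces to filtering falsy segments before joining; a minimal sketch:

```typescript
// Drop empty segments so an empty pack name can't produce ".rule.criterion".
function buildRuleName(packName: string, ruleId: string, criterionId?: string): string {
  return [packName, ruleId, criterionId].filter((part) => Boolean(part)).join('.');
}
```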
In `@src/cli/result-routing.ts`:
- Around line 395-403: The lookup for criterionId drops to prompt-level because
meta.criteria is matched case-sensitively; update the match in the
violationsByCriterion loop so criterionId is resolved case-insensitively by
comparing criterionName.toLowerCase() with each c.name.toLowerCase() (guarding
for undefined names) before calling buildRuleName(promptFile.pack, promptId,
criterionId) so criterion-specific rules are used even when the model returns
different case.
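The case-insensitive criterion lookup described above can be sketched as below; `CriterionSpec` is a simplified stand-in for the repository's criterion type, not the real definition:

```typescript
// Resolve a criterion id case-insensitively, guarding against undefined names,
// so a model returning 'clarity' still matches a criterion named 'Clarity'.
interface CriterionSpec {
  id: string;
  name?: string;
}

function resolveCriterionId(
  criteria: CriterionSpec[],
  criterionName: string,
): string | undefined {
  const wanted = criterionName.toLowerCase();
  const match = criteria.find(
    (c) => c.name !== undefined && c.name.toLowerCase() === wanted,
  );
  return match?.id;
}
```

Callers would fall back to the prompt-level rule name only when this returns `undefined`.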
In `@src/config/global-config.ts`:
- Around line 18-20: Replace the outdated template comment that begins with "#
In agent mode, the top-level loop uses the default provider, lint uses the
default provider unless a call passes model, and delegated sub-agents do the
same." in src/config/global-config.ts so it reflects the new defaults: top-level
agent work should state it defaults to high-capability, the bundled "lint"
should state it defaults to mid-capability, and delegated "agent" calls should
state they default to high-capability when "model" is omitted; update the
wording accordingly and keep the same comment block location and formatting.
In `@tests/agent/agent-executor.test.ts`:
- Around line 348-395: The test "uses the default provider when the agent tool
omits model" incorrectly asserts that resolveCapabilityProvider should not be
called and thus permits the executor to use the default provider for delegated
agent() calls; update the test to expect the executor will call
resolveCapabilityProvider with the high-capability tier when model is omitted.
Specifically, in the test around runAgentExecutor invocation, change the
resolveCapabilityProvider stub to assert it is called with argument 'high-cap'
(or 'high-capability' if that is the code's canonical string) and return the
high-capability provider, incrementing requestedTierCount accordingly, and
adjust the final expectation to ensure requestedTierCount reflects that
resolveCapabilityProvider('high-cap') was invoked; keep references to
runAgentExecutor and resolveCapabilityProvider so the change is applied to the
right place.
---
Nitpick comments:
In `@src/agent/sub-agent.ts`:
- Around line 42-48: The catch block in sub-agent.ts currently returns only {
ok: false, error: message } and drops any provider token usage; change the error
path in the catch inside the function handling provider calls (the catch that
builds the returned object) to include usage when present by extracting usage
from the thrown error or from the local usage variable (e.g., check (error as
any).usage || usage) and return it as a usage field alongside ok:false and error
so partial token usage is preserved for aggregation.
In `@src/agent/types.ts`:
- Around line 35-47: Replace plain z.string().min(1) usages with the existing
TRIMMED_NON_BLANK_STRING to prevent whitespace-only values: update
TOP_LEVEL_REFERENCE_SCHEMA's file to use TRIMMED_NON_BLANK_STRING, update
TOP_LEVEL_REPORT_INPUT_SCHEMA's ruleSource and message to use
TRIMMED_NON_BLANK_STRING, and make suggestion
TRIMMED_NON_BLANK_STRING.optional() if you want to trim/validate non-blank
optional suggestions; keep references as-is. Ensure you import or reference the
TRIMMED_NON_BLANK_STRING constant where these schemas are defined.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ca0756e3-a098-42fd-8d94-ea709272b01a
📒 Files selected for processing (22)
- CONFIGURATION.md
- README.md
- src/agent/executor.ts
- src/agent/sub-agent.ts
- src/agent/types.ts
- src/cli/agent-evaluation.ts
- src/cli/agent-mode-capability.ts
- src/cli/commands.ts
- src/cli/file-evaluation.ts
- src/cli/issue-output.ts
- src/cli/orchestrator.ts
- src/cli/result-routing.ts
- src/cli/types.ts
- src/config/global-config.ts
- src/prompts/schema.ts
- src/providers/capability-provider-resolver.ts
- src/providers/model-capability.ts
- tests/agent/agent-executor.test.ts
- tests/agent/types-contract.test.ts
- tests/orchestrator-agent-output.test.ts
- tests/prompt-schema.test.ts
- tests/provider-factory.test.ts
✅ Files skipped from review due to trivial changes (2)
- CONFIGURATION.md
- README.md
🚧 Files skipped from review as they are similar to previous changes (5)
- src/cli/types.ts
- src/cli/commands.ts
- tests/agent/types-contract.test.ts
- tests/provider-factory.test.ts
- src/prompts/schema.ts
```
# In agent mode, the top-level loop uses the default provider,
# lint uses the default provider unless a call passes model,
# and delegated sub-agents do the same.
```
Update the agent-mode defaults in this template comment.
This text conflicts with the behavior described in this PR: top-level agent work should default to high-capability, bundled lint should default to mid-capability, and delegated agent calls should default to high-capability when model is omitted. Keeping this as-is can mislead users configuring tiers.
Suggested wording fix
```diff
-# In agent mode, the top-level loop uses the default provider,
-# lint uses the default provider unless a call passes model,
-# and delegated sub-agents do the same.
+# In agent mode, the top-level loop defaults to high-capability.
+# Bundled lint calls default to mid-capability unless model is explicitly set.
+# Delegated sub-agents default to high-capability when model is omitted.
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-# In agent mode, the top-level loop uses the default provider,
-# lint uses the default provider unless a call passes model,
-# and delegated sub-agents do the same.
+# In agent mode, the top-level loop defaults to high-capability.
+# Bundled lint calls default to mid-capability unless model is explicitly set.
+# Delegated sub-agents default to high-capability when model is omitted.
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/config/global-config.ts` around lines 18 - 20, Replace the outdated
template comment that begins with "# In agent mode, the top-level loop uses the
default provider, lint uses the default provider unless a call passes model, and
delegated sub-agents do the same." in src/config/global-config.ts so it reflects
the new defaults: top-level agent work should state it defaults to
high-capability, the bundled "lint" should state it defaults to mid-capability,
and delegated "agent" calls should state they default to high-capability when
"model" is omitted; update the wording accordingly and keep the same comment
block location and formatting.
- Extract resolvePresetsDir() into src/cli/preset-resolution.ts
- Both commands.ts and validate-command.ts now import and call the shared utility
- Utility handles dev mode (src/cli/ → ../../presets) and built mode (dist/cli/ → ../presets) via meta.json detection
- Eliminates path divergence risk between validate and main commands across execution modes

- Centralized rule-matching logic into `src/rules/matched-prompts.ts` with a single `resolveMatchedPromptsForFile()` export
- Replaced inline glob/pack/override resolution in `src/cli/file-evaluation.ts` with a call to the shared service
- Replaced inline glob/pack/override resolution in `src/agent/executor.ts` (`buildFileRuleMatches`) with a call to the shared service
- Eliminates divergence risk between standard and agent evaluation paths: both now use identical matching semantics and will stay in sync

- Add NoConfigurationFoundError to the errors module with code NO_CONFIGURATION_FOUND
- Throw NoConfigurationFoundError from ScanPathResolver instead of bare new Error(...)
- Replace string-based isNoConfigurationFoundError guard in executor.ts with instanceof check
- Remove NO_CONFIGURATION_FOUND_PREFIX constant from executor.ts
- Catch NoConfigurationFoundError in file-evaluation.ts and skip files gracefully

- Move printFileHeader call to after resolveMatchedPromptsForFile succeeds so orphaned headers are never printed for files with no matching config
- Add tests/file-evaluation.test.ts with a unit test verifying evaluateFile returns zeroed metrics (no throw) when the file matches no scanPaths entry

- Assert tokenUsage.totalInputTokens and totalOutputTokens are both 0 on the zeroed-metrics result
- Add afterEach(() => vi.restoreAllMocks()) to clean up console spies between tests

- Extract IssueSink interface + 4 format-specific implementations (LineIssueSink, ValeJsonIssueSink, StructuredIssueSink, JsonIssueSink) into result-routing/issue-sink.ts; createIssueSink() factory owns all format branching so routing logic stays format-agnostic
- Extract check evaluation routing into result-routing/check-routing.ts; handles violation grouping by criterion, score calculation, and error tracking for check-mode prompts
- Extract judge evaluation routing into result-routing/judge-routing.ts; handles per-criterion extraction, score normalization, target validation, and EvaluationScore accumulation for judge-mode prompts
- Extract debug artifact writing into result-routing/debug-artifact.ts; writeCheckRoutingDebugArtifact and writeJudgeRoutingDebugArtifact are now isolated from the routing logic
- result-routing.ts reduced to ~50-line coordinator: creates sink, guards on isJudgeResult, delegates to routeCheckResult or routeJudgeResult
- issue-output.ts is now a thin helper: buildRuleName, getViolationFilterResults, locateAndReportViolations — all format-agnostic, delegating to IssueSink
- Reduces cognitive complexity of the original monolithic result-routing file; format branching is isolated to a single createIssueSink() factory

- Construct IssueSink once per agent evaluation run in evaluateFilesInAgentMode
- Pass sink into reportAgentFinding instead of raw outputFormat/jsonFormatter
- Remove reportIssue() wrapper from issue-output.ts (no remaining call sites)
- Remove toSinkIssue() private helper (only served reportIssue)
- Remove ReportIssueParams interface from types.ts (no longer referenced)
- issue-output.ts is now a thin helper delegating to IssueSink, with no inline format branching

…types
- Change StructuredIssueSink.formatter from private to protected so JsonIssueSink can access it via cast instead of redeclaring it
- Remove redundant private readonly jsonFormatter in JsonIssueSink; addEvaluationScore now casts this.formatter to JsonFormatter
- Remove unused interfaces from src/cli/types.ts: ProcessViolationsParams, ProcessCriterionParams, ProcessCriterionResult, ValidationParams
- Drop imports that became orphaned after the removals: PromptMeta, PromptCriterionSpec, JudgeResult, Severity, ScoreComponent

- Removed src/errors/validation-errors.ts: confirmed unused via grep (0 import hits outside the file itself)
- Removed src/boundaries/yaml-parser.ts: confirmed unused via grep (0 consumer hits; only reference was re-export in boundaries/index.ts, which is also removed)
- Moved @types/micromatch from dependencies to devDependencies (compile-time types only, not needed at runtime)
- Fixed async-without-await lint error in tests/rule-matching.test.ts introduced by a prior mock

…t params
- Remove RequestBuilder class and EvalContext type; runPromptStructured now takes (systemPrompt, userMessage, schema) — callers own prompt assembly instead of delegating to an injected builder
- Add src/prompts/system-prompt.ts: composeSystemPrompt() joins directive, userInstructions, and rule body with double newlines, dropping empty sections
- Add StructuredPromptContext to evaluator-registry; BaseEvaluator gains buildSystemPrompt() that calls composeSystemPrompt() so the directive and user-instruction context flows into every LLM call
- Pass structuredPromptContext through TechnicalAccuracyEvaluator and its nested BaseEvaluator so the claim-extraction path also benefits
- Remove builder injection from provider-factory and capability-provider-resolver; VercelAIProvider constructor simplified

- findings.ts: appendInlineFinding locates quoted text, applies the violation filter, and appends inline finding events to the session store; findingsFromEvents reconstructs AgentFindingRecord list from stored events
- lint-prompt.ts: buildMergedLintPrompt assembles a single merged rule body from multiple LintRuleCall entries; buildEffectiveRuleBody applies per-call reviewInstruction and context overrides
- tool-handlers.ts: createToolHandlers returns all AgentToolHandlers (lint, report_finding, read_file, search_files, list_directory, search_content, agent, finalize_review) as a single factory so the executor stays thin and handlers can be tested independently
- Update agent-executor tests to reflect the extracted module boundaries
Actionable comments posted: 6
♻️ Duplicate comments (1)
src/agent/tool-handlers.ts (1)
321-325: ⚠️ Potential issue | 🟠 Major
Route bare `agent()` calls through the high-capability resolver.
Falling back straight to `defaultProvider` skips the new capability-tier mapping on `agent({ task })` calls when `model` is omitted. If that injected default ever differs from the configured high-cap provider, delegation will silently run on the wrong model, and the omission-model test will keep locking that behavior in.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/tool-handlers.ts` around lines 321 - 325, The handler currently falls back to defaultProvider when parsed.model is absent, bypassing the capability-tier resolver; change agentToolHandler so it always uses resolveCapabilityProvider for determining the provider (i.e., remove the direct defaultProvider fallback and call resolveCapabilityProvider with the parsed.model even when it's undefined/omitted) so bare agent({ task }) calls are routed through the high-capability resolver rather than directly to defaultProvider; update any related usage of AGENT_TOOL_INPUT_SCHEMA.parse and the provider selection logic accordingly.
🧹 Nitpick comments (10)
src/cli/preset-resolution.ts (1)
10-21: Extract shared path/meta literals into constants
These literals represent shared runtime concepts (`'../presets'`, `'../../presets'`, `'meta.json'`). Defining constants here avoids drift and makes future resolver changes safer.
♻️ Suggested refactor
```diff
 export function resolvePresetsDir(dir: string): string {
-  const buildPath = path.resolve(dir, '../presets');
-  if (existsSync(path.join(buildPath, 'meta.json'))) {
+  const BUILD_PRESETS_RELATIVE = '../presets';
+  const DEV_PRESETS_RELATIVE = '../../presets';
+  const PRESETS_META_FILE = 'meta.json';
+
+  const buildPath = path.resolve(dir, BUILD_PRESETS_RELATIVE);
+  if (existsSync(path.join(buildPath, PRESETS_META_FILE))) {
     return buildPath;
   }
   // Dev mode fallback: src/cli/ → ../../presets
-  const devPath = path.resolve(dir, '../../presets');
-  if (existsSync(path.join(devPath, 'meta.json'))) {
+  const devPath = path.resolve(dir, DEV_PRESETS_RELATIVE);
+  if (existsSync(path.join(devPath, PRESETS_META_FILE))) {
     return devPath;
   }
```
As per coding guidelines: "Define shared domain constants, enums, or types for core runtime concepts; avoid magic strings in shared runtime code."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cli/preset-resolution.ts` around lines 10 - 21, Extract the magic strings into module-level constants (e.g., PRESETS_BUILD_RELATIVE = '../presets', PRESETS_DEV_RELATIVE = '../../presets', PRESETS_META_FILENAME = 'meta.json') and replace uses of the literals in the preset-resolution logic (references: buildPath, devPath and the existsSync(path.join(..., 'meta.json')) checks) to use those constants; ensure the resolved buildPath and devPath are computed using the new constants and that the thrown Error message remains descriptive (you may interpolate the constants instead of raw strings).
tests/preset-resolution.test.ts (1)
47-55: Add an explicit precedence test for `dist/cli` when both candidates exist
Current coverage validates fallback, but not the "first candidate wins" contract when both `dist/presets/meta.json` and `presets/meta.json` are present. A targeted case will prevent future regression in candidate ordering.
🧪 Suggested test addition
```diff
+  it('prefers dist/presets when both dist and root presets exist from dist/cli', () => {
+    const root = setupTree({
+      'dist/presets/meta.json': JSON.stringify({ presets: { dist: true } }),
+      'presets/meta.json': JSON.stringify({ presets: { root: true } }),
+    });
+
+    const resolved = resolvePresetsDir(path.join(root, 'dist', 'cli'));
+
+    expect(resolved).toBe(path.join(root, 'dist', 'presets'));
+  });
```
Verify each finding against the current code and only fix it if needed. In `@tests/preset-resolution.test.ts` around lines 47 - 55, Add a test that asserts candidate precedence by creating both 'dist/presets/meta.json' and 'presets/meta.json' in the test tree, then call resolvePresetsDir with a starting path under 'dist/cli' and expect the function to return the 'dist/presets' path (i.e., first candidate wins). Use the same helpers used elsewhere in the file (setupTree, resolvePresetsDir, path.join) to construct the tree and assert the expected resolved directory.
tests/provider-factory.test.ts (1)
30-33: Utility helper for test assertions.
The `getProviderModelName` helper safely navigates the nested provider config structure. Consider adding a brief inline comment documenting why this specific path (`config.model.model`) is expected, to help future maintainers understand the provider structure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/provider-factory.test.ts` around lines 30 - 33, The helper getProviderModelName safely returns provider.config?.model?.model but lacks context; add a one-line inline comment above the function stating that provider objects store model identifier under config.model.model (e.g., config holds provider settings and model is an object with a model string) so future maintainers understand why the nested path is used; keep the comment concise and colocated with the getProviderModelName declaration.
src/cli/result-routing/debug-artifact.ts (2)
34-34: Each artifact gets a unique runId.
A new UUID is generated per `writeArtifact` call, meaning each file in a single CLI run gets a separate `runId`. If correlation across files in the same run is needed for debugging, consider passing a shared `runId` from the caller.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cli/result-routing/debug-artifact.ts` at line 34, The current debug-artifact implementation generates a new runId via randomUUID() inside the module, so every call to writeArtifact gets a distinct runId; change writeArtifact (and its callers) to accept an optional runId parameter (or accept a shared runId from the higher-level CLI entrypoint) and only generate a new randomUUID() inside writeArtifact when no runId is provided; update references to the symbol writeArtifact (and any factory/constructor that creates debug artifacts) to pass a shared runId when correlation across files in a single CLI run is required.
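The optional shared-`runId` signature suggested here can be sketched as below; the return shape is a simplification for illustration, since the real function writes a debug artifact to disk:

```typescript
import { randomUUID } from 'node:crypto';

// Accept an optional shared runId from the CLI entrypoint and only mint a
// fresh UUID when none is provided, so artifacts from one run can correlate.
function buildArtifactHeader(runId: string = randomUUID()): { runId: string } {
  return { runId };
}
```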
30-49: Consider using injected logger instead of console.warn.
The coding guidelines recommend routing runtime logging through an injected logger interface. While `console.warn` is acceptable for debug output, using the logger abstraction would provide consistency and allow operators to control log routing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cli/result-routing/debug-artifact.ts` around lines 30 - 49, The writeArtifact function currently uses console.warn; change it to use the project's injected logger abstraction instead: update the writeArtifact signature in src/cli/result-routing/debug-artifact.ts to accept a Logger (or obtain the logger via the existing injection helper used elsewhere), then replace both console.warn calls with logger.warn (or logger.error for the catch if your logger has levels). Ensure you reference the same Logger interface/type used elsewhere in the codebase and propagate the new parameter to callers of writeArtifact (or use the central getLogger helper) so runtime logs are routed through the logger abstraction.
src/rules/matched-prompts.ts (1)
52-53: Consider injecting ScanPathResolver for testability.
A new `ScanPathResolver` instance is created on each call. While functionally correct (the resolver is stateless), injecting it as an optional parameter would improve testability and align with the codebase's dependency injection patterns.
♻️ Optional: Accept resolver as parameter
```diff
 export function resolveMatchedPromptsForFile(params: {
   filePath: string;
   prompts: PromptFile[];
   scanPaths: FilePatternConfig[];
+  resolver?: ScanPathResolver;
 }): MatchedPromptResolution {
-  const { filePath, prompts, scanPaths } = params;
+  const { filePath, prompts, scanPaths, resolver = new ScanPathResolver() } = params;
   const availablePacks = getAvailablePacks(prompts);
   if (scanPaths.length === 0) {
     return {
       prompts: [...prompts],
       packs: availablePacks,
       overrides: {},
     };
   }
-  const resolver = new ScanPathResolver();
   const resolution = resolver.resolveConfiguration(filePath, scanPaths, availablePacks);
```
Verify each finding against the current code and only fix it if needed. In `@src/rules/matched-prompts.ts` around lines 52 - 53, The code instantiates a new ScanPathResolver inside the function; change the function that contains "const resolver = new ScanPathResolver();" to accept an optional ScanPathResolver parameter (e.g., resolver?: ScanPathResolver) and default it to new ScanPathResolver() when undefined, then replace the direct instantiation with the parameter and call resolver.resolveConfiguration(filePath, scanPaths, availablePacks); update any callers to pass a mocked or real resolver in tests as needed so resolveConfiguration remains injectable for unit tests.
tests/result-routing/judge-routing.test.ts (1)
123-124: Use `try/finally` for console spy restoration.
If an assertion fails before Line 157, console mocks can leak into later tests.
Safer spy lifecycle pattern
```diff
 it('reports missing targets without relying on formatter branches', () => {
   const sink = new RecordingSink();
   const errorSpy = vi.spyOn(console, 'error').mockImplementation(() => {});
   const warnSpy = vi.spyOn(console, 'warn').mockImplementation(() => {});
-
-  const result = routeJudgeResult({
-    promptFile: makePromptFile({
-      criteria: [
-        {
-          id: 'Clarity',
-          name: 'Clarity',
-          weight: 2,
-          target: { regex: '^## Required$', flags: 'm', required: true, suggestion: 'Add ## Required' },
-        },
-      ],
-    }),
-    result: makeJudgeResult(),
-    content: 'No heading here',
-    relFile: 'doc.md',
-    sink,
-  });
-
-  expect(result).toMatchObject({
-    errors: 1,
-    warnings: 0,
-    hadOperationalErrors: false,
-    hadSeverityErrors: true,
-    scoreEntries: [{ id: 'Default.Review.Clarity', scoreText: '0.0/10', score: 0 }],
-  });
-  expect(sink.issues[0]).toMatchObject({
-    summary: 'target not found',
-    ruleName: 'Default.Review.Clarity',
-    severity: Severity.ERROR,
-    suggestion: 'Add ## Required',
-  });
-
-  errorSpy.mockRestore();
-  warnSpy.mockRestore();
+  try {
+    const result = routeJudgeResult({
+      promptFile: makePromptFile({
+        criteria: [
+          {
+            id: 'Clarity',
+            name: 'Clarity',
+            weight: 2,
+            target: { regex: '^## Required$', flags: 'm', required: true, suggestion: 'Add ## Required' },
+          },
+        ],
+      }),
+      result: makeJudgeResult(),
+      content: 'No heading here',
+      relFile: 'doc.md',
+      sink,
+    });
+
+    expect(result).toMatchObject({
+      errors: 1,
+      warnings: 0,
+      hadOperationalErrors: false,
+      hadSeverityErrors: true,
+      scoreEntries: [{ id: 'Default.Review.Clarity', scoreText: '0.0/10', score: 0 }],
+    });
+    expect(sink.issues[0]).toMatchObject({
+      summary: 'target not found',
+      ruleName: 'Default.Review.Clarity',
+      severity: Severity.ERROR,
+      suggestion: 'Add ## Required',
+    });
+  } finally {
+    errorSpy.mockRestore();
+    warnSpy.mockRestore();
+  }
 });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/result-routing/judge-routing.test.ts` around lines 123 - 124, Wrap the console spy setup (vi.spyOn(console, 'error') and vi.spyOn(console, 'warn') stored in errorSpy and warnSpy) around the test execution using a try/finally so the spies are restored in finally (call errorSpy.mockRestore() and warnSpy.mockRestore()); do the same for the other console spies referenced later (the ones at the second occurrence) so failures don't leak mocked console behavior into subsequent tests.
tests/file-evaluation.test.ts (1)
16-21: Clean up temp directories to avoid test artifact buildup.
Line 17 creates per-test temp dirs that are never removed. Consider deleting them in `afterEach`.
Suggested cleanup refactor
```diff
-import { mkdtempSync, writeFileSync } from 'fs';
+import { mkdtempSync, writeFileSync, rmSync } from 'fs';
@@
-function createTempFile(content: string): string {
+function createTempFile(content: string): { dir: string; filePath: string } {
   const dir = mkdtempSync(path.join(tmpdir(), 'vectorlint-file-eval-'));
   const filePath = path.join(dir, 'input.md');
   writeFileSync(filePath, content);
-  return filePath;
+  return { dir, filePath };
 }
@@
 describe('evaluateFile', () => {
+  const tempDirs: string[] = [];
+
@@
-  afterEach(() => vi.restoreAllMocks());
+  afterEach(() => {
+    vi.restoreAllMocks();
+    for (const dir of tempDirs.splice(0)) {
+      rmSync(dir, { recursive: true, force: true });
+    }
+  });
@@
-  const file = createTempFile('Hello world\n');
+  const { dir, filePath: file } = createTempFile('Hello world\n');
+  tempDirs.push(dir);
```
Also applies to: 30-30
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/file-evaluation.test.ts` around lines 16 - 21, The tests create per-test temp dirs in createTempFile (uses mkdtempSync/tmpdir) but never remove them; update the test file to track created temp directories (e.g., push each dir returned by createTempFile into an array) and add an afterEach hook that iterates the tracked dirs and removes them with a recursive/force removal call (use fs.rmSync or equivalent), then clear the tracker; keep references to createTempFile and afterEach so the cleanup is clearly tied to the temp-dir creation.
src/cli/result-routing/judge-routing.ts (1)
64-79: Route these model-contract diagnostics through the injected logger.
This new shared runtime path writes directly to `console.error`/`console.warn`, which makes embedding and test control harder than the rest of the runtime. Please thread the existing logger abstraction through this router instead.
As per coding guidelines, "Route runtime logging through an injected logger interface; keep concrete logger implementations behind the abstraction".
Also applies to: 96-99
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cli/result-routing/judge-routing.ts` around lines 64 - 79, The loops in judge-routing.ts write diagnostics directly to console.error/console.warn; instead accept or use the injected logger abstraction (e.g., a Logger parameter or existing router.logger) and replace console.error(...) with logger.error(...) and console.warn(...) with logger.warn(...), preserving the same messages and the hadErrors flag semantics for the missing-criterion check that uses expectedNormalized/returnedNormalized/expectedOriginalMap, and make the same replacements for the other occurrence around returnedNames (lines ~96-99).
tests/agent/agent-executor.test.ts (1)
1238-1242: Make the isolation assertion actually cover Rule 2.
These checks only prove Rule 1's `reviewInstruction`/`context` appear somewhere in the merged prompt. The test still passes if that text is copied into Rule 2 as well, so it doesn't really verify per-member isolation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/agent/agent-executor.test.ts` around lines 1238 - 1242, The test currently only checks promptBodies[0] and thus doesn't verify per-member isolation; update the assertions to explicitly verify Rule 2's content in the prompt body produced for Rule 2 (e.g., check promptBodies[1] or the element corresponding to Rule 2) by asserting it contains 'Rule 2' and 'Check link targets.' and also add a negative assertion that promptBodies[1] does not contain Rule 1's unique text (like 'Required context for this review:\nEvidence from docs/glossary.md') so each rule's reviewInstruction/context is isolated; refer to the promptBodies array used in the test to locate where to add these checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/agent/executor.ts`:
- Around line 377-381: You computed fileRuleMatches/matchedRuleUnits via
buildFileRuleMatches and buildMatchedRuleUnits (using promptBySource and
MATCHED_RULE_UNIT_TOKEN_BUDGET) but the handler and tool-loop still expose all
ruleSources; fix by building an allowedRulesByFile map from fileRuleMatches
(keys = normalized file paths, values = set of normalized ruleSource strings
using normalizeRuleSource) and pass that map into the lint handler and into the
tool-loop prompt generation so only matched sources are advertised; update the
lint(file, rules[]) handler to immediately reject or ignore any (file,
normalizeRuleSource(rule.ruleSource)) pair not present in allowedRulesByFile
(return an error or no-op) so globally validRules cannot be applied to unmatched
files, and apply the same restriction when constructing
matchedRuleUnits/promptBySource usage in the tool-loop.
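The allow-list the prompt describes can be sketched as a map from normalized file paths to sets of normalized rule sources. `normalizeRuleSource` and `normalizePath` below are stand-ins for the project's real helpers:

```typescript
// Hypothetical helpers standing in for the repository's normalizers.
const normalizeRuleSource = (s: string): string => s.trim().toLowerCase();
const normalizePath = (p: string): string => p.replace(/\\/g, '/');

// Build a per-file allow-list from the matched (file, ruleSources) pairs.
function buildAllowedRulesByFile(
  fileRuleMatches: Array<{ file: string; ruleSources: string[] }>,
): Map<string, Set<string>> {
  const allowed = new Map<string, Set<string>>();
  for (const match of fileRuleMatches) {
    const key = normalizePath(match.file);
    const set = allowed.get(key) ?? new Set<string>();
    for (const source of match.ruleSources) set.add(normalizeRuleSource(source));
    allowed.set(key, set);
  }
  return allowed;
}

// The lint handler would reject any (file, ruleSource) pair outside the map.
function isLintCallAllowed(
  allowed: Map<string, Set<string>>,
  file: string,
  ruleSource: string,
): boolean {
  return allowed.get(normalizePath(file))?.has(normalizeRuleSource(ruleSource)) ?? false;
}
```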
In `@src/agent/tool-handlers.ts`:
- Around line 206-226: The payload is still persisting the raw parsed.references
instead of the per-iteration, normalized values; create a sanitizedReferences
array by mapping parsed.references (or the default single-item array) to
resolved, canonical entries using resolveTargetForTopLevel(workspaceRoot,
targets, reference.file) and normalized start/end/line values (as you do for
relFile and line) and then pass that sanitizedReferences into store.append's
payload (replace ...(parsed.references ? { references: parsed.references } : {})
with the normalized references object) so persisted events match the emitted
relFile and line values.
In `@src/cli/preset-resolution.ts`:
- Line 21: Replace the raw throw new Error(...) in preset-resolution.ts with the
repository's custom error type used for not-found or resolution failures (import
the appropriate class from the project's errors module, e.g.,
NotFoundError/RepositoryError/AppError) so the thrown exception participates in
the app's error hierarchy; preserve the original message but pass buildPath and
devPath (or attach them as metadata) to the custom error constructor, and update
any imports to pull the custom error class into the file.
In `@src/cli/result-routing/debug-artifact.ts`:
- Around line 7-28: The getModelInfoFromEnv function omits the 'bedrock'
provider so Bedrock model names aren't captured; add a switch case for 'bedrock'
in getModelInfoFromEnv (alongside cases for 'openai', 'anthropic', etc.) and
assign the model name from the appropriate env var (e.g.,
process.env.BEDROCK_MODEL) to the local name variable so the returned object
includes provider and name (and tag) when LLM_PROVIDER === 'bedrock'.
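A sketch of the provider switch with the missing `bedrock` case added. The environment variable names here (`OPENAI_MODEL`, `ANTHROPIC_MODEL`, `BEDROCK_MODEL`) are assumptions for illustration, not confirmed from the repository:

```typescript
// Map LLM_PROVIDER to the env var carrying the model name; without the
// 'bedrock' case, Bedrock model names were silently dropped.
function getModelInfoFromEnv(
  env: Record<string, string | undefined>,
): { provider: string; name?: string } {
  const provider = env.LLM_PROVIDER ?? 'unknown';
  let name: string | undefined;
  switch (provider) {
    case 'openai':
      name = env.OPENAI_MODEL;
      break;
    case 'anthropic':
      name = env.ANTHROPIC_MODEL;
      break;
    case 'bedrock':
      name = env.BEDROCK_MODEL; // previously absent, so Bedrock names were lost
      break;
  }
  return { provider, ...(name ? { name } : {}) };
}
```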
In `@src/cli/result-routing/issue-sink.ts`:
- Around line 46-58: The payload construction in issue-sink.ts is adding a
non‑declared field "fix" into the JsonIssue shape (const payload) which violates
the JsonIssue contract defined in src/output/vale-json-formatter.ts; either
remove the spread that injects issue.fix from the payload (delete or stop using
...(issue.fix !== undefined ? { fix: issue.fix } : {})) to keep output inside
the declared schema, or if "fix" is required, add it end-to-end by updating the
JsonIssue type in src/output/vale-json-formatter.ts and all related formatter
code and tests to include and validate the new fix field. Ensure changes target
the JsonIssue type and the payload construction (payload) and update tests
accordingly if you choose the second path.
---
Duplicate comments:
In `@src/agent/tool-handlers.ts`:
- Around line 321-325: The handler currently falls back to defaultProvider when
parsed.model is absent, bypassing the capability-tier resolver; change
agentToolHandler so it always uses resolveCapabilityProvider for determining the
provider (i.e., remove the direct defaultProvider fallback and call
resolveCapabilityProvider with the parsed.model even when it's
undefined/omitted) so bare agent({ task }) calls are routed through the
high-capability resolver rather than directly to defaultProvider; update any
related usage of AGENT_TOOL_INPUT_SCHEMA.parse and the provider selection logic
accordingly.
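The upward-only tier resolution this comment routes through can be sketched as follows. This is a minimal illustration, not the repository's code: `resolveTierModel`, `TierModels`, and the explicit tier array are hypothetical stand-ins for `resolveCapabilityProvider` and its provider config.

```typescript
type Tier = 'low-cap' | 'mid-cap' | 'high-cap';

// Tiers ordered from least to most capable; fallback only moves upward.
const TIER_ORDER: Tier[] = ['low-cap', 'mid-cap', 'high-cap'];

// Hypothetical per-tier model config, e.g. parsed from
// OPENAI_LOW_CAPABILITY_MODEL / OPENAI_HIGH_CAPABILITY_MODEL.
type TierModels = Partial<Record<Tier, string>>;

// Resolve the requested tier, scanning upward (never downward) so a request
// is never served by a weaker model than it asked for.
function resolveTierModel(models: TierModels, requested: Tier): string {
  const start = TIER_ORDER.indexOf(requested);
  for (let i = start; i < TIER_ORDER.length; i++) {
    const model = models[TIER_ORDER[i]];
    if (model !== undefined) return model;
  }
  throw new Error(`No model configured at or above tier "${requested}"`);
}
```

Under this scheme a bare `agent({ task })` call that defaults to `high-cap` still goes through the same resolver rather than short-circuiting to a default provider.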
---
Nitpick comments:
In `@src/cli/preset-resolution.ts`:
- Around line 10-21: Extract the magic strings into module-level constants
(e.g., PRESETS_BUILD_RELATIVE = '../presets', PRESETS_DEV_RELATIVE =
'../../presets', PRESETS_META_FILENAME = 'meta.json') and replace uses of the
literals in the preset-resolution logic (references: buildPath, devPath and the
existsSync(path.join(..., 'meta.json')) checks) to use those constants; ensure
the resolved buildPath and devPath are computed using the new constants and that
the thrown Error message remains descriptive (you may interpolate the constants
instead of raw strings).
In `@src/cli/result-routing/debug-artifact.ts`:
- Line 34: The current debug-artifact implementation generates a new runId via
randomUUID() inside the module, so every call to writeArtifact gets a distinct
runId; change writeArtifact (and its callers) to accept an optional runId
parameter (or accept a shared runId from the higher-level CLI entrypoint) and
only generate a new randomUUID() inside writeArtifact when no runId is provided;
update references to the symbol writeArtifact (and any factory/constructor that
creates debug artifacts) to pass a shared runId when correlation across files in
a single CLI run is required.
- Around line 30-49: The writeArtifact function currently uses console.warn;
change it to use the project's injected logger abstraction instead: update the
writeArtifact signature in src/cli/result-routing/debug-artifact.ts to accept a
Logger (or obtain the logger via the existing injection helper used elsewhere),
then replace both console.warn calls with logger.warn (or logger.error for the
catch if your logger has levels). Ensure you reference the same Logger
interface/type used elsewhere in the codebase and propagate the new parameter to
callers of writeArtifact (or use the central getLogger helper) so runtime logs
are routed through the logger abstraction.
In `@src/cli/result-routing/judge-routing.ts`:
- Around line 64-79: The loops in judge-routing.ts write diagnostics directly to
console.error/console.warn; instead accept or use the injected logger
abstraction (e.g., a Logger parameter or existing router.logger) and replace
console.error(...) with logger.error(...) and console.warn(...) with
logger.warn(...), preserving the same messages and the hadErrors flag semantics
for the missing-criterion check that uses
expectedNormalized/returnedNormalized/expectedOriginalMap, and make the same
replacements for the other occurrence around returnedNames (lines ~96-99).
In `@src/rules/matched-prompts.ts`:
- Around line 52-53: The code instantiates a new ScanPathResolver inside the
function; change the function that contains "const resolver = new
ScanPathResolver();" to accept an optional ScanPathResolver parameter (e.g.,
resolver?: ScanPathResolver) and default it to new ScanPathResolver() when
undefined, then replace the direct instantiation with the parameter and call
resolver.resolveConfiguration(filePath, scanPaths, availablePacks); update any
callers to pass a mocked or real resolver in tests as needed so
resolveConfiguration remains injectable for unit tests.
In `@tests/agent/agent-executor.test.ts`:
- Around line 1238-1242: The test currently only checks promptBodies[0] and thus
doesn't verify per-member isolation; update the assertions to explicitly verify
Rule 2's content in the prompt body produced for Rule 2 (e.g., check
promptBodies[1] or the element corresponding to Rule 2) by asserting it contains
'Rule 2' and 'Check link targets.' and also add a negative assertion that
promptBodies[1] does not contain Rule 1's unique text (like 'Required context
for this review:\nEvidence from docs/glossary.md') so each rule's
reviewInstruction/context is isolated; refer to the promptBodies array used in
the test to locate where to add these checks.
In `@tests/file-evaluation.test.ts`:
- Around line 16-21: The tests create per-test temp dirs in createTempFile (uses
mkdtempSync/tmpdir) but never remove them; update the test file to track created
temp directories (e.g., push each dir returned by createTempFile into an array)
and add an afterEach hook that iterates the tracked dirs and removes them with a
recursive/force removal call (use fs.rmSync or equivalent), then clear the
tracker; keep references to createTempFile and afterEach so the cleanup is
clearly tied to the temp-dir creation.
In `@tests/preset-resolution.test.ts`:
- Around line 47-55: Add a test that asserts candidate precedence by creating
both 'dist/presets/meta.json' and 'presets/meta.json' in the test tree, then
call resolvePresetsDir with a starting path under 'dist/cli' and expect the
function to return the 'dist/presets' path (i.e., first candidate wins). Use the
same helpers used elsewhere in the file (setupTree, resolvePresetsDir,
path.join) to construct the tree and assert the expected resolved directory.
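For reference, first-candidate-wins resolution of the kind this test would assert can be sketched like this. The signature is hypothetical (the real `resolvePresetsDir` may differ); `exists` is injected only to make the precedence testable without touching the filesystem.

```typescript
import * as path from 'path';

// Walk ordered candidate directories relative to startDir and return the
// first one containing a meta.json marker; earlier candidates win.
function resolvePresetsDir(
  startDir: string,
  exists: (p: string) => boolean,
  candidates: string[] = ['../presets', '../../presets'],
): string | undefined {
  for (const rel of candidates) {
    const dir = path.resolve(startDir, rel);
    if (exists(path.join(dir, 'meta.json'))) return dir;
  }
  return undefined;
}
```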
In `@tests/provider-factory.test.ts`:
- Around line 30-33: The helper getProviderModelName safely returns
provider.config?.model?.model but lacks context; add a one-line inline comment
above the function stating that provider objects store model identifier under
config.model.model (e.g., config holds provider settings and model is an object
with a model string) so future maintainers understand why the nested path is
used; keep the comment concise and colocated with the getProviderModelName
declaration.
In `@tests/result-routing/judge-routing.test.ts`:
- Around line 123-124: Wrap the console spy setup (vi.spyOn(console, 'error')
and vi.spyOn(console, 'warn') stored in errorSpy and warnSpy) around the test
execution using a try/finally so the spies are restored in finally (call
errorSpy.mockRestore() and warnSpy.mockRestore()); do the same for the other
console spies referenced later (the ones at the second occurrence) so failures
don’t leak mocked console behavior into subsequent tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fa428ebb-28d2-419c-a121-caf070bd715f
📒 Files selected for processing (45)
- package.json
- src/agent/executor.ts
- src/agent/findings.ts
- src/agent/lint-prompt.ts
- src/agent/tool-handlers.ts
- src/boundaries/index.ts
- src/boundaries/scan-path-resolver.ts
- src/boundaries/yaml-parser.ts
- src/cli/agent-evaluation.ts
- src/cli/commands.ts
- src/cli/file-evaluation.ts
- src/cli/issue-output.ts
- src/cli/preset-resolution.ts
- src/cli/result-routing.ts
- src/cli/result-routing/check-routing.ts
- src/cli/result-routing/debug-artifact.ts
- src/cli/result-routing/issue-sink.ts
- src/cli/result-routing/judge-routing.ts
- src/cli/types.ts
- src/cli/validate-command.ts
- src/errors/index.ts
- src/errors/validation-errors.ts
- src/evaluators/accuracy-evaluator.ts
- src/evaluators/base-evaluator.ts
- src/evaluators/evaluator-registry.ts
- src/prompts/system-prompt.ts
- src/providers/capability-provider-resolver.ts
- src/providers/index.ts
- src/providers/llm-provider.ts
- src/providers/provider-factory.ts
- src/providers/request-builder.ts
- src/providers/vercel-ai-provider.ts
- src/rules/matched-prompts.ts
- tests/agent/agent-executor.test.ts
- tests/file-evaluation.test.ts
- tests/preset-resolution.test.ts
- tests/provider-factory.test.ts
- tests/request-builder.test.ts
- tests/result-routing/check-routing.test.ts
- tests/result-routing/issue-sink.test.ts
- tests/result-routing/judge-routing.test.ts
- tests/rule-matching.test.ts
- tests/scoring-types.test.ts
- tests/system-prompt.test.ts
- tests/vercel-ai-provider.test.ts
💤 Files with no reviewable changes (6)
- src/boundaries/index.ts
- src/providers/index.ts
- src/boundaries/yaml-parser.ts
- src/providers/request-builder.ts
- tests/request-builder.test.ts
- src/errors/validation-errors.ts
✅ Files skipped from review due to trivial changes (3)
- src/providers/capability-provider-resolver.ts
- src/cli/issue-output.ts
- src/cli/agent-evaluation.ts
🚧 Files skipped from review as they are similar to previous changes (4)
- src/cli/commands.ts
- src/providers/llm-provider.ts
- src/providers/provider-factory.ts
- src/cli/file-evaluation.ts
const references = parsed.references && parsed.references.length > 0
  ? parsed.references
  : [{ file: resolveTargetForTopLevel(workspaceRoot, targets), startLine: 1, endLine: 1 }];

let findingsRecorded = 0;
for (const reference of references) {
  const relFile = resolveTargetForTopLevel(workspaceRoot, targets, reference.file);
  onFindingRecorded();
  findingsRecorded += 1;
  await store.append({
    eventType: SESSION_EVENT_TYPE.FindingRecordedTopLevel,
    payload: {
      file: relFile,
      line: Math.max(1, Math.trunc(reference.startLine)),
      column: 1,
      severity: severityFromPrompt(prompt),
      ruleId: buildRuleId(prompt),
      ruleSource: normalizeRuleSource(parsed.ruleSource),
      message: parsed.message,
      ...(parsed.suggestion ? { suggestion: parsed.suggestion } : {}),
      ...(parsed.references ? { references: parsed.references } : {}),
Persist the sanitized references, not parsed.references.
Each reference is resolved through resolveTargetForTopLevel(), but the event payload still writes the raw model-supplied array. That can leave absolute or non-canonical in-repo paths in persisted session events/operator artifacts, and those paths won't match the normalized file you emit for the same finding.
🧹 One way to normalize before persisting
- const references = parsed.references && parsed.references.length > 0
- ? parsed.references
- : [{ file: resolveTargetForTopLevel(workspaceRoot, targets), startLine: 1, endLine: 1 }];
+ const references = (parsed.references && parsed.references.length > 0
+ ? parsed.references
+ : [{ file: resolveTargetForTopLevel(workspaceRoot, targets), startLine: 1, endLine: 1 }]
+ ).map((reference) => ({
+ file: resolveTargetForTopLevel(workspaceRoot, targets, reference.file),
+ startLine: Math.max(1, Math.trunc(reference.startLine)),
+ endLine: Math.max(1, Math.trunc(reference.endLine)),
+ }));
@@
- const relFile = resolveTargetForTopLevel(workspaceRoot, targets, reference.file);
+ const relFile = reference.file;
@@
- ...(parsed.references ? { references: parsed.references } : {}),
+  references,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agent/tool-handlers.ts` around lines 206 - 226, The payload is still
persisting the raw parsed.references instead of the per-iteration, normalized
values; create a sanitizedReferences array by mapping parsed.references (or the
default single-item array) to resolved, canonical entries using
resolveTargetForTopLevel(workspaceRoot, targets, reference.file) and normalized
start/end/line values (as you do for relFile and line) and then pass that
sanitizedReferences into store.append's payload (replace ...(parsed.references ?
{ references: parsed.references } : {}) with the normalized references object)
so persisted events match the emitted relFile and line values.
  return devPath;
}

throw new Error(`Could not locate presets directory containing meta.json. Looked in ${buildPath} and ${devPath}`);
Use repository error types instead of raw Error
Line 21 throws new Error(...), which bypasses the custom error hierarchy used in src/ and weakens typed error handling consistency.
As per coding guidelines: "Prefer the repository's custom error hierarchy over native Error; catch blocks use unknown type and extend existing custom error types before introducing raw exceptions."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli/preset-resolution.ts` at line 21, Replace the raw throw new
Error(...) in preset-resolution.ts with the repository's custom error type used
for not-found or resolution failures (import the appropriate class from the
project's errors module, e.g., NotFoundError/RepositoryError/AppError) so the
thrown exception participates in the app's error hierarchy; preserve the
original message but pass buildPath and devPath (or attach them as metadata) to
the custom error constructor, and update any imports to pull the custom error
class into the file.
function getModelInfoFromEnv(): { provider?: string; name?: string; tag?: string } {
  const provider = process.env.LLM_PROVIDER;
  let name: string | undefined;

  switch (provider) {
    case 'openai':
      name = process.env.OPENAI_MODEL;
      break;
    case 'anthropic':
      name = process.env.ANTHROPIC_MODEL;
      break;
    case 'azure-openai':
      name = process.env.AZURE_OPENAI_DEPLOYMENT_NAME;
      break;
    case 'gemini':
      name = process.env.GEMINI_MODEL;
      break;
  }

  const tag = [provider, name].filter(Boolean).join('-');
  return { ...(provider && { provider }), ...(name && { name }), ...(tag && { tag }) };
}
Missing Bedrock provider case in model info extraction.
The switch statement handles openai, anthropic, azure-openai, and gemini, but omits the bedrock provider. When Bedrock is configured, the model name won't be captured in debug artifacts.
🔧 Add Bedrock case
case 'gemini':
name = process.env.GEMINI_MODEL;
break;
+ case 'bedrock':
+ name = process.env.BEDROCK_MODEL;
+ break;
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
function getModelInfoFromEnv(): { provider?: string; name?: string; tag?: string } {
  const provider = process.env.LLM_PROVIDER;
  let name: string | undefined;

  switch (provider) {
    case 'openai':
      name = process.env.OPENAI_MODEL;
      break;
    case 'anthropic':
      name = process.env.ANTHROPIC_MODEL;
      break;
    case 'azure-openai':
      name = process.env.AZURE_OPENAI_DEPLOYMENT_NAME;
      break;
    case 'gemini':
      name = process.env.GEMINI_MODEL;
      break;
    case 'bedrock':
      name = process.env.BEDROCK_MODEL;
      break;
  }

  const tag = [provider, name].filter(Boolean).join('-');
  return { ...(provider && { provider }), ...(name && { name }), ...(tag && { tag }) };
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli/result-routing/debug-artifact.ts` around lines 7 - 28, The
getModelInfoFromEnv function omits the 'bedrock' provider so Bedrock model names
aren't captured; add a switch case for 'bedrock' in getModelInfoFromEnv
(alongside cases for 'openai', 'anthropic', etc.) and assign the model name from
the appropriate env var (e.g., process.env.BEDROCK_MODEL) to the local name
variable so the returned object includes provider and name (and tag) when
LLM_PROVIDER === 'bedrock'.
const payload: JsonIssue = {
  file: issue.file,
  line: issue.line,
  column: issue.column,
  severity: issue.severity,
  message: issue.summary,
  rule: issue.ruleName,
  match: issue.match || '',
  matchLength: issue.match ? issue.match.length : 0,
  ...(issue.suggestion !== undefined ? { suggestion: issue.suggestion } : {}),
  ...(issue.fix !== undefined ? { fix: issue.fix } : {}),
  ...(issue.scoreText !== undefined ? { score: issue.scoreText } : {}),
};
Keep Vale JSON output inside the declared formatter schema.
src/output/vale-json-formatter.ts's JsonIssue shape doesn't define fix, so adding it here can leak a field the Vale JSON formatter contract never promised. Either drop it here or add fix end-to-end in the formatter/type/tests.
✂️ Minimal fix
const payload: JsonIssue = {
file: issue.file,
line: issue.line,
column: issue.column,
severity: issue.severity,
message: issue.summary,
rule: issue.ruleName,
match: issue.match || '',
matchLength: issue.match ? issue.match.length : 0,
...(issue.suggestion !== undefined ? { suggestion: issue.suggestion } : {}),
- ...(issue.fix !== undefined ? { fix: issue.fix } : {}),
...(issue.scoreText !== undefined ? { score: issue.scoreText } : {}),
};
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
const payload: JsonIssue = {
  file: issue.file,
  line: issue.line,
  column: issue.column,
  severity: issue.severity,
  message: issue.summary,
  rule: issue.ruleName,
  match: issue.match || '',
  matchLength: issue.match ? issue.match.length : 0,
  ...(issue.suggestion !== undefined ? { suggestion: issue.suggestion } : {}),
  ...(issue.scoreText !== undefined ? { score: issue.scoreText } : {}),
};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli/result-routing/issue-sink.ts` around lines 46 - 58, The payload
construction in issue-sink.ts is adding a non‑declared field "fix" into the
JsonIssue shape (const payload) which violates the JsonIssue contract defined in
src/output/vale-json-formatter.ts; either remove the spread that injects
issue.fix from the payload (delete or stop using ...(issue.fix !== undefined ? {
fix: issue.fix } : {})) to keep output inside the declared schema, or if "fix"
is required, add it end-to-end by updating the JsonIssue type in
src/output/vale-json-formatter.ts and all related formatter code and tests to
include and validate the new fix field. Ensure changes target the JsonIssue type
and the payload construction (payload) and update tests accordingly if you
choose the second path.
if (!got) {
  return {
    errors: 0,
    warnings: 0,
    userScore: 0,
    maxScore,
    hadOperationalErrors,
    hadSeverityErrors,
    scoreEntry: { id: ruleName, scoreText: '-', score: 0.0 },
    scoreComponent: {
      criterion: nameKey,
      rawScore: 0,
      maxScore: 4,
      weightedScore: 0,
      weightedMaxScore: weightNum,
      normalizedScore: 0,
      normalizedMaxScore: 10,
    },
  };
Don't turn a missing criterion into a real zero score.
validateCriteriaCompleteness() already treats an omitted expected criterion as an operational error, but this branch still returns score: 0.0 plus a zeroed scoreComponent. That makes malformed model output look like a genuine content failure and will drag downstream score aggregates down incorrectly.
Possible direction
if (!got) {
return {
errors: 0,
warnings: 0,
userScore: 0,
maxScore,
- hadOperationalErrors,
+ hadOperationalErrors: true,
hadSeverityErrors,
- scoreEntry: { id: ruleName, scoreText: '-', score: 0.0 },
- scoreComponent: {
- criterion: nameKey,
- rawScore: 0,
- maxScore: 4,
- weightedScore: 0,
- weightedMaxScore: weightNum,
- normalizedScore: 0,
- normalizedMaxScore: 10,
- },
+ scoreEntry: { id: ruleName, scoreText: '-' },
};
};
- agent tool now uses resolveCapabilityProvider('high-cap') when no
model tier is specified, rather than falling back to defaultProvider
- remove 10.0/10 score for rules that never matched any file —
unmatched rules are now omitted from agent score output entirely
- make criterion name lookup case-insensitive in check routing
- use ConfigError for missing criteria in specialized evaluators
- pack-qualify allScores key to prevent cross-pack name collisions
- filter empty packName in buildRuleName to prevent leading-dot names
- update global config comment to reflect new sub-agent defaults
- move Gemini/Bedrock env-parser test to its own describe block
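The case-insensitive criterion lookup fixed above can be sketched as follows. This is illustrative only; `buildCriterionMap` and `findCriterion` are hypothetical helper names, not the check-routing code itself.

```typescript
// Build a lookup keyed by normalized (trimmed, lowercased) criterion name so
// LLM responses that change casing or whitespace still match expectations.
function buildCriterionMap(expected: string[]): Map<string, string> {
  const map = new Map<string, string>();
  for (const name of expected) {
    map.set(name.trim().toLowerCase(), name);
  }
  return map;
}

// Look up a returned criterion under the same normalization; undefined means
// the model returned a criterion that was never expected.
function findCriterion(map: Map<string, string>, returned: string): string | undefined {
  return map.get(returned.trim().toLowerCase());
}
```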
Implementation Artifact
1) Outcome and scope
Goal: Ship agent-mode capability tiers, deterministic rule-to-file
Changed behavior:
3) Contract changes
New env vars (all optional, all per-provider):
Agent tool contracts (unchanged shape, new
Internal interface change (not user-facing):
No migrations. No schema changes to rule YAML files.
4) High-level design
Agent evaluation flow:
CLI evaluation pipeline (refactored, no behavior change):
5) Risk and safeguards
Main risks:
Safeguards:
Known limitations:
6) How to verify
Happy path checks:
Edge case checks:
Checks actually run:
7) Rollout and rollback
Rollout:
Rollback:
…/rules/
- Rename PromptFile/PromptMeta/PromptCriterionSpec → RuleFile/RuleMeta/RuleCriterionSpec
- Rename .body → .content on the rule file schema
- Move src/prompts/prompt-loader.ts → src/rules/rule-loader.ts
- Move src/prompts/prompt-validator.ts → src/rules/rule-validator.ts
- Rename src/rules/matched-prompts.ts → src/rules/matched-rules.ts
- Rename prompt-schemas.ts → rule-schemas.ts; update schemas/index.ts
- Update loadRuleFile() return key: { prompt } → { rule }
- Update loadRules() return key: { prompts } → { rules }
- Update validatePrompt() → validateRule(), validateAll() param type
- Update resolveMatchedPromptsForFile() → resolveMatchedRulesForFile()
and MatchedPromptResolution.prompts → MatchedRuleResolution.rules
- Rename BaseEvaluator.this.prompt → this.rule; update accuracy-evaluator
- Rename EvaluationOptions.prompts → rules across CLI and agent layers
- Rename RunAgentExecutorParams.prompts → rules; promptBySource → ruleBySource
- Rename LintRuleCall.prompt → rule field
- Update all import paths across ~18 source files
- Update all test files to import RuleFile/RuleMeta from
src/rules/rule-loader and src/schemas/rule-schemas
- Rename .body → .content and { prompts } → { rules } in
all test fixtures and option objects
- Rename validatePrompt → validateRule and
resolveMatchedPromptsForFile → resolveMatchedRulesForFile
at call sites
- Rename GateViolationLike → FilterableViolation in
violation-filter.ts for domain-accurate naming
…builder
- Move MERGED_LINT_REVIEW_INSTRUCTIONS out of lint-prompt.ts into a new buildLintSystemPrompt() in prompt-builder.ts; lint-prompt.ts now only handles rule-content merging
- Extract inline sub-agent system prompt from sub-agent.ts into buildSubAgentSystemPrompt() in prompt-builder.ts
- prompt-builder.ts is now the single file to edit for any agent system prompt change
- Alias InlineFindingEventPayload = AgentFindingRecord in findings.ts, eliminating a structurally identical duplicate type
- Move estimateTokens out of src/utils/ (top-level bucket) into src/agent/utils.ts; inline the one-liner directly in boundaries/user-instruction-loader.ts to keep cross-module imports clean; delete src/utils/
- Trim README agent mode section: remove internal implementation details (rule unit precomputation, capability tier fallback chains) that are not user-facing
- Add clarifying comment to makePromptVariant test factory
- Add three coding convention rules to CLAUDE.md: no top-level src/utils/, centralized prompt construction, self-documenting names
…counting
- Add src/tokenizer.ts as a focused module exporting resolveTokenizer() and estimateTokens(); provider→encoding map mirrors provider-factory.ts (anthropic/bedrock→claude, openai/azure/gemini→o200k_base); exhaustive switch with no fallback — LLM_PROVIDER is validated at startup so an unknown value here is always a bug
- Delete src/agent/utils.ts; the single estimateTokens heuristic moves to src/tokenizer.ts with a TokenCounter interface so callers and tests stay decoupled from the concrete Tokenizer class
- Thread TokenCounter through buildMatchedRuleUnits and estimateMatchedRuleUnitTokens; executor creates tokenizer once and passes it down
- user-instruction-loader: drop inline heuristic, import from tokenizer
- Tests: set LLM_PROVIDER=anthropic in beforeAll for test files that call runAgentExecutor; rule-units tests use a heuristic mock counter to keep budget/content assertions stable
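The token-budgeted grouping that buildMatchedRuleUnits performs might look like this sketch. The names `buildRuleUnits` and the ~4-characters-per-token heuristic are illustrative assumptions, not the repository's exact code, which routes counting through the TokenCounter interface described above.

```typescript
interface TokenCounter {
  count(text: string): number;
}

interface Rule {
  id: string;
  content: string;
}

// Greedy grouping: pack matched rules into units whose estimated token total
// stays under the budget; a rule larger than the budget gets its own unit.
function buildRuleUnits(rules: Rule[], counter: TokenCounter, budget: number): Rule[][] {
  const units: Rule[][] = [];
  let current: Rule[] = [];
  let used = 0;
  for (const rule of rules) {
    const cost = counter.count(rule.content);
    if (current.length > 0 && used + cost > budget) {
      units.push(current);
      current = [];
      used = 0;
    }
    current.push(rule);
    used += cost;
  }
  if (current.length > 0) units.push(current);
  return units;
}

// Cheap heuristic counter (~4 characters per token), as a mock might use.
const heuristicCounter: TokenCounter = {
  count: (text) => Math.ceil(text.length / 4),
};
```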
- Move Severity enum definition from evaluators/types to rule-schemas (where it belongs as a rule/finding concept)
- Remove the evaluator field from RULE_META_SCHEMA (evaluator abstraction being removed)
- Update RULE_META_SCHEMA comment to remove references to evaluator types
- Severity is now defined in rule-schemas.ts and imported by other modules
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…scorer
- src/prompts/schema.ts: import ReviewType from lint/types; update JudgeResult,
CheckResult, RawCheckResult discriminant types and isJudgeResult/isCheckResult
guard comparisons
- src/scoring/scorer.ts: import ReviewType from lint/types; update type field
assignments in calculateCheckScore, calculateJudgeScore, averageJudgeScores
- src/evaluators/base-evaluator.ts: add ReviewType import; update RawCheckResult
literal construction (line 232) to use ReviewType.CHECK
- src/evaluators/accuracy-evaluator.ts: add ReviewType import; update RawCheckResult
literal construction to use ReviewType.CHECK
String values unchanged ('judge'/'check'); purely a nominal rename at the type level.
- Add runLint() — plain async function porting BaseEvaluator logic (judge + check modes, chunking, usage aggregation)
- Add runLintMerged() — extracts LLM-call logic from lintToolHandler for multi-rule merged evaluation
- Export LintRuleCall, MergedLintResult, RunLintParams, RunLintMergedParams types
- Export resolveRuleContent, mergeRulesForLint, buildLintSystemPrompt (moved from agent/lint-prompt.ts and agent/prompt-builder.ts)
- Uses ReviewType (from ./types) and Severity (from ../schemas/rule-schemas) throughout
- computeFilterDecision sourced from ./violation-filter (not evaluators)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
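Splitting a merged multi-rule response back into per-rule findings could look like the sketch below. The shapes are hypothetical stand-ins; the real MergedLintResult derives its findings type from the Zod schema source of truth.

```typescript
interface MergedFinding {
  ruleId: string;
  line: number;
  message: string;
}

// One merged LLM response covers several rules; split its findings back out
// per rule so downstream scoring still sees per-rule results. Findings
// attributed to unknown rule ids are dropped rather than invented.
function groupFindingsByRule(
  ruleIds: string[],
  findings: MergedFinding[],
): Map<string, MergedFinding[]> {
  const byRule = new Map<string, MergedFinding[]>(
    ruleIds.map((id): [string, MergedFinding[]] => [id, []]),
  );
  for (const finding of findings) {
    const bucket = byRule.get(finding.ruleId);
    if (bucket) bucket.push(finding);
  }
  return byRule;
}
```

Pre-seeding the map with every expected rule id means a rule that produced no findings still appears with an empty bucket instead of vanishing from the result.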
… MergedLintResult type
- Remove `defaultSeverity` from `RunLintParams.options` — scoring happens
downstream in the routing layer (check-routing.ts), not inside
`runCheckEvaluation`, so the field was unreachable dead code
- Remove `export type { Severity }` re-export — Severity is already
exported from `src/schemas/rule-schemas.ts`; a second path causes confusion
- Replace inline anonymous `findings` type in `MergedLintResult` with
`MergedCheckLLMResult["findings"]` to derive from the schema source of truth
- Drop the now-unused `Severity` import
- file-evaluation.ts: replace createEvaluator/evaluate with runLint(); drop accuracy-evaluator-specific criteria validation block; remove ConfigError, Type, and searchProvider from imports/destructuring
- tool-handlers.ts: replace inline composeSystemPrompt + runPromptStructured call with runLintMerged(); import LintRuleCall from src/lint instead of lint-prompt; remove buildMergedCheckLLMSchema, MergedCheckLLMResult, composeSystemPrompt, and buildLintSystemPrompt imports
- prompt-builder.ts: remove buildLintSystemPrompt (moved to src/lint/), remove mergeRulesForLint and LintRuleCall imports from lint-prompt
- src/index.ts: remove evaluator bootstrap side-effect import
Bundle size: 215 KB → 201 KB (evaluator registry no longer loaded at startup)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…luators
- Severity: all consumers now import from schemas/rule-schemas
- computeFilterDecision/FilterDecision: all consumers now import from lint/violation-filter
- EvaluationType → ReviewType in 3 test files (import + usages)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Delete src/evaluators/ (base-evaluator, registry, accuracy-evaluator, all support files)
- Delete src/agent/lint-prompt.ts (content already in src/lint/)
- Delete src/debug/violation-filter.ts (dead re-export barrel)
- Fix src/agent/executor.ts: import resolveRuleContent from ../lint (not ./lint-prompt)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove evaluator: base from all 10 preset and .github rule files
- Delete .github/rules/TinyRocket/technical-accuracy.md (accuracy evaluator removed)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- orchestrator-filtering: mock runLint from ../lint instead of createEvaluator
- rule-matching: spy on runLint instead of createEvaluator
- scoring-types: rewrite to use runLint; drop TechnicalAccuracyEvaluator block
- prompt-loader-validation: remove evaluator: field from test YAML and assertions
- preset-resolution, agent-executor: bump slow-import test timeouts to 30s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Why
Agent mode previously ran one LLM call per rule with no awareness of
which model to use for different task types, and it had no way to
delegate sub-tasks to specialized agents. This PR adds capability-tier
routing, deterministic pre-computed rule units, merged lint calls, and
sub-agent delegation to make agent evaluation faster and more capable.
What
New agent capabilities:
`high-cap`/`mid-cap`/`low-cap`): each provider now exposes per-tier model/deployment config; callers request
a tier and the resolver picks the configured model with upward fallback
before the agent loop so each
`lint` tool call only receives the rules applicable to the target file — no hallucinated rule sources
into one structured LLM call instead of N separate calls
`agent` tool in the executor dispatches a restricted read-only sub-agent (no lint/report tools); defaults to
`high-cap` when no model tier is specified
`composeSystemPrompt()` assembles directive + userInstructions + rule body with consistent double-newline
separation, used by both evaluators and the lint tool handler
Provider layer:
`RequestBuilder`/`EvalContext`; `runPromptStructured` now takes explicit
`(systemPrompt, userMessage, schema)` params
`StructuredPromptContext` carries `systemDirective` and `userInstructions` into the evaluator pipeline
CLI refactors (no behavior change):
`resolvePresetsDir()` fixes dev/dist path divergence in `validate` vs main commands
`resolveMatchedPromptsForFile()` unifies rule matching across standard and agent evaluation paths
`result-routing.ts` split into `check-routing`, `judge-routing`, `debug-artifact`, and `issue-sink` modules; `IssueSink` interface isolates all format branching from routing logic
`NoConfigurationFoundError` replaces bare `Error` for missing scan-path config
Correctness fixes:
`high-cap` by default (was: fell back to default provider, ignoring tier config)
even when never applied to any file)
`buildRuleName` filters empty pack segments (was: emitted `.RuleId` with leading dot when pack was empty)
when LLM returned different casing)
collisions
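The empty-pack filtering fix can be sketched as follows. The two-argument signature is an illustrative assumption; the real buildRuleName may take a different shape.

```typescript
// Join pack name and rule id with a dot, skipping empty or whitespace-only
// segments so an unset pack yields "RuleId" rather than ".RuleId".
function buildRuleName(packName: string | undefined, ruleId: string): string {
  return [packName, ruleId]
    .filter((segment): segment is string => Boolean(segment && segment.trim()))
    .join('.');
}
```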
Scope
In scope
`RequestBuilder` removal, tier-aware factory
Out of scope
Behavior impact
and the agent can spawn read-only sub-agents for document exploration
recognized (e.g.,
`OPENAI_HIGH_CAPABILITY_MODEL`, `OPENAI_LOW_CAPABILITY_MODEL`) in the per-rule score summary
(e.g.,
`RuleId` instead of `.RuleId`)
unchanged; the refactors are path-equivalent rewrites
Risk
but fewer calls total; output schema is validated against Zod so
malformed responses still surface as operational errors
`agent` tool; if the LLM never calls it, behavior is identical to before
`RequestBuilder` removal touches all providers; the simplified interface is tested via the provider contract test suite
How to test / verify
Checks run
`npm run test:run` — 346 tests, all passing
`npm run lint` — zero errors
Manual verification
`LLM_PROVIDER=openai` + `OPENAI_HIGH_CAPABILITY_MODEL=gpt-4.1` and run agent mode; confirm the
`agent` sub-tool uses `gpt-4.1`
rules do not appear in score output
`vectorlint validate` in a project using dev (tsx) and built (`dist`) modes; confirm preset resolution succeeds in both
Rollback
feat(agent): add provider capability tiers