feat: DataAPICLIErgonomics — caching, REPL, help, dry-run, metadata (#35)
darko-mijic merged 3 commits into main
Conversation
Complete all 6 deliverables for Phase 25d DataAPICLIErgonomics:

1. MasterDataset cache (dataset-cache.ts) — sha256 mtime-based invalidation, drops repeated queries from 2-5s to <200ms. Stores at node_modules/.cache/delivery-process/dataset.json with atomic writes.
2. REPL mode (repl.ts) — interactive multi-query sessions with the pipeline loaded once. Supports reload, help, quit. Dual mode: TTY (interactive prompts) and piped (batch processing for tests).
3. Per-subcommand help — `process-api context --help` shows usage, flags, and examples from CLI_SCHEMA.commandNarratives.
4. Dry-run mode — `--dry-run` resolves globs and shows pipeline scope without running scan/extract/transform.
5. Validation summary in metadata — QueryMetadataExtra enriches every response with validation counts, cache status, and pipeline timing.
6. FSM short-circuit (previously complete) — static FSM queries skip the full pipeline.

Tests: 63 new tests across 5 feature files (cache, metadata, help, dry-run, REPL). Full suite: 8256 tests, 0 regressions. Unblocks MCPServerIntegration (Phase 46).
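Deliverable 1's mtime-based invalidation can be sketched as a pure fingerprint function: hash the query options plus each watched file's path and mtime, so any edit (or file addition/removal) changes the key and forces a rebuild. This is an illustrative sketch, not the actual dataset-cache.ts code; `computeCacheKey` here takes explicit paths rather than resolving globs.

```typescript
import { createHash } from 'node:crypto';
import { statSync } from 'node:fs';

// Fingerprint = sha256(options JSON + sorted "path:mtime" pairs).
// A missing file contributes an 'absent' sentinel so deletions invalidate too.
function computeCacheKey(paths: string[], optsJson: string): string {
  const hash = createHash('sha256');
  hash.update(optsJson);
  for (const p of [...paths].sort()) {
    let mtime = 'absent';
    try {
      mtime = String(statSync(p).mtimeMs);
    } catch {
      // keep the sentinel: file does not exist (yet)
    }
    hash.update(`${p}:${mtime};`);
  }
  return hash.digest('hex');
}
```

The key is deterministic for unchanged inputs, which is what makes the <200ms cache-hit path safe.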
📝 Walkthrough
This PR implements CLI ergonomics: dataset caching with mtime-based invalidation, a dry-run mode that reports pipeline scope, per-subcommand help, an interactive REPL for multi-query sessions, and enriched JSON response metadata (validation summary, cache info, pipeline timing).
Sequence Diagrams

```mermaid
sequenceDiagram
    actor User
    participant CLI as Process API CLI
    participant Cache as Cache Module
    participant FS as FileSystem
    participant Pipeline as Pipeline Executor
    User->>CLI: invoke process-api status --opts
    CLI->>Cache: computeCacheKey(opts)
    Cache->>FS: stat input/features/config files
    FS-->>Cache: mtimes
    Cache-->>CLI: cacheKey
    CLI->>Cache: tryLoadCache(cacheKey)
    Cache->>FS: read cache file
    alt cache hit
        FS-->>Cache: cached payload
        Cache->>Cache: rehydrate runtime dataset
        Cache-->>CLI: CachedResult (hit=true, age, dataset)
        CLI->>CLI: buildQueryMetadataExtra(cache hit, age)
        CLI-->>User: JSON response + metadata
    else cache miss
        FS-->>Cache: miss/error
        Cache-->>CLI: undefined
        CLI->>Pipeline: execute pipeline
        Pipeline-->>CLI: PipelineResult (dataset, validation)
        CLI->>Cache: writeCache(result, cacheKey)
        Cache->>FS: write cache file (atomic)
        CLI->>CLI: buildQueryMetadataExtra(hit=false, pipelineMs)
        CLI-->>User: JSON response + metadata
    end
```
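The `buildQueryMetadataExtra` calls in the cache flow above can be approximated by a small constructor over the two branches. The field names follow the PR description (cache status, pipeline timing); the exact shape in src/api/types.ts is an assumption here.

```typescript
// Hit responses carry cache age; miss responses carry pipeline duration.
interface QueryMetadataExtra {
  cache: { hit: boolean; ageMs?: number };
  pipelineMs?: number;
}

function buildQueryMetadataExtra(
  hit: boolean,
  ageMs?: number,
  pipelineMs?: number
): QueryMetadataExtra {
  if (hit) {
    // age is only meaningful when the cached payload was reused
    return { cache: { hit: true, ...(ageMs !== undefined ? { ageMs } : {}) } };
  }
  // a miss ran the full pipeline, so report how long that took
  return {
    cache: { hit: false },
    ...(pipelineMs !== undefined ? { pipelineMs } : {}),
  };
}
```

Keeping the constructor in one place means every subcommand's JSON envelope reports cache state the same way.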
```mermaid
sequenceDiagram
    actor User
    participant CLI as Process API CLI
    participant REPL as REPL Module
    participant Pipeline as Pipeline Executor
    participant Output as Formatter
    User->>CLI: process-api repl
    CLI->>REPL: startRepl(opts)
    REPL->>Pipeline: load pipeline once
    Pipeline-->>REPL: RuntimeMasterDataset + ValidationSummary
    loop multi-query session
        User->>REPL: command (status/list/pattern/reload/quit)
        alt reload
            REPL->>Pipeline: reload from sources
            Pipeline-->>REPL: updated dataset
            REPL->>Output: format "Reloaded" message
        else query
            REPL->>REPL: dispatch command
            REPL->>Output: format result
        end
        Output-->>REPL: formatted output
        REPL-->>User: stdout/stderr
    end
```
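The REPL loop above boils down to a dispatcher over a long-lived state object. A toy version, with a stub dataset in place of the real pipeline (the command names follow the PR description; the rest is illustrative):

```typescript
interface Dataset { loadedAt: number }

interface ReplState {
  dataset: Dataset;
  done: boolean;
}

// One parsed line in, one formatted reply out; 'reload' swaps the dataset
// in place so later queries see fresh data without restarting the session.
function dispatch(state: ReplState, line: string): string {
  const [cmd] = line.trim().split(/\s+/);
  switch (cmd) {
    case 'help':
      return 'commands: status | reload | help | quit';
    case 'status':
      return `dataset loaded at ${state.dataset.loadedAt}`;
    case 'reload':
      state.dataset = { loadedAt: Date.now() }; // stands in for re-running the pipeline
      return 'Reloaded';
    case 'quit':
      state.done = true;
      return 'bye';
    default:
      return `unknown command: ${cmd}`;
  }
}
```

Because each turn is a function of (state, line), the same dispatcher serves both TTY prompts and piped batch input.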
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 9
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cli/dataset-cache.ts`:
- Around line 136-149: Replace the blind JSON.parse of the cached file with a
runtime-validated parse using a Zod schema: define (or import) a
CachedPipelineData Zod schema in src/validation-schemas and after reading raw
from cachePath run the raw string through the schema (not a plain TypeScript
cast) and if validation fails return undefined; keep the existing version/key
checks against CACHE_VERSION and cacheKey and then proceed to reconstruct the
RuntimeMasterDataset and call createLoadedWorkflow only when the validated
payload is present and valid so corrupt/partial cache files result in a cache
miss instead of rehydration errors.
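The prompt above asks for a Zod `safeParse`; the same fail-to-miss contract can be shown without the dependency as a hand-rolled guard (the field set here is trimmed to three keys for illustration, not the full CachedPipelineData):

```typescript
interface CachedPipelineData {
  key: string;
  version: number;
  dataset: unknown;
}

// Corrupt JSON, wrong shape, stale key, or stale version all degrade to a
// cache miss (undefined) instead of throwing, per the DD-4 never-throw policy.
function parseCachedPayload(
  raw: string,
  expectedKey: string,
  expectedVersion: number
): CachedPipelineData | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return undefined;
  }
  if (typeof parsed !== 'object' || parsed === null) return undefined;
  const obj = parsed as Record<string, unknown>;
  if (typeof obj.key !== 'string' || typeof obj.version !== 'number') return undefined;
  if (obj.key !== expectedKey || obj.version !== expectedVersion) return undefined;
  if (!('dataset' in obj)) return undefined;
  return obj as unknown as CachedPipelineData;
}
```

With Zod, the structural checks collapse to `schema.safeParse(parsed)` followed by the key/version comparison.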
- Around line 110-118: computeCacheKey currently only stats
delivery-process.config.ts which misses projects using
delivery-process.config.js; update the logic that builds fileMtimes (where
configPath and configStat are used) to check for both delivery-process.config.ts
and delivery-process.config.js (or reuse the same resolution used by the config
loader) and push the mtime or 'absent' sentinel for the actual file found (or
both filenames) so edits to either config affect the cache fingerprint.
- Around line 202-206: The current write uses a fixed tmpPath
(`${cachePath}.tmp`) which breaks atomicity across concurrent runs; change the
temp file to be unique per writer (e.g., include process id, timestamp, or a
crypto-random suffix) before calling fsp.writeFile and then fsp.rename into
cachePath so each process writes to its own tmp file and the final rename
remains atomic; update the usage around cachePath/tmpPath in the function
handling the write (the logic that calls fsp.writeFile and fsp.rename with
cacheData) to generate and use that unique tmp filename.
In `@src/cli/repl.ts`:
- Around line 133-145: The depth parsed from subArgs via parseInt (depthVal →
depth) is not validated and may be NaN before being passed to buildDepTree;
update the parsing logic in repl.ts to validate depthVal after parseInt (use
Number.isInteger or Number.isNaN) and only assign depth if the parsed value is a
finite integer within an acceptable range (e.g., >=0), otherwise keep the
default depth (3) or surface a clear error to the user; ensure the validated
depth is what gets passed to buildDepTree({ pattern: patternArg, maxDepth:
depth, ... }).
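The suggested `--depth` guard is a few lines; a sketch with an assumed signature (the actual flag handling in repl.ts may differ):

```typescript
// Accept only a finite non-negative integer; anything else keeps the default.
function parseDepth(value: string | undefined, fallback = 3): number {
  if (value === undefined) return fallback;
  const parsed = Number.parseInt(value, 10);
  if (!Number.isInteger(parsed) || parsed < 0) return fallback;
  return parsed;
}
```

Note that `Number.parseInt('5x', 10)` still yields 5; if trailing garbage should also be rejected, `Number(value)` combined with an integer check is stricter.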
In `@tests/features/cli/data-api-dryrun.feature`:
- Around line 19-28: The scenario "Dry-run shows file counts" only asserts the
DRY RUN marker, file counts, and absence of "success" but misses checks for
config and cache reporting; update this feature
(tests/features/cli/data-api-dryrun.feature) to assert that the dry-run stdout
also contains the config status and cache status strings produced by the CLI
(e.g., phrases like "config status" or the exact config detection message
emitted by process-api and the cache state message), keeping the existing exit
code, marker, file-count, and "does not contain 'success'" assertions intact.
In `@tests/features/cli/data-api-help.feature`:
- Around line 19-26: The feature test for per-subcommand help is missing an
assertion that examples are rendered; update the "Per-subcommand help for
context" scenario that runs "process-api context --help" to also assert that
stdout contains the example text for the context subcommand (pull the expected
example string from CLI_SCHEMA.commandNarratives for the "context" command or
assert presence of the word "Examples:" and at least one example line). Ensure
the step references the existing scenario/step ("When running \"process-api
context --help\"") and fails if examples are not present so the contract that
examples appear in per-subcommand help is enforced.
In `@tests/features/cli/data-api-metadata.feature`:
- Around line 22-29: Add the required `@acceptance-criteria` tag to at least one
scenario (e.g., the "Scenario: Validation summary in response metadata") so the
spec meets the guideline; update the scenario's annotation line to include
`@acceptance-criteria` (or add a new scenario with that tag) ensuring the rest of
the steps remain unchanged.
In `@tests/steps/cli/data-api-metadata.steps.ts`:
- Around line 91-115: The test currently only checks that metadata.validation.*
fields are numbers; modify the scenario to seed a deterministic validation
failure via the existing helper writePatternFiles(state) (or add a new fixture
writer called from the Given step) so the pipeline produces known counts (e.g.,
one dangling reference and one malformed pattern), then replace the loose typeof
assertions in the And step that uses
parseResponseMetadata(getResult(state).stdout) with exact equality checks
(expect(metadata.validation!.danglingReferenceCount).toBe(1),
expect(metadata.validation!.malformedPatternCount).toBe(1), etc.) to prove the
counts come from the pipeline; update or extend writePatternFiles (or create a
new helper invoked by the Given 'TypeScript files with pattern annotations'
step) to write a TypeScript file containing a deliberately malformed pattern and
a pattern that references a missing symbol so the expected numeric counts are
deterministic.
In `@tests/support/helpers/cli-runner.ts`:
- Around line 113-117: The stdin write block that does "if (stdinData !==
undefined) { child.stdin.write(stdinData); child.stdin.end(); }" can emit an
unhandled EPIPE when the child exits early; add an error handler on child.stdin
(e.g., child.stdin.on('error', err => { if (err.code === 'EPIPE') return; /*
optionally log other errors */ })) before calling child.stdin.write/end so
stream errors are swallowed or handled and do not crash the test helper while
still allowing the child process error path to be returned by the existing
child.on('error') handler.
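The EPIPE guard described above can be isolated behind a small helper; `StdinLike` is an illustrative duck type so the logic is testable without spawning a process:

```typescript
interface StdinLike {
  on(event: 'error', handler: (err: NodeJS.ErrnoException) => void): unknown;
  write(data: string): boolean;
  end(): void;
}

// Attach the error handler BEFORE writing so an early-exiting child
// cannot surface an unhandled EPIPE from its stdin stream.
function writeStdinSafely(stdin: StdinLike | null, data?: string): void {
  if (data === undefined || stdin === null) return;
  stdin.on('error', (err) => {
    if (err.code === 'EPIPE') return; // child exited early; safe to ignore
    // other stream errors still flow through the child's 'error' event
  });
  stdin.write(data);
  stdin.end();
}
```

In the real helper the argument would be `child.stdin`, which satisfies this interface.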
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b711b532-2849-46af-adda-955fb8d84a96
⛔ Files ignored due to path filters (10)
- docs-live/ARCHITECTURE.md
- docs-live/BUSINESS-RULES.md
- docs-live/CHANGELOG-GENERATED.md
- docs-live/INDEX.md
- docs-live/PRODUCT-AREAS.md
- docs-live/_claude-md/core-types/core-types-overview.md
- docs-live/business-rules/data-api.md
- docs-live/product-areas/CORE-TYPES.md
- docs-live/product-areas/DATA-API.md
- docs-live/product-areas/GENERATION.md

Each is excluded by none and included by none.
📒 Files selected for processing (16)
- delivery-process/specs/data-api-cli-ergonomics.feature
- src/api/types.ts
- src/cli/dataset-cache.ts
- src/cli/process-api.ts
- src/cli/repl.ts
- tests/features/cli/data-api-cache.feature
- tests/features/cli/data-api-dryrun.feature
- tests/features/cli/data-api-help.feature
- tests/features/cli/data-api-metadata.feature
- tests/features/cli/data-api-repl.feature
- tests/steps/cli/data-api-cache.steps.ts
- tests/steps/cli/data-api-dryrun.steps.ts
- tests/steps/cli/data-api-help.steps.ts
- tests/steps/cli/data-api-metadata.steps.ts
- tests/steps/cli/data-api-repl.steps.ts
- tests/support/helpers/cli-runner.ts
… test coverage
- Include both .ts and .js config files in cache fingerprint (comment 001)
- Use unique temp filename (pid+timestamp) for atomic cache writes (comment 003)
- Validate parseInt result for --depth flag in REPL (comment 004)
- Add config/cache status assertions to dry-run test (comment 005)
- Add Usage: assertion to per-subcommand help test (comment 006)
- Add @acceptance-criteria tag to metadata test (comment 007)
- Add EPIPE error handler on child.stdin in CLI test runner (comment 009)
Actionable comments posted: 8
♻️ Duplicate comments (2)
src/cli/dataset-cache.ts (1)
40-53: ⚠️ Potential issue | 🟠 Major
Add a Zod schema for `CachedPipelineData` to validate the cache payload at runtime. These TypeScript interfaces are used for unsafe type assertions on disk data (line 138). A corrupt or partially written cache file could pass the version/key checks but fail later when accessing expected fields.
Define a Zod schema in `src/validation-schemas/` and use `.safeParse()` in `tryLoadCache()` to return a cache miss on validation failure. This aligns with DD-4 (never throw) while providing structural validation.
As per coding guidelines, "Define types using Zod schemas in `src/validation-schemas/` with runtime validation; do NOT use plain TypeScript interfaces for validation."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cli/dataset-cache.ts` around lines 40-53, add a Zod schema mirroring CacheMetadata and CachedPipelineData in `src/validation-schemas/` (include fields: key, timestamp, version, dataset, workflowConfig, validation, warnings, scanMetadata) and export its TypeScript type; then in tryLoadCache() replace the unsafe type assertion with schema.safeParse(parsedFile) and treat a failed parse as a cache miss (do not throw) so the function returns null/undefined per existing cache-miss semantics; ensure you reference the schema when parsing and keep version/key checks but only use the parsed value after safeParse succeeds.
tests/features/cli/data-api-help.feature (1)
19-20: ⚠️ Potential issue | 🟠 Major
The spec still doesn't enforce example rendering.
The Rule says per-subcommand help includes examples, but the `context --help` scenario only asserts usage and flags. If example output drops out of `CLI_SCHEMA.commandNarratives`, this feature still passes.
Also applies to: 22-28
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/features/cli/data-api-help.feature` around lines 19 - 20, The feature test currently checks only usage and flags for the "context --help" scenario but doesn't assert that per-subcommand examples are rendered; update the scenario(s) in tests/features/cli/data-api-help.feature (including the "context --help" scenario and lines 22-28) to assert that the help output contains the example text defined in CLI_SCHEMA.commandNarratives for that command, e.g., extract the expected example string from CLI_SCHEMA.commandNarratives[<command>].example (or the appropriate key) and add an assertion that the CLI output includes that example so the spec enforces example rendering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cli/dataset-cache.ts`:
- Around line 203-210: The atomic write is fine but orphaned temp files may
accumulate if fsp.rename fails; update the try/catch around the write/rename in
the function that defines cachePath and tmpPath so the catch attempts to remove
the tmpPath (use fsp.unlink or equivalent) and ignore ENOENT errors, ensuring
cleanup of the `${cachePath}.${process.pid}.${Date.now()}.tmp` file created by
fsp.writeFile; keep the overall non-fatal behavior (do not rethrow) if cleanup
fails.
- Around line 83-92: The cache key generation omits PipelineOptions that affect
dataset output; update the JSON passed to hash.update (the block that currently
includes input, features, baseDir, workflowPath, mergeConflictStrategy,
includeValidation) to also include opts.exclude and opts.contextInferenceRules
(normalizing defaults such as null or empty arrays/objects) so changes to
excluded files or context inference rules invalidate the cache; locate the same
JSON object where hash.update(...) is called and add those two properties.
- Around line 144-150: The current code mutates the deserialized dataset via
Object.assign(dataset, { workflow }) which bypasses the readonly
RuntimeMasterDataset.workflow; instead construct a new immutable object
combining the parsed dataset and the rebuilt workflow (created by
createLoadedWorkflow(cached.workflowConfig)) and use that new object where the
original dataset was used (e.g., const datasetWithWorkflow: RuntimeMasterDataset
= { ...dataset, workflow }); ensure you reference cached.dataset and
cached.workflowConfig and replace downstream uses of dataset with
datasetWithWorkflow so the readonly semantics are preserved.
In `@src/cli/repl.ts`:
- Around line 57-60: The REPL JSON commands (status, pattern, list) are creating
success envelopes without the new validation/cache/timing metadata; update the
code paths that build responses to include the extra metadata carried on
ReplState by capturing state.validation during load/reload and passing it into
each createSuccess(...) call. Locate the ReplState usage in repl.ts and ensure
the handlers for status, pattern, and list forward state.validation (and any
cache/timing fields present on the ProcessStateAPI/RuntimeMasterDataset if
applicable) into the createSuccess calls so their JSON output matches the
process-api envelopes. Make the change in the load/reload flow so subsequent
calls reuse the populated validation metadata.
- Around line 67-79: Change loadPipeline to preserve the Result monad instead of
throwing: have loadPipeline return a Result (e.g., Result<PipelineResult,
PipelineError>) and when buildMasterDataset returns a non-ok result, return
Result.error(...) constructed from result.error (include step and message)
rather than throwing new Error; update the function signature and callers to
handle the returned Result. Apply the same pattern to the other spots in this
module that unwrap buildMasterDataset or otherwise throw on Result (the later
error-handling blocks that currently throw QueryApiError/Error) so all error
propagation in this file uses Result.error instead of exception-driven control
flow.
- Around line 113-118: The current parsing silently falls back to 'planning'
when the --session argument is missing or invalid; update the logic in the REPL
argument parsing (look for variables/session symbols sessionType, subArgs,
sessionIdx, sessionVal and helper isValidSessionType) to detect two error
conditions — a missing value after '--session' (sessionIdx !== -1 but sessionVal
is undefined) and an invalid value (sessionVal present but
!isValidSessionType(sessionVal)) — and in those cases throw or return an
INVALID_ARGUMENT error (or exit non‑zero) with a clear message instead of
defaulting to 'planning', otherwise set sessionType to the validated value.
In `@tests/features/cli/data-api-metadata.feature`:
- Around line 19-20: The feature currently only asserts metadata.validation and
pipelineMs but QueryMetadataExtra also sets metadata.cache; update the
duplicated second scenario to assert metadata.cache.hit exists (and assert
metadata.cache.ageMs is numeric when present) instead of repeating the
pipelineMs timing check so missing cache fields would fail; search for the
Scenario that duplicates the timing assertion and replace the pipelineMs check
with assertions for metadata.cache.hit and conditional numeric check for
metadata.cache.ageMs, and ensure step names/expectations reference
QueryMetadataExtra and metadata.cache.hit/ageMs.
In `@tests/steps/cli/data-api-dryrun.steps.ts`:
- Around line 75-80: The step definition "And('stdout contains dry run marker,
file counts, config, and cache status', () => { ... })" currently only asserts
TypeScript file output; add an assertion for the feature-file count by checking
the captured stdout (via getResult(state).stdout) contains the "Feature files:"
marker (e.g., expect(stdout).toContain('Feature files:')) so the dry-run file
counts contract validates both file types.
---
Duplicate comments:
In `@src/cli/dataset-cache.ts`:
- Around line 40-53: Add a Zod schema mirroring CacheMetadata and
CachedPipelineData in src/validation-schemas/ (include fields: key, timestamp,
version, dataset, workflowConfig, validation, warnings, scanMetadata) and export
its TypeScript type; then in tryLoadCache() replace the unsafe type assertion
with schema.safeParse(parsedFile) and treat a failed parse as a cache miss (do
not throw) so the function returns null/undefined per existing cache-miss
semantics; ensure you reference the schema when parsing and keep version/key
checks but only use the parsed value after safeParse succeeds.
In `@tests/features/cli/data-api-help.feature`:
- Around line 19-20: The feature test currently checks only usage and flags for
the "context --help" scenario but doesn't assert that per-subcommand examples
are rendered; update the scenario(s) in tests/features/cli/data-api-help.feature
(including the "context --help" scenario and lines 22-28) to assert that the
help output contains the example text defined in CLI_SCHEMA.commandNarratives
for that command, e.g., extract the expected example string from
CLI_SCHEMA.commandNarratives[<command>].example (or the appropriate key) and add
an assertion that the CLI output includes that example so the spec enforces
example rendering.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: bae5d844-2d55-4dcb-8d2b-616c4b5bb70b
📒 Files selected for processing (8)
- src/cli/dataset-cache.ts
- src/cli/repl.ts
- tests/features/cli/data-api-dryrun.feature
- tests/features/cli/data-api-help.feature
- tests/features/cli/data-api-metadata.feature
- tests/steps/cli/data-api-dryrun.steps.ts
- tests/steps/cli/data-api-help.steps.ts
- tests/support/helpers/cli-runner.ts
```ts
hash.update(
  JSON.stringify({
    input: opts.input,
    features: opts.features,
    baseDir,
    workflowPath: opts.workflowPath ?? null,
    mergeConflictStrategy: opts.mergeConflictStrategy,
    includeValidation: opts.includeValidation ?? true,
  })
);
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check if excluded options affect pipeline output by examining their usage
echo "=== Checking exclude usage ==="
rg -n -C3 'opts\.exclude|options\.exclude' --type=ts src/generators/pipeline/
echo "=== Checking contextInferenceRules usage ==="
rg -n -C3 'contextInferenceRules' --type=ts src/generators/pipeline/
echo "=== Checking failOnScanErrors usage ==="
rg -n -C3 'failOnScanErrors' --type=ts src/generators/pipeline/
```
Repository: libar-dev/delivery-process
Length of output: 7352
Cache key missing options that affect dataset output.
The hash omits `exclude` and `contextInferenceRules` from PipelineOptions, both of which affect the resulting dataset:
- `exclude` filters which files are scanned (lines 189 and 243 in build-pipeline.ts), changing the dataset file set
- `contextInferenceRules` affects context inference during transformation (lines 340-345 and 462), changing context assignments

Changing either option without invalidating the cache will return stale results.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli/dataset-cache.ts` around lines 83 - 92, The cache key generation
omits PipelineOptions that affect dataset output; update the JSON passed to
hash.update (the block that currently includes input, features, baseDir,
workflowPath, mergeConflictStrategy, includeValidation) to also include
opts.exclude and opts.contextInferenceRules (normalizing defaults such as null
or empty arrays/objects) so changes to excluded files or context inference rules
invalidate the cache; locate the same JSON object where hash.update(...) is
called and add those two properties.
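The normalization the prompt calls for matters because `JSON.stringify` renders a missing property and an empty array differently. A sketch with hypothetical option names (the real PipelineOptions shape may differ):

```typescript
import { createHash } from 'node:crypto';

interface KeyOptions {
  input: string[];
  exclude?: string[];
  contextInferenceRules?: Record<string, string>;
}

// Undefined and empty defaults hash identically, so only a real change
// to exclusions or inference rules invalidates the cache.
function cacheKeyFor(opts: KeyOptions): string {
  const normalized = {
    input: opts.input,
    exclude: opts.exclude ?? [],
    contextInferenceRules: opts.contextInferenceRules ?? {},
  };
  return createHash('sha256').update(JSON.stringify(normalized)).digest('hex');
}
```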
```ts
// Reconstruct RuntimeMasterDataset from plain MasterDataset + WorkflowConfig
const dataset = cached.dataset as RuntimeMasterDataset;
if (cached.workflowConfig !== null) {
  const workflow = createLoadedWorkflow(cached.workflowConfig);
  // Assign workflow back onto the deserialized dataset (Maps are not JSON-serializable)
  Object.assign(dataset, { workflow });
}
```
🧹 Nitpick | 🔵 Trivial
Workflow reconstruction approach is sound, but consider an immutable pattern.
`Object.assign(dataset, { workflow })` mutates the freshly deserialized object, which works but circumvents the readonly modifier on `RuntimeMasterDataset.workflow`. Consider creating a new object to maintain immutability semantics:
```ts
const datasetWithWorkflow: RuntimeMasterDataset = { ...dataset, workflow };
```
This is a minor style preference since the object is freshly parsed and not shared.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli/dataset-cache.ts` around lines 144 - 150, The current code mutates
the deserialized dataset via Object.assign(dataset, { workflow }) which bypasses
the readonly RuntimeMasterDataset.workflow; instead construct a new immutable
object combining the parsed dataset and the rebuilt workflow (created by
createLoadedWorkflow(cached.workflowConfig)) and use that new object where the
original dataset was used (e.g., const datasetWithWorkflow: RuntimeMasterDataset
= { ...dataset, workflow }); ensure you reference cached.dataset and
cached.workflowConfig and replace downstream uses of dataset with
datasetWithWorkflow so the readonly semantics are preserved.
```ts
const cachePath = path.join(cacheDir, 'dataset.json');
const tmpPath = `${cachePath}.${process.pid}.${Date.now()}.tmp`;

await fsp.writeFile(tmpPath, JSON.stringify(cacheData), 'utf-8');
await fsp.rename(tmpPath, cachePath);
} catch {
  // Cache write failure is not fatal — next run will rebuild
}
```
🧹 Nitpick | 🔵 Trivial
Atomic write pattern is correct with unique temp filename.
The pid.timestamp.tmp suffix addresses concurrent writer safety. The tmp-then-rename approach ensures readers never see partial writes.
Minor consideration: if rename fails after writeFile succeeds, the temp file remains orphaned. This is acceptable under DD-4's non-fatal policy, but over time could leave debris. Consider adding cleanup in the catch block:
♻️ Optional cleanup of orphaned temp file
```diff
   await fsp.writeFile(tmpPath, JSON.stringify(cacheData), 'utf-8');
   await fsp.rename(tmpPath, cachePath);
 } catch {
   // Cache write failure is not fatal — next run will rebuild
+  // Attempt cleanup of orphaned temp file
+  try { await fsp.unlink(tmpPath); } catch { /* ignore */ }
 }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
const cachePath = path.join(cacheDir, 'dataset.json');
const tmpPath = `${cachePath}.${process.pid}.${Date.now()}.tmp`;

await fsp.writeFile(tmpPath, JSON.stringify(cacheData), 'utf-8');
await fsp.rename(tmpPath, cachePath);
} catch {
  // Cache write failure is not fatal — next run will rebuild
  // Attempt cleanup of orphaned temp file
  try { await fsp.unlink(tmpPath); } catch { /* ignore */ }
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli/dataset-cache.ts` around lines 203 - 210, The atomic write is fine
but orphaned temp files may accumulate if fsp.rename fails; update the try/catch
around the write/rename in the function that defines cachePath and tmpPath so
the catch attempts to remove the tmpPath (use fsp.unlink or equivalent) and
ignore ENOENT errors, ensuring cleanup of the
`${cachePath}.${process.pid}.${Date.now()}.tmp` file created by fsp.writeFile;
keep the overall non-fatal behavior (do not rethrow) if cleanup fails.
```ts
interface ReplState {
  api: ProcessStateAPI;
  dataset: RuntimeMasterDataset;
  validation: ValidationSummary;
```
REPL JSON commands are dropping the new metadata fields.
status, pattern, and list all build success envelopes without the extra validation/cache/timing metadata that process-api now emits elsewhere. ReplState already carries validation, so this looks like unfinished wiring; capture the extra once during load/reload and pass it into each createSuccess(...) call.
Also applies to: 95-100, 150-167
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli/repl.ts` around lines 57 - 60, The REPL JSON commands (status,
pattern, list) are creating success envelopes without the new
validation/cache/timing metadata; update the code paths that build responses to
include the extra metadata carried on ReplState by capturing state.validation
during load/reload and passing it into each createSuccess(...) call. Locate the
ReplState usage in repl.ts and ensure the handlers for status, pattern, and list
forward state.validation (and any cache/timing fields present on the
ProcessStateAPI/RuntimeMasterDataset if applicable) into the createSuccess calls
so their JSON output matches the process-api envelopes. Make the change in the
load/reload flow so subsequent calls reuse the populated validation metadata.
```ts
async function loadPipeline(opts: ReplOptions): Promise<PipelineResult> {
  const result = await buildMasterDataset({
    input: opts.input,
    features: opts.features,
    baseDir: opts.baseDir,
    mergeConflictStrategy: 'fatal',
    ...(opts.workflowPath !== null ? { workflowPath: opts.workflowPath } : {}),
  });
  if (!result.ok) {
    throw new Error(`Pipeline error [${result.error.step}]: ${result.error.message}`);
  }
  return result.value;
}
```
🛠️ Refactor suggestion | 🟠 Major
Avoid exception-driven control flow in this src/ module.
buildMasterDataset() already returns a Result, but this code unwraps expected failure paths by throwing Error/QueryApiError. That makes REPL startup and command validation follow a different error path than the rest of the CLI and diverges from the repo's src/ error-handling convention. As per coding guidelines: "Use Result Monad pattern: functions return Result.ok(value) or Result.error(err) instead of throwing exceptions".
Also applies to: 108-112, 130-153
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli/repl.ts` around lines 67 - 79, Change loadPipeline to preserve the
Result monad instead of throwing: have loadPipeline return a Result (e.g.,
Result<PipelineResult, PipelineError>) and when buildMasterDataset returns a
non-ok result, return Result.error(...) constructed from result.error (include
step and message) rather than throwing new Error; update the function signature
and callers to handle the returned Result. Apply the same pattern to the other
spots in this module that unwrap buildMasterDataset or otherwise throw on Result
(the later error-handling blocks that currently throw QueryApiError/Error) so
all error propagation in this file uses Result.error instead of exception-driven
control flow.
```ts
let sessionType: SessionType = 'planning';
const sessionIdx = subArgs.indexOf('--session');
const sessionVal = sessionIdx !== -1 ? subArgs[sessionIdx + 1] : undefined;
if (sessionVal !== undefined && isValidSessionType(sessionVal)) {
  sessionType = sessionVal;
}
```
Reject invalid --session values instead of silently falling back to planning.
A typo or missing value after --session still returns a successful planning context, which is misleading. Treat that as INVALID_ARGUMENT so the REPL never produces the wrong result with exit code 0.
Suggested guard
let sessionType: SessionType = 'planning';
const sessionIdx = subArgs.indexOf('--session');
const sessionVal = sessionIdx !== -1 ? subArgs[sessionIdx + 1] : undefined;
- if (sessionVal !== undefined && isValidSessionType(sessionVal)) {
- sessionType = sessionVal;
+ if (sessionIdx !== -1) {
+ if (sessionVal === undefined || !isValidSessionType(sessionVal)) {
+ throw new QueryApiError(
+ 'INVALID_ARGUMENT',
+ 'Usage: context <pattern> [--session <type>]'
+ );
+ }
+ sessionType = sessionVal;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli/repl.ts` around lines 113 - 118: the current parsing silently falls back to 'planning' when the --session argument is missing or invalid. Update the REPL argument parsing (see sessionType, subArgs, sessionIdx, sessionVal, and the isValidSessionType helper) to detect two error conditions: a missing value after '--session' (sessionIdx !== -1 but sessionVal is undefined) and an invalid value (sessionVal present but !isValidSessionType(sessionVal)). In those cases throw or return an INVALID_ARGUMENT error (or exit non-zero) with a clear message instead of defaulting to 'planning'; otherwise set sessionType to the validated value.
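The guard assumes `isValidSessionType` is a type guard over a session-type union. A minimal sketch of that pattern, with hypothetical session names and a parse helper that rejects rather than defaults (all names here are illustrative assumptions):

```typescript
// Hypothetical session-type union; the repo's actual SessionType values may differ.
const SESSION_TYPES = ['planning', 'implementation', 'review'] as const;
type SessionType = (typeof SESSION_TYPES)[number];

// Type guard: narrows a raw CLI string to SessionType.
function isValidSessionType(value: string): value is SessionType {
  return (SESSION_TYPES as readonly string[]).includes(value);
}

// Parse helper following the suggested guard: default only when the flag
// is absent; a present-but-missing or invalid value is an error.
function parseSessionType(subArgs: string[]): SessionType {
  const sessionIdx = subArgs.indexOf('--session');
  if (sessionIdx === -1) return 'planning';
  const sessionVal = subArgs[sessionIdx + 1];
  if (sessionVal === undefined || !isValidSessionType(sessionVal)) {
    throw new Error('INVALID_ARGUMENT: Usage: context <pattern> [--session <type>]');
  }
  return sessionVal;
}

console.log(parseSessionType(['foo/**', '--session', 'review'])); // prints "review"
```

With this shape, a typo like `--session plannig` surfaces as a non-zero exit instead of a misleading planning context.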
**Invariant:** Every JSON response envelope must include a metadata.validation object with danglingReferenceCount, malformedPatternCount, unknownStatusCount, and warningCount fields, plus a numeric pipelineMs timing.
**Rationale:** Consumers use validation counts to detect annotation quality degradation without running a separate validation pass. Pipeline timing enables performance regression detection in CI.
metadata.cache is missing from the acceptance coverage.
QueryMetadataExtra now includes cache hit/miss data too, but this spec only exercises metadata.validation and pipelineMs. Because the second scenario duplicates the first scenario's timing assertion, a regression that drops metadata.cache will still pass. Repurpose that scenario to assert metadata.cache.hit (and ageMs when present).
Also applies to: 31-36
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/features/cli/data-api-metadata.feature` around lines 19 - 20, The
feature currently only asserts metadata.validation and pipelineMs but
QueryMetadataExtra also sets metadata.cache; update the duplicated second
scenario to assert metadata.cache.hit exists (and assert metadata.cache.ageMs is
numeric when present) instead of repeating the pipelineMs timing check so
missing cache fields would fail; search for the Scenario that duplicates the
timing assertion and replace the pipelineMs check with assertions for
metadata.cache.hit and conditional numeric check for metadata.cache.ageMs, and
ensure step names/expectations reference QueryMetadataExtra and
metadata.cache.hit/ageMs.
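The repurposed assertions could look like this standalone checker (field names follow the review comment; the envelope shape is an assumption, not the repo's actual types):

```typescript
// Assumed metadata shape per the review: cache.hit required, cache.ageMs optional.
interface CacheMeta {
  hit: boolean;
  ageMs?: number;
}
interface ResponseMetadata {
  cache?: CacheMeta;
}

// Mirrors the suggested scenario: require metadata.cache.hit to exist,
// and require ageMs to be numeric only when it is present.
function checkCacheMetadata(metadata: ResponseMetadata): string[] {
  const failures: string[] = [];
  if (metadata.cache === undefined) {
    failures.push('metadata.cache is missing');
  } else {
    if (typeof metadata.cache.hit !== 'boolean') {
      failures.push('metadata.cache.hit must be a boolean');
    }
    if (metadata.cache.ageMs !== undefined && typeof metadata.cache.ageMs !== 'number') {
      failures.push('metadata.cache.ageMs must be numeric when present');
    }
  }
  return failures;
}

console.log(checkCacheMetadata({ cache: { hit: true, ageMs: 42 } })); // prints []
```

Because `metadata.cache` is asserted directly, a regression that drops the field now fails instead of sliding past a duplicated timing check.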
And('stdout contains dry run marker, file counts, config, and cache status', () => {
  const stdout = getResult(state).stdout;
  expect(stdout).toContain('DRY RUN');
  expect(stdout).toContain('TypeScript files:');
  expect(stdout).toContain('Config:');
  expect(stdout).toContain('Cache:');
Assert the feature-file count as well.
This step only checks TypeScript files: even though the dry-run output includes both TypeScript and feature counts. If Feature files: disappears, the scenario still passes despite its "file counts" contract.
Suggested assertion
And('stdout contains dry run marker, file counts, config, and cache status', () => {
const stdout = getResult(state).stdout;
expect(stdout).toContain('DRY RUN');
expect(stdout).toContain('TypeScript files:');
+ expect(stdout).toContain('Feature files:');
expect(stdout).toContain('Config:');
expect(stdout).toContain('Cache:');
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/steps/cli/data-api-dryrun.steps.ts` around lines 75 - 80, The step
definition "And('stdout contains dry run marker, file counts, config, and cache
status', () => { ... })" currently only asserts TypeScript file output; add an
assertion for the feature-file count by checking the captured stdout (via
getResult(state).stdout) contains the "Feature files:" marker (e.g.,
expect(stdout).toContain('Feature files:')) so the dry-run file counts contract
validates both file types.
Summary
- MasterDataset cache (`dataset-cache.ts`) — sha256 mtime-based invalidation drops repeated queries from 2-5s to <200ms. Cache stored at `node_modules/.cache/delivery-process/dataset.json` with atomic writes. Bypass with `--no-cache`.
- REPL mode (`repl.ts`) — `process-api repl` loads the pipeline once for interactive multi-query sessions. Supports `reload`, `help`, `quit`. Dual-mode: TTY (interactive prompts) and piped (batch for tests/scripts).
- Per-subcommand help — `process-api context --help` renders usage, applicable flags, and examples from `CLI_SCHEMA.commandNarratives`.
- Dry-run mode — `--dry-run` resolves globs and prints pipeline scope without running scan/extract/transform.
- Validation summary in metadata — `QueryMetadataExtra` enriches every response with validation counts, cache hit/miss status, and pipeline timing.

Unblocks MCPServerIntegration (Phase 46) — the single most important feature for the product roadmap.
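The mtime-based invalidation can be sketched as a pure key derivation (a simplified sketch; the real `dataset-cache.ts` may hash different inputs or a different encoding):

```typescript
import { createHash } from 'node:crypto';

// One entry per watched input file: path plus its last-modified time.
interface FileStamp {
  path: string;
  mtimeMs: number;
}

// sha256 over sorted path:mtime pairs — touching any watched file
// changes the key, which invalidates the cached dataset.
function computeCacheKey(stamps: FileStamp[]): string {
  const h = createHash('sha256');
  for (const s of [...stamps].sort((a, b) => a.path.localeCompare(b.path))) {
    h.update(`${s.path}:${s.mtimeMs};`);
  }
  return h.digest('hex');
}

const key = computeCacheKey([
  { path: 'src/a.ts', mtimeMs: 1700000000000 },
  { path: 'features/b.feature', mtimeMs: 1700000001000 },
]);
console.log(key.length); // prints 64 (sha256 hex digest)
```

Sorting the stamps first makes the key independent of glob-resolution order, so only real file changes (not enumeration order) bust the cache.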
Test plan
- `pnpm validate:all` passes (including DoD for all 6 deliverables)
- `pnpm docs:all` regenerated

Summary by CodeRabbit
New Features
Tests
Chores