Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 132323148d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| args_hash=args_hash, | ||
| input_fingerprint=input_fp, | ||
| ) | ||
| already_finished = can_resume and bool((state or {}).get("finished")) |
There was a problem hiding this comment.
Gate finished short-circuit on output availability
prepare_resume_context now marks runs as already_finished based only on the state file’s finished flag, but several command entrypoints (convert.main, segment.main, phase.main, radiomics.main) immediately return on that flag without verifying that the final output CSV still exists. If a previous run finalized state and the output file is later deleted or lost, --resume will no-op and leave the command with no produced output, which breaks recovery workflows that rely on checkpoint artifacts.
Useful? React with 👍 / 👎.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
This PR improves checkpoint resume behavior across long-running commands and fixes a segment-specific resume inefficiency.
When --resume is enabled and checkpoint state already indicates the run is finished, commands now short-circuit instead of re-entering execution.
Additionally, in segment, postprocessing is now skipped for rows that were already completed (i.e., outputs already exist and no task was re-run), preventing unnecessary overwrite work.
What changed
Added already_finished in resume context (run_state.prepare_resume_context).
parse, convert, segment, phase, and radiomics now return early when:
resume state matches current args/input fingerprint, and
checkpoint state has finished=true.
Moved expensive setup to occur after the early-finish check where relevant:
segment: config/model prefetch deferred until after resume-finished guard.
phase: phase extractor loading deferred.
radiomics: dependency loading and extractor creation deferred.
In segment_volume, added task-run tracking (ran_any_task).
If merged output already exists, force=False, and no task executed for that row, postprocessing is skipped.
This avoids re-running merge/cleanup for rows that were already processed before interruption.
Why this matters