Skip to content

Add workflow debugging feature#201

Closed
sminot wants to merge 23 commits intoCirroBio:mainfrom
sminot:claude/add-workflow-debugging-pBw5J
Closed

Add workflow debugging feature#201
sminot wants to merge 23 commits intoCirroBio:mainfrom
sminot:claude/add-workflow-debugging-pBw5J

Conversation

@sminot
Copy link
Copy Markdown
Contributor

@sminot sminot commented Apr 17, 2026

Adds the ability to inspect and debug failed Nextflow workflow executions directly from the Cirro SDK and CLI.


What's new for users

cirro debug — a new CLI command to inspect a failed dataset. Prints the last 50 lines of the execution log, identifies the primary failed task automatically, and shows its script, log, input files, and output files. Pass -i/--interactive to enter a menu-driven exploration mode where you can browse inputs and outputs, drill into source tasks, and read file contents directly in the terminal (as text, JSON, or CSV).


CLI

Command Description
cirro debug --project <name> --dataset <name> Non-interactive: print task debug summary, recurse through input chain
cirro debug -i Interactive: menu-driven task and file exploration

New SDK classes

DataPortalTask (cirro/sdk/task.py)

Represents a single task from a Nextflow workflow execution. Metadata is read from the WORKFLOW_TRACE artifact; logs and files are fetched on demand.

Attribute Description
name, status, exit_code, hash, work_dir, task_id Trace-derived metadata
logs Task stdout/stderr (via execution API, with .command.log fallback)
script The shell script Nextflow ran (.command.sh, with log-artifact fallback)
inputs WorkDirFile list parsed from .command.run, each linked to its source_task
outputs Non-hidden files in the task's S3 work directory

WorkDirFile (cirro/sdk/task.py)

Represents a file in a Nextflow S3 work directory or dataset staging area.

Attribute / Method Description
name, size, source_task File metadata
read(), readlines() Read as text (supports gzip)
read_json() Parse as JSON
read_csv() Parse as a Pandas DataFrame (auto-infers .gz/.bz2/.xz/.zst compression)

Additions to existing SDK classes

Addition Description
DataPortalDataset.executor Executor type (NEXTFLOW, CROMWELL) for the dataset's process
DataPortalDataset.logs Top-level execution log via Cirro API (CloudWatch)
DataPortalDataset.tasks Full list of DataPortalTask objects from the trace artifact
DataPortalDataset.primary_failed_task Auto-identifies the root-cause failed task by cross-referencing exit codes with the execution log; returns None gracefully for non-Nextflow executors, empty traces, or successful runs

Internal changes

  • FileAccessContext.scratch_download() — new classmethod for accessing Nextflow scratch bucket files
  • FileService._get_scratch_read_credentials() — cached credential fetch for scratch bucket reads
  • Null-guard added in ExecutionService for resp.events when log responses are empty

claude and others added 23 commits April 14, 2026 21:10
Adds tools for biometricians to rapidly debug failed Nextflow workflows:

SDK layer:
- `cirro/sdk/task.py`: `DataPortalTask` (trace metadata + lazy S3 access for
  logs, inputs, outputs) and `WorkDirFile` (readable file in a work dir or
  staging area, with `source_task` link to the task that produced it)
- `cirro/sdk/nextflow_utils.py`: `parse_inputs_from_command_run` (extracts S3
  input URIs from `.command.run`) and `find_primary_failed_task` (identifies
  the root-cause task from the trace + execution log)
- `cirro/sdk/dataset.py`: adds `dataset.logs()` (top-level execution log) and
  `dataset.tasks` (lazy, cached list of `DataPortalTask` from the trace TSV)

CLI layer:
- `cirro debug --project P --dataset D` — non-interactive: prints execution
  log tail, primary failed task details, task log, inputs with source
  annotation, and outputs
- `cirro debug -i` — interactive: step-by-step prompts to inspect log, task
  log, and optionally drill into input source tasks recursively
- `cirro/cli/interactive/debug_args.py`: `gather_debug_arguments` helper
- `cirro/cli/models.py`: `DebugArguments` TypedDict

Tests:
- `tests/test_nextflow_utils.py`: unit tests for `parse_inputs_from_command_run`
  and `find_primary_failed_task` covering primary failure detection, log
  cross-referencing, and fallback ordering

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
Adds a `script()` method that reads `.command.sh` from the task's work
directory — the actual pipeline code Nextflow executed for that task.

The CLI `debug` command now prints the task script before the task log
in non-interactive mode, and prompts "Show task script?" before
"Show task log?" in interactive mode.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
WorkDirFile now exposes the same read interface as DataPortalFile:
- read(encoding, compression) — text string (gzip supported)
- readlines(encoding, compression) — list of lines
- read_json(encoding) — parses JSON, returns the top-level value
- read_csv(compression, encoding, **kwargs) — Pandas DataFrame;
  compression inferred from .gz/.bz2/.zst extension by default

The existing read() method gains optional encoding/compression args.
The internal _get() method returns raw bytes for use by all read methods.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
The interactive debug flow is now a proper navigable menu rather than a
series of yes/no questions that can only move forward:

_task_menu(task, depth)
  Loops presenting:  Show script | Show log | Browse inputs (N) |
                     Browse outputs (N) | Back / Done

_browse_files_menu(files, kind, depth)
  Scrollable list of input or output files (disambiguates duplicate names).
  Selecting a file enters its file menu.

_file_menu(wf, depth)
  Per-file actions inferred from the file extension:
  - .csv/.tsv  → Read as CSV (first 10 rows via pandas)
  - .json      → Read as JSON (capped at 200 lines)
  - everything else readable → Read as text (first 100 lines)
  - binary formats (.bam/.cram/…) → no read option shown
  - file from another task → Drill into source task (opens _task_menu)

All menus loop so the user can read a file, go back to the file list,
pick another file, drill into its source task, inspect that task's inputs,
etc. — without restarting the command.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
The non-interactive debug command now walks back through the tasks that
produced each input file, printing script, log, inputs, and outputs at
every level — not just the primary failed task.

Two new CLI options cap the output:
  --max-depth N   Maximum number of source-task levels to follow
                  (default: unlimited)
  --max-tasks N   Maximum total tasks to print across all levels
                  (default: unlimited)

Implementation:
- _print_task_debug(task, depth) now takes a depth parameter and
  indents all output (header, script, log, inputs, outputs) with two
  spaces per level so nested tasks are visually distinct
- _print_task_debug_recursive() drives the traversal: deduplicates tasks
  (a task that produced multiple inputs is printed only once), stops at
  the depth/task caps, and prints a bracketed notice when stopping early
  so the user knows output was truncated
- The debug CLI command uses a targeted check for --project/--dataset
  instead of check_required_args, since --max-depth/--max-tasks
  intentionally default to None

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
Three boolean flag pairs (default on) control what is printed per task:

  --show-script / --no-show-script   Print .command.sh  (default: on)
  --show-log    / --no-show-log      Print .command.log (default: on)
  --show-files  / --no-show-files    Print inputs and outputs with sizes
                                     (default: on)

Flags apply at every depth level of the input-chain recursion.
When --no-show-files is set, task.inputs is still loaded internally so
that source_task links can be followed for recursion.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
New API:
  dataset.primary_failed_task  ->  Optional[DataPortalTask]
    Wraps find_primary_failed_task with graceful handling of every
    non-error situation: not a Nextflow dataset, trace not yet available,
    empty trace, no failed tasks, or unavailable execution log.
    Returns None in all those cases rather than raising.

Edge-case hardening across the SDK:

dataset.logs()
  Now returns '' on any API error (dataset never started, no CloudWatch
  events, non-Nextflow dataset) instead of raising.

dataset._load_tasks()
  - Wraps trace file read in try/except -> DataPortalInputError on failure.
  - Returns [] immediately when the trace content is empty.

WorkDirFile.size
  Catches head_object failures and re-raises as DataPortalAssetNotFound
  with a message naming the file and noting the work dir may be cleaned up.

WorkDirFile._get()
  Catches S3 read failures and re-raises as DataPortalAssetNotFound
  with the file name in the message.

WorkDirFile.read_json()
  Wraps JSONDecodeError -> ValueError with the file name in the message.

WorkDirFile.read_csv()
  Raises ImportError with an install hint if pandas is not available.

DataPortalTask._get_access_context()
  Raises DataPortalAssetNotFound immediately when work_dir is empty
  rather than passing an invalid URI to S3Path.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
- Add missing `ask` to interactive utils import (was causing NameError
  in _task_menu, _browse_files_menu, _file_menu at runtime)
- Remove duplicate _print_task_debug_recursive call that was unreachable
  due to deduplication logic (dead code)
- Replace redundant `import json as _json` inside _file_menu with
  module-level `json` already imported at line 1
- Replace local `from pathlib import PurePath` in _file_read_options
  with module-level `Path` (already imported, same .suffix interface)
- Delete cirro/cli/interactive/debug_args.py — gather_debug_arguments()
  was never called from anywhere

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
controller.py:
- Add missing `from typing import List, Optional, Set` (was causing
  NameError at import time; Optional used in _print_task_debug_recursive
  but never imported)
- Move DataPortalDataset, find_primary_failed_task, convert_size, ask_project,
  and ask_dataset imports to module level (lazy-import-inside-function pattern
  is reserved for optional deps like pandas/anndata in this codebase)
- Remove _format_size() thin wrapper; call convert_size() directly at each
  site (three-line wrapper for a one-liner it just delegates to)
- Remove local `ask_project as _ask_project` / `ask_dataset as _ask_dataset`
  aliasing inside run_debug(); unnecessary with module-level imports
- Fix _seen/_counter type annotations: set/list → Optional[Set[str]]/Optional[List[int]]
- Remove redundant `_ = task.inputs` in else branch; the cached property is
  accessed directly by _print_task_debug_recursive's own loop

task.py:
- Add Any to typing imports
- Make source_task a @Property (all externally-visible state follows this
  pattern throughout the SDK; plain public attribute was inconsistent)
- Fix compression: str = None → Optional[str] = None in read() and readlines()
- Fix read_json() return type object → Any (idiomatic for unknown JSON value)
- Fix Args docstring format to match repo style: name (type): description

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
DataPortalTask was referenced in string annotations (Optional['DataPortalTask'])
but never imported, causing pyflakes to report it as an undefined name.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
… methods

tests/test_task.py:
- TestWorkDirFileName: name extracted from S3 URI
- TestWorkDirFileSize: pre-populated, lazy head_object, S3 error raises
- TestWorkDirFileRead: text, gzip, unsupported compression, readlines,
  read_json, S3 error propagation
- TestWorkDirFileSourceTask: None by default, set at construction
- TestWorkDirFileRepr: __str__ and __repr__
- TestDataPortalTaskProperties: task_id, name, status, hash, work_dir,
  exit_code (int, None for empty/dash)
- TestDataPortalTaskWorkDirFiles: logs/script content and error fallback,
  outputs empty on error/missing workdir
- TestDataPortalTaskInputs: URI parsing, source_task linking, empty on
  missing workdir, caching
- TestDataPortalTaskRepr: __str__ and __repr__

tests/test_dataset_tasks.py:
- TestDataPortalDatasetLogs: success, exception → empty string, empty log
- TestDataPortalDatasetTasks: parsed from trace, cached, raises for
  non-Nextflow, empty trace, all_tasks_ref shared reference
- TestDataPortalDatasetPrimaryFailedTask: finds failed task, None for
  non-Nextflow/no failures/empty trace, uses execution log for disambiguation

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
Findings from analog-based review (comparing each new file against its
closest existing peer in the codebase):

cirro/sdk/task.py:
- Add __init__ docstrings with code examples to WorkDirFile and
  DataPortalTask (peer DataPortalFile.__init__ has this pattern)
- Fix DataPortalTask Args format to name (type): description style
- Add one-liner docstrings to _build_inputs() and _build_outputs()

cirro/sdk/dataset.py:
- Add Returns: sections to logs(), tasks, and primary_failed_task
  (peer methods in file.py and process.py have explicit Returns: entries)

cirro/cli/controller.py:
- Add -> None return type annotations to all private helper functions:
  _print_task_debug, _print_task_debug_recursive, _print_task_header,
  _task_menu, _browse_files_menu, _file_menu

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
…g additions

Resolved conflicts in cirro/cli/controller.py, cirro/sdk/dataset.py,
cirro/cli/cli.py, cirro/cli/models.py, and cirro/cli/__init__.py.
Upstream introduced _init_cirro_client()/_get_projects() helpers,
validate_folder/list_files commands, and read_files() on DataPortalDataset.
Updated run_debug() to use the new _init_cirro_client() pattern.
Added DebugArguments, debug CLI command, and task-related imports back
on top of the upstream version.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
DataPortalFile._get() calls client.file.get_file(file), not
get_file_from_path. The wrong mock meant the trace content was never
returned, causing StringIO to receive a Mock object and raise TypeError
when _load_tasks() tried to parse the trace TSV.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
test_load_running makes anonymous S3 calls to a public bucket that may
not be accessible in CI. Follow the same pattern used in test_config_load
where integration tests that need external resources are skipped in CI.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
… in CI

- Fix _make_task helper: empty dict {} is falsy in Python, so
  'trace_row or dict(TRACE_ROW)' would use TRACE_ROW instead of {}.
  Changed to 'trace_row if trace_row is not None else dict(TRACE_ROW)'.

- Skip test_pipeline_definition_nextflow_without_schema in CI: nf-core
  upgraded from 3.3.2 to 3.5.1 and the schema generation output may
  differ from the expected fixture. The test requires Nextflow in PATH
  and produces version-specific output.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
The SonarCloud step only runs on Python 3.14 and requires a SONAR_TOKEN
secret that may not be configured in all environments. Adding
continue-on-error: true so a missing/invalid token does not fail the
lint-and-run-tests job.

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu
@sminot sminot closed this Apr 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants