Add workflow debugging feature by sminot · Pull Request #201 · CirroBio/Cirro-SDK-Python

sminot · 2026-04-17T21:06:40Z

Adds the ability to inspect and debug failed Nextflow workflow executions directly from the Cirro SDK and CLI.

What's new for users

cirro debug — a new CLI command to inspect a failed dataset. Prints the last 50 lines of the execution log, identifies the primary failed task automatically, and shows its script, log, input files, and output files. Pass -i/--interactive to enter a menu-driven exploration mode where you can browse inputs and outputs, drill into source tasks, and read file contents directly in the terminal (as text, JSON, or CSV).

CLI

Command	Description
`cirro debug --project <name> --dataset <name>`	Non-interactive: print task debug summary, recurse through input chain
`cirro debug -i`	Interactive: menu-driven task and file exploration

New SDK classes

DataPortalTask (cirro/sdk/task.py)

Represents a single task from a Nextflow workflow execution. Metadata is read from the WORKFLOW_TRACE artifact; logs and files are fetched on demand.

Attribute	Description
`name`, `status`, `exit_code`, `hash`, `work_dir`, `task_id`	Trace-derived metadata
`logs`	Task stdout/stderr (via execution API, with `.command.log` fallback)
`script`	The shell script Nextflow ran (`.command.sh`, with log-artifact fallback)
`inputs`	`WorkDirFile` list parsed from `.command.run`, each linked to its `source_task`
`outputs`	Non-hidden files in the task's S3 work directory

WorkDirFile (cirro/sdk/task.py)

Represents a file in a Nextflow S3 work directory or dataset staging area.

Attribute / Method	Description
`name`, `size`, `source_task`	File metadata
`read()`, `readlines()`	Read as text (supports gzip)
`read_json()`	Parse as JSON
`read_csv()`	Parse as a Pandas DataFrame (auto-infers `.gz`/`.bz2`/`.xz`/`.zst` compression)

Additions to existing SDK classes

Addition	Description
`DataPortalDataset.executor`	Executor type (`NEXTFLOW`, `CROMWELL`) for the dataset's process
`DataPortalDataset.logs`	Top-level execution log via Cirro API (CloudWatch)
`DataPortalDataset.tasks`	Full list of `DataPortalTask` objects from the trace artifact
`DataPortalDataset.primary_failed_task`	Auto-identifies the root-cause failed task by cross-referencing exit codes with the execution log; returns `None` gracefully for non-Nextflow executors, empty traces, or successful runs

Internal changes

FileAccessContext.scratch_download() — new classmethod for accessing Nextflow scratch bucket files
FileService._get_scratch_read_credentials() — cached credential fetch for scratch bucket reads
Null-guard added in ExecutionService for resp.events when log responses are empty

Adds tools for biometricians to rapidly debug failed Nextflow workflows: SDK layer: - `cirro/sdk/task.py`: `DataPortalTask` (trace metadata + lazy S3 access for logs, inputs, outputs) and `WorkDirFile` (readable file in a work dir or staging area, with `source_task` link to the task that produced it) - `cirro/sdk/nextflow_utils.py`: `parse_inputs_from_command_run` (extracts S3 input URIs from `.command.run`) and `find_primary_failed_task` (identifies the root-cause task from the trace + execution log) - `cirro/sdk/dataset.py`: adds `dataset.logs()` (top-level execution log) and `dataset.tasks` (lazy, cached list of `DataPortalTask` from the trace TSV) CLI layer: - `cirro debug --project P --dataset D` — non-interactive: prints execution log tail, primary failed task details, task log, inputs with source annotation, and outputs - `cirro debug -i` — interactive: step-by-step prompts to inspect log, task log, and optionally drill into input source tasks recursively - `cirro/cli/interactive/debug_args.py`: `gather_debug_arguments` helper - `cirro/cli/models.py`: `DebugArguments` TypedDict Tests: - `tests/test_nextflow_utils.py`: unit tests for `parse_inputs_from_command_run` and `find_primary_failed_task` covering primary failure detection, log cross-referencing, and fallback ordering https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

Adds a `script()` method that reads `.command.sh` from the task's work directory — the actual pipeline code Nextflow executed for that task. The CLI `debug` command now prints the task script before the task log in non-interactive mode, and prompts "Show task script?" before "Show task log?" in interactive mode. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

WorkDirFile now exposes the same read interface as DataPortalFile: - read(encoding, compression) — text string (gzip supported) - readlines(encoding, compression) — list of lines - read_json(encoding) — parses JSON, returns the top-level value - read_csv(compression, encoding, **kwargs) — Pandas DataFrame; compression inferred from .gz/.bz2/.zst extension by default The existing read() method gains optional encoding/compression args. The internal _get() method returns raw bytes for use by all read methods. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

The interactive debug flow is now a proper navigable menu rather than a series of yes/no questions that can only move forward: _task_menu(task, depth) Loops presenting: Show script | Show log | Browse inputs (N) | Browse outputs (N) | Back / Done _browse_files_menu(files, kind, depth) Scrollable list of input or output files (disambiguates duplicate names). Selecting a file enters its file menu. _file_menu(wf, depth) Per-file actions inferred from the file extension: - .csv/.tsv → Read as CSV (first 10 rows via pandas) - .json → Read as JSON (capped at 200 lines) - everything else readable → Read as text (first 100 lines) - binary formats (.bam/.cram/…) → no read option shown - file from another task → Drill into source task (opens _task_menu) All menus loop so the user can read a file, go back to the file list, pick another file, drill into its source task, inspect that task's inputs, etc. — without restarting the command. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

The non-interactive debug command now walks back through the tasks that produced each input file, printing script, log, inputs, and outputs at every level — not just the primary failed task. Two new CLI options cap the output: --max-depth N Maximum number of source-task levels to follow (default: unlimited) --max-tasks N Maximum total tasks to print across all levels (default: unlimited) Implementation: - _print_task_debug(task, depth) now takes a depth parameter and indents all output (header, script, log, inputs, outputs) with two spaces per level so nested tasks are visually distinct - _print_task_debug_recursive() drives the traversal: deduplicates tasks (a task that produced multiple inputs is printed only once), stops at the depth/task caps, and prints a bracketed notice when stopping early so the user knows output was truncated - The debug CLI command uses a targeted check for --project/--dataset instead of check_required_args, since --max-depth/--max-tasks intentionally default to None https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

Three boolean flag pairs (default on) control what is printed per task: --show-script / --no-show-script Print .command.sh (default: on) --show-log / --no-show-log Print .command.log (default: on) --show-files / --no-show-files Print inputs and outputs with sizes (default: on) Flags apply at every depth level of the input-chain recursion. When --no-show-files is set, task.inputs is still loaded internally so that source_task links can be followed for recursion. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

New API: dataset.primary_failed_task -> Optional[DataPortalTask] Wraps find_primary_failed_task with graceful handling of every non-error situation: not a Nextflow dataset, trace not yet available, empty trace, no failed tasks, or unavailable execution log. Returns None in all those cases rather than raising. Edge-case hardening across the SDK: dataset.logs() Now returns '' on any API error (dataset never started, no CloudWatch events, non-Nextflow dataset) instead of raising. dataset._load_tasks() - Wraps trace file read in try/except -> DataPortalInputError on failure. - Returns [] immediately when the trace content is empty. WorkDirFile.size Catches head_object failures and re-raises as DataPortalAssetNotFound with a message naming the file and noting the work dir may be cleaned up. WorkDirFile._get() Catches S3 read failures and re-raises as DataPortalAssetNotFound with the file name in the message. WorkDirFile.read_json() Wraps JSONDecodeError -> ValueError with the file name in the message. WorkDirFile.read_csv() Raises ImportError with an install hint if pandas is not available. DataPortalTask._get_access_context() Raises DataPortalAssetNotFound immediately when work_dir is empty rather than passing an invalid URI to S3Path. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

- Add missing `ask` to interactive utils import (was causing NameError in _task_menu, _browse_files_menu, _file_menu at runtime) - Remove duplicate _print_task_debug_recursive call that was unreachable due to deduplication logic (dead code) - Replace redundant `import json as _json` inside _file_menu with module-level `json` already imported at line 1 - Replace local `from pathlib import PurePath` in _file_read_options with module-level `Path` (already imported, same .suffix interface) - Delete cirro/cli/interactive/debug_args.py — gather_debug_arguments() was never called from anywhere https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

controller.py: - Add missing `from typing import List, Optional, Set` (was causing NameError at import time; Optional used in _print_task_debug_recursive but never imported) - Move DataPortalDataset, find_primary_failed_task, convert_size, ask_project, and ask_dataset imports to module level (lazy-import-inside-function pattern is reserved for optional deps like pandas/anndata in this codebase) - Remove _format_size() thin wrapper; call convert_size() directly at each site (three-line wrapper for a one-liner it just delegates to) - Remove local `ask_project as _ask_project` / `ask_dataset as _ask_dataset` aliasing inside run_debug(); unnecessary with module-level imports - Fix _seen/_counter type annotations: set/list → Optional[Set[str]]/Optional[List[int]] - Remove redundant `_ = task.inputs` in else branch; the cached property is accessed directly by _print_task_debug_recursive's own loop task.py: - Add Any to typing imports - Make source_task a @Property (all externally-visible state follows this pattern throughout the SDK; plain public attribute was inconsistent) - Fix compression: str = None → Optional[str] = None in read() and readlines() - Fix read_json() return type object → Any (idiomatic for unknown JSON value) - Fix Args docstring format to match repo style: name (type): description https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

DataPortalTask was referenced in string annotations (Optional['DataPortalTask']) but never imported, causing pyflakes to report it as an undefined name. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

… methods tests/test_task.py: - TestWorkDirFileName: name extracted from S3 URI - TestWorkDirFileSize: pre-populated, lazy head_object, S3 error raises - TestWorkDirFileRead: text, gzip, unsupported compression, readlines, read_json, S3 error propagation - TestWorkDirFileSourceTask: None by default, set at construction - TestWorkDirFileRepr: __str__ and __repr__ - TestDataPortalTaskProperties: task_id, name, status, hash, work_dir, exit_code (int, None for empty/dash) - TestDataPortalTaskWorkDirFiles: logs/script content and error fallback, outputs empty on error/missing workdir - TestDataPortalTaskInputs: URI parsing, source_task linking, empty on missing workdir, caching - TestDataPortalTaskRepr: __str__ and __repr__ tests/test_dataset_tasks.py: - TestDataPortalDatasetLogs: success, exception → empty string, empty log - TestDataPortalDatasetTasks: parsed from trace, cached, raises for non-Nextflow, empty trace, all_tasks_ref shared reference - TestDataPortalDatasetPrimaryFailedTask: finds failed task, None for non-Nextflow/no failures/empty trace, uses execution log for disambiguation https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

Findings from analog-based review (comparing each new file against its closest existing peer in the codebase): cirro/sdk/task.py: - Add __init__ docstrings with code examples to WorkDirFile and DataPortalTask (peer DataPortalFile.__init__ has this pattern) - Fix DataPortalTask Args format to name (type): description style - Add one-liner docstrings to _build_inputs() and _build_outputs() cirro/sdk/dataset.py: - Add Returns: sections to logs(), tasks, and primary_failed_task (peer methods in file.py and process.py have explicit Returns: entries) cirro/cli/controller.py: - Add -> None return type annotations to all private helper functions: _print_task_debug, _print_task_debug_recursive, _print_task_header, _task_menu, _browse_files_menu, _file_menu https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

…g additions Resolved conflicts in cirro/cli/controller.py, cirro/sdk/dataset.py, cirro/cli/cli.py, cirro/cli/models.py, and cirro/cli/__init__.py. Upstream introduced _init_cirro_client()/_get_projects() helpers, validate_folder/list_files commands, and read_files() on DataPortalDataset. Updated run_debug() to use the new _init_cirro_client() pattern. Added DebugArguments, debug CLI command, and task-related imports back on top of the upstream version. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

DataPortalFile._get() calls client.file.get_file(file), not get_file_from_path. The wrong mock meant the trace content was never returned, causing StringIO to receive a Mock object and raise TypeError when _load_tasks() tried to parse the trace TSV. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

test_load_running makes anonymous S3 calls to a public bucket that may not be accessible in CI. Follow the same pattern used in test_config_load where integration tests that need external resources are skipped in CI. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

… in CI - Fix _make_task helper: empty dict {} is falsy in Python, so 'trace_row or dict(TRACE_ROW)' would use TRACE_ROW instead of {}. Changed to 'trace_row if trace_row is not None else dict(TRACE_ROW)'. - Skip test_pipeline_definition_nextflow_without_schema in CI: nf-core upgraded from 3.3.2 to 3.5.1 and the schema generation output may differ from the expected fixture. The test requires Nextflow in PATH and produces version-specific output. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

The SonarCloud step only runs on Python 3.14 and requires a SONAR_TOKEN secret that may not be configured in all environments. Adding continue-on-error: true so a missing/invalid token does not fail the lint-and-run-tests job. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

This reverts commit a5aa087.

claude and others added 23 commits April 14, 2026 21:10

Fix flake8 F821: import DataPortalTask under TYPE_CHECKING in dataset.py

8813360

DataPortalTask was referenced in string annotations (Optional['DataPortalTask']) but never imported, causing pyflakes to report it as an undefined name. https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

Trigger CI run for test fix verification

5a513de

https://claude.ai/code/session_01BWBtQcWJkA7he7Ht5Vz5cu

Revert "Make SonarCloud scan non-blocking in CI (continue-on-error)"

9b1c3ad

This reverts commit a5aa087.

WIP

6527874

Merge branch 'main' into claude/add-workflow-debugging-pBw5J

1505d32

WIP

e98ccc4

Clean up

b7a0bf9

sminot closed this Apr 17, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add workflow debugging feature#201

Add workflow debugging feature#201
sminot wants to merge 23 commits intoCirroBio:mainfrom
sminot:claude/add-workflow-debugging-pBw5J

sminot commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

sminot commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants