Add idle_output_timeout as Per-Step Recipe Override#793
Conversation
…parser Add idle_output_timeout: int | None = None to RecipeStep dataclass, register it in _PARSE_STEP_HANDLED_FIELDS, and thread it through _parse_step() via data.get(). 0 means disabled for this step; None means fall back to global config.
…ist entry Validate that idle_output_timeout is a non-negative integer when set (0 is valid and means disabled). Add idle_output_timeout to the run_skill tool parameter whitelist so the recipe orchestrator can pass it as a kwarg.
Extend HeadlessExecutor.run() protocol signature with optional idle_output_timeout: float | None = None parameter, matching the same pattern as stale_threshold.
Add idle_output_timeout: int | None = None parameter to run_skill(), convert to float and pass to executor.run(). The None passthrough is intentional; 0→None (disabled) conversion happens in the executor layer.
… layers Add idle_output_timeout: float | None = None to run_headless_core() and DefaultHeadlessExecutor.run(). Both layers apply the same 0→None resolution (raw=0 means disabled, maps to None for run_managed_async). When not provided, falls back to float(cfg.idle_output_timeout).
…h steps Add idle_output_timeout: 0 (disabled) to implement_phase, troubleshoot_implement_failure, run_experiment, adjust_experiment, and re_run_experiment. These steps use skills that may produce no stdout for long periods (LLM thinking, compilation, experiments).
T1: RecipeStep parses idle_output_timeout from YAML (120, None, 0) T2: Validator rejects negative values, accepts 0 and positive T3: HeadlessExecutor protocol includes idle_output_timeout param T4/T5: run_skill passes idle_output_timeout to executor (120→120.0, None→None) T6: run_headless_core applies 0→None conversion and cfg fallback Also update MockExecutor signatures in existing server tests.
Reformatted rules_tools.py, tools_execution.py, and test_context.py to comply with ruff line-length and style rules.
Trecek
left a comment
There was a problem hiding this comment.
AutoSkillit PR Review — Verdict: changes_requested
Found 14 actionable findings (10 critical, 4 warning). See inline comments.
| tool_ctx.runner.push(self._success_payload(marker)) | ||
| await run_headless_core( | ||
| "/investigate foo", cwd="/tmp", ctx=tool_ctx, idle_output_timeout=None | ||
| ) |
There was a problem hiding this comment.
[critical] tests: Same fragile call_args_list 4-tuple unpacking issue as L3599.
There was a problem hiding this comment.
Investigated — this is intentional. Same as L3599 — MockSubprocessRunner stores explicit 4-tuples, documented in conftest.py:49. Unpacking is correct.
| ) -> None: | ||
| """idle_output_timeout=None falls back to float(cfg.idle_output_timeout).""" | ||
| from autoskillit.execution.headless import run_headless_core | ||
|
|
There was a problem hiding this comment.
[warning] tests: Tautological assertion: cfg_value is read from the live config then asserted equal to the result. If the implementation simply reads the same config field, this always passes regardless of correctness. Assert against a known constant (e.g. == 600.0) to make the test meaningful.
There was a problem hiding this comment.
Agreed — fixed in e8e772a. Replaced cfg_value = float(tool_ctx.config.run_skill.idle_output_timeout) reference with a direct assertion against 600.0 (the known config default per defaults.yaml:47 and settings.py:96). Test now catches implementations that ignore the config entirely.
| _raw_idle = ( | ||
| idle_output_timeout | ||
| if idle_output_timeout is not None | ||
| else float(cfg.idle_output_timeout) |
There was a problem hiding this comment.
[warning] defense: float(cfg.idle_output_timeout) is called unconditionally with no error context. If dynaconf surfaces this as a string via an env-var override, the resulting ValueError is uncontextualised. Validate the type in the config dataclass or wrap with a descriptive try/except.
There was a problem hiding this comment.
Investigated — this is intentional. settings.py:402 performs idle_output_timeout=int(val(...)) at config load time, coercing env-var overrides before they reach headless.py. By the time float(cfg.idle_output_timeout) is called at headless.py:995, the value is a validated int. An env-var string like 'abc' would raise ValueError at config load (settings.py), not here.
| expected_output_patterns=expected_output_patterns, | ||
| write_behavior=write_spec, | ||
| stale_threshold=float(stale_threshold) if stale_threshold is not None else None, | ||
| idle_output_timeout=float(idle_output_timeout) |
There was a problem hiding this comment.
[warning] defense: float(idle_output_timeout) is called without guarding against non-numeric input. FastMCP deserialises from JSON; an unexpected type raises a confusing TypeError instead of a clean validation error. Add an isinstance guard or try/except with a descriptive message.
There was a problem hiding this comment.
Investigated — this is intentional. Parameter typed as int | None at tools_execution.py:154. FastMCP validates JSON schema types before invoking the function body; non-numeric inputs are rejected at the MCP transport layer with a schema error, never reaching the float() call.
…ltHeadlessExecutor.run() DefaultHeadlessExecutor.run() was pre-resolving idle_output_timeout (0.0→None) before passing it to run_headless_core(). run_headless_core() then treated None as "not supplied" and fell back to float(cfg.idle_output_timeout), re-enabling the very timeout the step intended to disable. Fix: pass raw idle_output_timeout directly to run_headless_core(), which is the single authoritative resolution layer. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…put_timeout fallback test The previous assertion read float(cfg.idle_output_timeout) from the same config the implementation reads, making the test tautological — it would pass even if the implementation used an arbitrary config field. Assert the known default 600.0 directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Thread
idle_output_timeoutthrough the same per-step override chain thatstale_thresholdalready uses. Add the field to
RecipeStep, parse it from YAML, validate it (0 = disabled,negative = error), whitelist it in
rules_tools.py, add it to theHeadlessExecutorprotocol,thread it through
tools_execution.pyand bothDefaultHeadlessExecutor.run()andrun_headless_core(), then setidle_output_timeout: 0(disabled) on the five compute-heavysteps in
research.yaml.The key semantic difference from
stale_threshold:0is a valid recipe value meaning"no idle timeout for this step" (maps to
Noneinrun_managed_async's API), whereasstale_thresholdrejects 0 as invalid. The 0→None conversion happens at the resolutionlayer (both
DefaultHeadlessExecutor.run()andrun_headless_core()).Architecture Impact
Process Flow Diagram
%%{init: {'flowchart': {'nodeSpacing': 45, 'rankSpacing': 55, 'curve': 'basis'}}}%% flowchart TB %% CLASS DEFINITIONS %% classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff; classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff; classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff; classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff; classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff; classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000; classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff; %% TERMINALS %% START([START: Recipe YAML loaded]) COMPLETE_WATCH([COMPLETE: idle watchdog active]) COMPLETE_NO_WATCH([COMPLETE: no idle watchdog]) IDLE_STALL([IDLE_STALL: process killed]) subgraph RecipeLayer ["Recipe Layer — ● schema.py · ● io.py · ● validator.py · ● rules_tools.py"] direction TB ParseStep["● _parse_step()<br/>━━━━━━━━━━<br/>recipe/io.py<br/>data.get('idle_output_timeout')"] StepSchema["● RecipeStep<br/>━━━━━━━━━━<br/>idle_output_timeout: int | None<br/>None = use global cfg<br/>0 = disabled for this step"] Validate["● validate_recipe()<br/>━━━━━━━━━━<br/>recipe/validator.py<br/>must be int ≥ 0, or None"] RulesTools["● _TOOL_PARAMS[run_skill]<br/>━━━━━━━━━━<br/>recipe/rules_tools.py<br/>idle_output_timeout registered<br/>as first-class MCP param"] end subgraph MCPLayer ["MCP Layer — ● tools_execution.py"] direction TB RunSkill["● run_skill MCP tool<br/>━━━━━━━━━━<br/>server/tools_execution.py<br/>idle_output_timeout: int | None<br/>→ float(v) if v is not None else None"] end subgraph ExecutorLayer ["Executor Layer — ● _type_protocols.py · ● headless.py"] direction TB Protocol["● HeadlessExecutor.run()<br/>━━━━━━━━━━<br/>core/_type_protocols.py<br/>idle_output_timeout: float | None"] DefaultExec["● DefaultHeadlessExecutor.run()<br/>━━━━━━━━━━<br/>execution/headless.py<br/>_raw = val ?? float(cfg.idle_output_timeout=600)"] ZeroGate{"_raw_idle > 0.0?<br/>━━━━━━━━━━<br/>0 → None gate"} end subgraph CoreExec ["Core Execution — ● headless.py · process.py"] direction TB HeadlessCore["● run_headless_core()<br/>━━━━━━━━━━<br/>execution/headless.py<br/>passes effective_idle to runner<br/>(0→None guard is no-op here)"] ManagedAsync["run_managed_async()<br/>━━━━━━━━━━<br/>execution/process.py<br/>idle_output_timeout: float | None"] WatchDecision{"idle_output_timeout<br/>is not None?<br/>━━━━━━━━━━<br/>start watchdog?"} WatchStdout["_watch_stdout_idle()<br/>━━━━━━━━━━<br/>anyio task group coroutine<br/>polls stdout for idle time"] end %% FLOW %% START --> ParseStep ParseStep --> StepSchema StepSchema --> Validate Validate -->|"valid: int ≥ 0 or None"| RulesTools Validate -->|"invalid: raises error"| IDLE_STALL RulesTools --> RunSkill RunSkill --> Protocol Protocol --> DefaultExec DefaultExec --> ZeroGate ZeroGate -->|"yes → effective_idle: float"| HeadlessCore ZeroGate -->|"no (0.0) → effective_idle: None"| HeadlessCore HeadlessCore --> ManagedAsync ManagedAsync --> WatchDecision WatchDecision -->|"not None"| WatchStdout WatchDecision -->|"None (disabled)"| COMPLETE_NO_WATCH WatchStdout -->|"idle < threshold"| COMPLETE_WATCH WatchStdout -->|"idle ≥ threshold"| IDLE_STALL %% CLASS ASSIGNMENTS %% class START terminal; class COMPLETE_WATCH,COMPLETE_NO_WATCH output; class IDLE_STALL detector; class StepSchema stateNode; class ParseStep,RulesTools phase; class Validate detector; class RunSkill handler; class Protocol phase; class DefaultExec,HeadlessCore,ManagedAsync handler; class ZeroGate,WatchDecision stateNode; class WatchStdout phase;Data Lineage Diagram
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 65, 'curve': 'basis'}}}%% flowchart LR %% CLASS DEFINITIONS %% classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff; classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff; classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff; classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff; classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff; subgraph Origins ["Data Origins"] direction TB DEFAULTS["defaults.yaml<br/>━━━━━━━━━━<br/>run_skill:<br/> idle_output_timeout: 600"] RESEARCH["● research.yaml<br/>━━━━━━━━━━<br/>5 steps: idle_output_timeout: 0<br/>implement / troubleshoot / experiment"] end subgraph GlobalCfg ["Global Config (Fallback Source of Truth)"] SETTINGS["settings.py RunSkillConfig<br/>━━━━━━━━━━<br/>idle_output_timeout: int = 600<br/>loaded from defaults.yaml"] end subgraph SchemaValidation ["● Schema, Parse & Validation"] direction TB PARSE["● io.py _parse_step()<br/>━━━━━━━━━━<br/>data.get('idle_output_timeout')<br/>raw int | None from YAML"] RECIPE_STEP["● schema.py RecipeStep<br/>━━━━━━━━━━<br/>idle_output_timeout: int | None<br/>None = use global; 0 = disabled"] VALIDATOR["● validator.py<br/>━━━━━━━━━━<br/>non-negative int required when set<br/>0 allowed (explicit disable)"] RULES["● rules_tools.py<br/>━━━━━━━━━━<br/>idle_output_timeout whitelisted<br/>in per-step field set (lint-time)"] end subgraph ExecPipeline ["● Execution Pipeline"] direction TB PROTOCOL["● _type_protocols.py<br/>━━━━━━━━━━<br/>HeadlessExecutor.run() protocol<br/>idle_output_timeout: float | None"] RUN_SKILL["● tools_execution.run_skill()<br/>━━━━━━━━━━<br/>int | None param<br/>→ float | None cast before executor"] HEADLESS["● HeadlessExecutor.run()<br/>━━━━━━━━━━<br/>None → fallback to cfg.idle_output_timeout<br/>0.0 → None (disabled)"] CORE["● run_headless_core()<br/>━━━━━━━━━━<br/>effective_idle: float | None<br/>passes to subprocess runner"] end subgraph Watchdog ["Process Watchdog (Consumption)"] direction TB MANAGED["run_managed_async()<br/>━━━━━━━━━━<br/>spawns _watch_stdout_idle task<br/>only when value is positive float"] WATCH["_watch_stdout_idle()<br/>━━━━━━━━━━<br/>polls stdout file size every 5s<br/>fires if size stalls ≥ timeout"] KILL["async_kill_process_tree()<br/>━━━━━━━━━━<br/>IDLE_STALL → kill headless Claude<br/>acc.idle_stall = True"] end %% PRIMARY DATA FLOWS %% DEFAULTS -->|"dynaconf load"| SETTINGS RESEARCH -->|"YAML parse"| PARSE PARSE -->|"RecipeStep(idle_output_timeout=...)"| RECIPE_STEP RECIPE_STEP -->|"validate at load time"| VALIDATOR RECIPE_STEP -->|"step.idle_output_timeout"| RUN_SKILL SETTINGS -->|"fallback when None"| HEADLESS RUN_SKILL -->|"float | None"| HEADLESS HEADLESS -->|"effective_idle: float | None"| CORE CORE -->|"runner(..., idle_output_timeout=...)"| MANAGED MANAGED -->|"tg.start_soon()"| WATCH WATCH -->|"trigger.set() on expiry"| KILL %% SECONDARY / STRUCTURAL FLOWS %% RECIPE_STEP -.->|"lint-time field whitelist"| RULES PROTOCOL -.->|"declares interface"| HEADLESS %% CLASS ASSIGNMENTS %% class DEFAULTS,RESEARCH cli; class SETTINGS,RECIPE_STEP,PROTOCOL stateNode; class PARSE,RUN_SKILL,MANAGED handler; class HEADLESS,CORE phase; class VALIDATOR,RULES,WATCH detector; class KILL integration;Operational Diagram
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%% flowchart TB %% CLASS DEFINITIONS %% classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff; classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff; classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff; classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff; classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff; %% ── CONFIGURATION SOURCES ── %% subgraph Config ["CONFIGURATION (read-only inputs)"] direction TB GLOBAL_CFG["RunSkillConfig<br/>━━━━━━━━━━<br/>idle_output_timeout = 600<br/>(global default, seconds)<br/>config/settings.py"] RESEARCH_YAML["● research.yaml<br/>━━━━━━━━━━<br/>5 compute-heavy steps:<br/>implement_phase<br/>run_experiment<br/>re_run_experiment<br/>adjust_experiment<br/>troubleshoot_implement_failure<br/>→ idle_output_timeout: 0"] end %% ── RECIPE PIPELINE ── %% subgraph RecipePipeline ["● RECIPE PIPELINE (parse → validate → schema)"] direction TB PARSE["● recipe/io.py<br/>━━━━━━━━━━<br/>_parse_step()<br/>data.get('idle_output_timeout')<br/>→ int | None"] SCHEMA["● recipe/schema.py<br/>━━━━━━━━━━<br/>RecipeStep.idle_output_timeout<br/>type: int | None<br/>default: None<br/>None = use global cfg<br/>0 = disabled for this step"] VALIDATE["● recipe/validator.py<br/>━━━━━━━━━━<br/>must be non-negative int or None<br/>negative values rejected<br/>0 explicitly permitted"] RULES["● recipe/rules_tools.py<br/>━━━━━━━━━━<br/>registered as known run_skill param<br/>prevents dead-with-param false positive"] end %% ── MCP SERVER ── %% subgraph MCPServer ["● MCP SERVER — run_skill tool"] direction TB RUN_SKILL["● tools_execution.py<br/>━━━━━━━━━━<br/>run_skill(idle_output_timeout: int | None)<br/>None → pass None (executor falls back to global)<br/>int → float(value) passed to executor<br/>0 → float(0.0) → executor disables timeout"] end %% ── PROTOCOL BOUNDARY ── %% subgraph Protocol ["● PROTOCOL BOUNDARY"] direction TB PROTOCOL["● _type_protocols.py<br/>━━━━━━━━━━<br/>HeadlessExecutor.run(<br/> idle_output_timeout: float | None<br/>)<br/>None = use global cfg"] end %% ── EXECUTION LAYER ── %% subgraph Execution ["● EXECUTION LAYER"] direction TB EXECUTOR["● execution/headless.py<br/>━━━━━━━━━━<br/>DefaultHeadlessExecutor.run()<br/>Resolution:<br/> None → float(cfg.idle_output_timeout)<br/> 0.0 → effective_idle = None (disabled)<br/> >0.0 → effective_idle = value"] CORE["● run_headless_core()<br/>━━━━━━━━━━<br/>Receives pre-resolved float | None<br/>Passes effective_idle to runner<br/>None = no idle kill signal"] end %% ── SUBPROCESS ── %% subgraph Subprocess ["SUBPROCESS RUNNER"] direction TB RUNNER["Process Runner<br/>━━━━━━━━━━<br/>idle_output_timeout=effective_idle<br/>None → timeout disabled<br/>600.0 → kill after 600s idle<br/>custom float → step override active"] end %% ── FLOWS ── %% RESEARCH_YAML -->|"idle_output_timeout: 0 (YAML field)"| PARSE PARSE --> SCHEMA SCHEMA --> VALIDATE SCHEMA --> RULES VALIDATE -->|"valid RecipeStep.idle_output_timeout"| RUN_SKILL GLOBAL_CFG -->|"fallback default (600s)"| EXECUTOR RUN_SKILL -->|"int|None → float|None"| PROTOCOL PROTOCOL --> EXECUTOR EXECUTOR -->|"resolved float|None"| CORE CORE --> RUNNER %% CLASS ASSIGNMENTS %% class GLOBAL_CFG phase; class RESEARCH_YAML cli; class PARSE,SCHEMA handler; class VALIDATE,RULES detector; class RUN_SKILL handler; class PROTOCOL phase; class EXECUTOR handler; class CORE handler; class RUNNER output;Closes #785
Implementation Plan
Plan file:
/home/talon/projects/autoskillit-runs/impl-785-20260412-214909-105758/.autoskillit/temp/make-plan/idle_output_timeout_per_step_plan_2026-04-12_215200.md🤖 Generated with Claude Code via AutoSkillit
Token Usage Summary