fix: surface agent timeout as state['error'] in CliAgentEnv by rasdani · Pull Request #1170 · PrimeIntellect-ai/verifiers

rasdani · 2026-04-17T19:49:26Z

Problem

CliAgentEnv.wait_for_completion catches asyncio.TimeoutError but only sets state["agent_timed_out"] = True, leaving state["error"] unset. When the agent times out without producing any trajectory, the rollout comes back with trajectory=[], error=None, and the orchestrator scheduler reschedules it as a generic "Empty trajectory" with no diagnostic — indistinguishable from a genuinely silent empty rollout.

Fix

In the asyncio.TimeoutError branch, also set state["error"] = AgentError(f"Agent timed out after {self.timeout_seconds}s"), mirroring the style of the except Exception branch directly below. agent_timed_out=True is preserved for existing downstream checks elsewhere in the file.
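A minimal, self-contained sketch of the shape of this change follows. It is not the actual CliAgentEnv code: the `AgentError` class here is a stand-in for the library's, `poll_job` and `never_finishes` are hypothetical names, and the message in the `except Exception` branch is an assumption made for illustration only.

```python
import asyncio


class AgentError(Exception):
    """Stand-in for the verifiers AgentError (assumption: the real class is imported from the library)."""


async def wait_for_completion(state: dict, poll_job, timeout_seconds: float) -> None:
    """Simplified sketch: wait for a background job and record failures in state."""
    try:
        await asyncio.wait_for(poll_job(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # Existing flag, kept for downstream checks.
        state["agent_timed_out"] = True
        # New: surface the timeout as a real error instead of leaving it unset.
        state["error"] = AgentError(f"Agent timed out after {timeout_seconds}s")
    except Exception as e:
        # Hypothetical message; only the "wrap in AgentError" style mirrors the PR text.
        state["error"] = AgentError(f"Agent polling failed: {e}")


async def never_finishes() -> None:
    """Hypothetical polling loop that outlives any reasonable timeout."""
    await asyncio.sleep(3600)


def run_demo() -> dict:
    state: dict = {"trajectory": []}
    asyncio.run(wait_for_completion(state, never_finishes, timeout_seconds=0.01))
    return state
```

With this shape, a timed-out rollout carries both `agent_timed_out=True` and a populated `state["error"]`, so consumers that only inspect `error` still see the failure.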

Context

Third in a series closing silent-failure holes in CliAgentEnv:

1. #1127, "Fix CliAgentEnv stuck on dead tunnel": non-zero exit + empty trajectory
2. #1130, "Fix AgentError double-wrapping in poll_job_completion": AgentError double-wrapping
3. this PR: timeout path

A companion fix in prime-rl reorders the scheduler's if empty / elif error so that the error branch is checked first (otherwise this surfaced error would still be hidden behind the empty-trajectory branch). See companion PR in prime-rl.
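To illustrate why the branch order matters, here is a hedged sketch of the two orderings. The function names, the dict-shaped rollout, and the returned strings are all hypothetical; the real prime-rl scheduler code is not shown in this PR.

```python
from typing import Optional


def reason_empty_first(rollout: dict) -> Optional[str]:
    """Hypothetical old ordering: the empty-trajectory check shadows the error."""
    if not rollout["trajectory"]:
        return "Empty trajectory"
    elif rollout["error"] is not None:
        return f"Error: {rollout['error']}"
    return None


def reason_error_first(rollout: dict) -> Optional[str]:
    """Hypothetical new ordering: a surfaced error wins over the empty check."""
    if rollout["error"] is not None:
        return f"Error: {rollout['error']}"
    elif not rollout["trajectory"]:
        return "Empty trajectory"
    return None


# A rollout that timed out before producing any trajectory.
timed_out = {"trajectory": [], "error": "Agent timed out after 600s"}
# A rollout that is empty with no recorded error.
silent_empty = {"trajectory": [], "error": None}
```

Under the empty-first ordering both rollouts map to "Empty trajectory"; under the error-first ordering only the silent one does, and the timed-out one reports its error.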

Tests

No regression test was added. The existing tests/test_cli_agent_env.py covers timeout_reached (the stop-condition) but does not exercise wait_for_completion's asyncio.TimeoutError path. Adding such a test would require mocking asyncio.wait_for and the background-job polling loop, which felt out of scope for a 3-line fix — flagged here for a follow-up if desired.
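For the follow-up, one possible shape of such a test is sketched below. It runs against a simplified stand-in for wait_for_completion rather than the real CliAgentEnv method, and `AgentError`, `poll_job`, and the test name are assumptions. It patches `asyncio.wait_for` with `unittest.mock.AsyncMock` so the timeout fires immediately without a real polling loop.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock, patch


class AgentError(Exception):
    """Stand-in for the verifiers AgentError (assumption)."""


async def wait_for_completion(state: dict, poll_job, timeout_seconds: float) -> None:
    """Simplified stand-in for the method under test, not the real implementation."""
    try:
        await asyncio.wait_for(poll_job(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        state["agent_timed_out"] = True
        state["error"] = AgentError(f"Agent timed out after {timeout_seconds}s")


def test_timeout_sets_error() -> None:
    state: dict = {"trajectory": []}
    # The polling loop is replaced by a mock; its return value is never awaited
    # because the patched wait_for raises before touching it.
    poll_job = MagicMock(return_value=object())
    forced_timeout = AsyncMock(side_effect=asyncio.TimeoutError)
    with patch("asyncio.wait_for", new=forced_timeout):
        asyncio.run(wait_for_completion(state, poll_job, timeout_seconds=600))

    assert state["agent_timed_out"] is True
    assert isinstance(state["error"], AgentError)
    assert "timed out after 600s" in str(state["error"])
    poll_job.assert_called_once()
```

A real regression test would construct a CliAgentEnv and mock its background-job polling instead of the stand-in function used here.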

Checks

uv run ruff check verifiers/envs/experimental/cli_agent_env.py — passed
uv run ruff format --check verifiers/envs/experimental/cli_agent_env.py — already formatted

Note

Low Risk
Small, localized change to timeout handling that only affects how errors are surfaced in rollout state; minimal behavioral impact beyond improved diagnostics and potential downstream branching on state["error"].

Overview
CliAgentEnv.wait_for_completion now records timeouts as a real failure by setting state["error"] to an AgentError when asyncio.wait_for hits TimeoutError, in addition to the existing state["agent_timed_out"] = True flag.

This ensures timeout rollouts surface a diagnostic error instead of looking like a silent empty trajectory to downstream schedulers/consumers.

Reviewed by Cursor Bugbot for commit e88fd94. Bugbot is set up for automated code reviews on this repo.

When the agent background job exceeds timeout_seconds, wait_for_completion previously set only agent_timed_out=True and left state["error"] unset. Rollouts with no trajectory then returned error=None, which the orchestrator scheduler logs as a generic "Empty trajectory" with no root cause.

Mirror the behavior of the Exception branch and set state["error"] to an AgentError naming the timeout so downstream consumers can distinguish a timed-out rollout from a silently empty one. agent_timed_out=True is kept for existing downstream checks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rasdani requested review from kcoopermiller and mikasenghaas April 17, 2026 19:58

rasdani merged commit b3a7255 into main Apr 18, 2026
6 checks passed

rasdani mentioned this pull request Apr 18, 2026

chore: v0.1.13.dev1 dev release #1175

Merged

rasdani merged 1 commit into main from fix/cli-agent-timeout-surfaces-error
