fix: keep failed tasks resumable + clearer timeout error + quieter Feishu logs #29

Merged
hetaoBackend merged 1 commit into main from fix/failed-task-resume-and-feishu-log-noise
Jun 5, 2026

@hetaoBackend

Owner

Summary

Four independent backend fixes, all found while tracing a Feishu bot reply that said "❌ Task #14 not found or has no saved session."

| # | Fix | Symptom removed |
|---|-----|-----------------|
| 1 | Persist session_id on failed runs (taskboard.py) | Failed task couldn't be resumed → "no saved session" when replying in a thread |
| 2 | On timeout task.error = "Task timed out after Ns" (taskboard.py) | Error showed an unrelated stderr line ("Reading additional input from stdin…") instead of the real timeout |
| 3 | No-op processors for message_read / recalled receipts (feishu_channel.py) | ERROR … processor not found, type: im.message.message_read_v1 on every read receipt |
| 4 | Lark logger propagate=False + log_level DEBUG→INFO (feishu_channel.py) | Every lark log line printed twice |

Root causes

  1. session_id was written only inside the if success: branch; the failure branch dropped it. Codex emits thread_id in the opening thread.started event, so even a started-then-failed run has a recoverable id.
  2. On timeout, output was set to "Task timed out after Ns", but the failure branch then overwrote the summary via _extract_error_summary(raw_stderr, …), which picked a noisy codex stderr line. Also hoisted the timed_out initialization above the try block so the FileNotFoundError path no longer raises UnboundLocalError.
  3. The dispatcher registered handlers for receive / bot_added / reaction only; the app is also subscribed to message_read, so the SDK logged "processor not found" per receipt.
  4. The SDK's "Lark" logger has its own stdout handler and propagates to the root logger (configured by logging.basicConfig in taskboard.py) → two emissions per line.
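
Root causes 1 and 2 can be illustrated with a minimal sketch. This is not the code in taskboard.py: the `Task` dataclass, `run_task`, and its parameters are hypothetical, and the session id is passed in directly rather than parsed from the agent's event stream. Assuming a subprocess guarded by a `threading.Timer`, it shows the two points described above: the timed-out flag is created before the `try` block, and the session id is stored on both the success and the failure path.

```python
import subprocess
import sys
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    # Hypothetical stand-in for the real task record.
    session_id: Optional[str] = None
    error: Optional[str] = None
    status: str = "pending"


def run_task(task: Task, cmd: list, session_id: Optional[str], timeout_s: float) -> None:
    """Run cmd and record the outcome on task (illustrative only)."""
    # Created before the try block, so the failure branch can read it even
    # when Popen itself raises before the timer is armed.
    timed_out = threading.Event()
    stderr = ""
    success = False
    try:
        proc = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
        )

        def _kill() -> None:
            timed_out.set()
            proc.kill()

        timer = threading.Timer(timeout_s, _kill)
        timer.start()
        try:
            _, stderr = proc.communicate()
        finally:
            timer.cancel()
        success = proc.returncode == 0 and not timed_out.is_set()
    except FileNotFoundError as exc:
        # CLI binary not found: there is no process and no timer.
        stderr = str(exc)

    # Persist the session id regardless of outcome, so a failed run
    # can still be resumed later.
    if session_id:
        task.session_id = session_id

    if success:
        task.status = "done"
        return

    task.status = "failed"
    if timed_out.is_set():
        # State the timeout itself instead of whatever stderr happened to contain.
        task.error = f"Task timed out after {timeout_s:g}s"
    else:
        lines = stderr.strip().splitlines()
        task.error = lines[-1] if lines else "unknown error"


if __name__ == "__main__":
    demo = Task()
    run_task(demo, [sys.executable, "-c", "import time; time.sleep(30)"], "sess-demo", 0.5)
    print(demo)
```

Using a `threading.Event` rather than a bare boolean is a choice made for this sketch; the PR only states that the `timed_out` initialization was hoisted.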

Test plan

+5 tests via red→green TDD:

  • test_execute_task_codex_failure_still_persists_session_id
  • test_execute_task_claude_failure_still_persists_session_id
  • test_execute_task_timeout_error_summary_states_timeout_not_stderr
  • test_start_registers_readonly_event_noops
  • test_start_disables_lark_logger_propagation

make check: 824 passed, coverage 92.79% (gate 90%), ruff lint + format clean.
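
For illustration, the duplicate-line behaviour behind fix 4 can be reproduced with the standard library alone. The sketch below is not the actual test_start_disables_lark_logger_propagation; `count_emissions` is a hypothetical helper, and the two `StreamHandler` instances only stand in for the SDK's own handler and the root handler installed by logging.basicConfig.

```python
import io
import logging


def count_emissions(propagate: bool) -> int:
    """Return how many times a single log call is written to the shared sink."""
    sink = io.StringIO()

    root = logging.getLogger()
    root_handler = logging.StreamHandler(sink)  # stand-in for the basicConfig handler
    root.addHandler(root_handler)

    sdk_logger = logging.getLogger("Lark")  # logger name taken from the PR text
    sdk_handler = logging.StreamHandler(sink)  # stand-in for the SDK's own handler
    sdk_logger.addHandler(sdk_handler)

    old_level, old_propagate = sdk_logger.level, sdk_logger.propagate
    sdk_logger.setLevel(logging.INFO)
    sdk_logger.propagate = propagate
    try:
        sdk_logger.info("connected")
        return sink.getvalue().count("connected")
    finally:
        # Restore global logging state so the helper has no side effects.
        sdk_logger.removeHandler(sdk_handler)
        root.removeHandler(root_handler)
        sdk_logger.setLevel(old_level)
        sdk_logger.propagate = old_propagate


if __name__ == "__main__":
    print(count_emissions(propagate=True))
    print(count_emissions(propagate=False))
```

With propagation on, the record reaches both the logger's own handler and the root handler; with `propagate = False`, only the logger's own handler sees it.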

🤖 Generated with Claude Code

…hu logs

Four independent backend fixes surfaced while debugging a Feishu reply that
said "Task #14 not found or has no saved session".

- taskboard.py: persist the agent session_id (claude session / codex thread_id)
  even when a run FAILS. It was only saved on success, so a failed task could
  never be resumed (replying in a Feishu/Slack/Telegram thread hit "no saved
  session"). Codex emits thread_id in the opening thread.started event, so even
  a started-then-failed run is recoverable.

- taskboard.py: on timeout, use "Task timed out after Ns" as the task.error
  summary instead of letting _extract_error_summary surface an unrelated stderr
  line (e.g. codex's "Reading additional input from stdin..."). Also hoist the
  timed_out init before the try block so the failure branch can read it when
  Popen itself raises (CLI not found) before the timer is armed.

- channels/feishu_channel.py: register no-op processors for the message_read
  and recalled receipts we subscribe to but don't act on, silencing the SDK's
  "processor not found, type: im.message.message_read_v1" ERROR per receipt.

- channels/feishu_channel.py: set the "Lark" logger propagate=False so lines
  aren't emitted twice (its own handler + root basicConfig handler), and lower
  log_level DEBUG -> INFO to further cut noise.

Tests: +5 via red-green TDD; full suite 824 passed, coverage 92.79%.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---------|------------|---------|---------------|
| agentforge-landing | Ready | Preview, Comment | Jun 5, 2026 7:37am |

@hetaoBackend hetaoBackend merged commit 685a198 into main Jun 5, 2026
4 checks passed