Feat/dashboard server #7

Merged
EdmondDantes merged 16 commits into main from feat/dashboard-server on Jun 27, 2026
@EdmondDantes

Contributor

No description provided.

EdmondDantes and others added 16 commits June 27, 2026 04:46
Add GET /api/projects/{key}/runs/{id}/stream — replays the trace journal
past Last-Event-ID/?since, then DB-tails for new records, emitting one
`trace` SSE event per seq (interim poll; an in-process trace bus replaces
it next). Bump true-async/ide-helper to ^0.7.3 for the HttpResponse SSE
stubs (sseStart/sseEvent/sseComment/sseRetry).
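
The resume behaviour described above can be sketched in plain PHP. This is a minimal illustration, not the server's code: `resumeAfterSeq`, `sseFrame` and `replay` are hypothetical helpers, header names are assumed to be lowercased, and the precedence of Last-Event-ID over `?since` is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Resolve the seq to resume after. Hypothetical helper: prefers the
 * Last-Event-ID header, falls back to ?since, else 0 (replay everything).
 */
function resumeAfterSeq(array $headers, array $query): int
{
    $raw = $headers['last-event-id'] ?? $query['since'] ?? '0';

    return ctype_digit((string) $raw) ? (int) $raw : 0;
}

/** Format one SSE frame; the id line is what a browser echoes back as Last-Event-ID. */
function sseFrame(string $event, int $seq, array $row): string
{
    $json = json_encode($row, JSON_THROW_ON_ERROR);

    return "id: {$seq}\nevent: {$event}\ndata: {$json}\n\n";
}

/** Replay journal rows (seq => row) past the cursor, one `trace` frame per seq. */
function replay(array $journal, int $afterSeq): string
{
    $out = '';
    foreach ($journal as $seq => $row) {
        if ($seq > $afterSeq) {
            $out .= sseFrame('trace', $seq, $row);
        }
    }

    return $out;
}
```

Because every frame carries its seq as the SSE id, a reconnecting client resumes from the last row it saw without any client-side bookkeeping.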

docs/dashboard-server-plan.md captures the full endpoint/DB/engine design.

Move the run pipeline (env wiring, solver generate/reuse, run loop, supervisor
repair-and-resume) out of WorkflowMode into IssueRunner, so both `claw run` and
the dashboard server can drive an issue. The console-vs-server differences are
injected, leaving the pipeline I/O-free:
  - human    ask-channel human tier behind the supervisor (console | HTTP gate);
  - approve  run-the-generated-solver decision (show + confirm | gate | auto);
  - report   one progress/error line (console writes | server discards);
  - liveSink extra live trace sink (console tree | none — dashboard tails the db).

WorkflowMode::runIssue now resolves the issue/config/agent and delegates. No
behaviour change for the CLI; also wires the `serve` subcommand to Server.

`start` spawns the issue's IssueRunner as a detached coroutine in a server-owned
Async\Scope and returns 202 at once; the dashboard watches progress on the run
stream. One active run per issue (a concurrent `start` returns 409). `answer`
delivers the human's reply to a run parked at its gate (only while WaitingHuman,
else 409).

The gate is HttpGateSpeaker, the ask channel's human tier: it records the
question via the run's tracer (so the dashboard shows the gate and a chat row),
flips the issue to WaitingHuman, and parks the run on an unbuffered Async\Channel
until answer sends the reply, which it records and returns. Durable record (the
question/answer trace rows) and live wakeup (the channel) are split, so a restart
keeps the gate visible and the run resumes from its snapshot.

Tracer gains question()/answer(); IssueRunner takes the human tier as a
\Closure(Tracer): SpeakerInterface factory, since the gate records through the
tracer the runner builds. Verified live: start->202, double-start->409, the run
spawns, flips InProgress and writes trace.
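
The start/answer status rules can be sketched as a small in-memory state machine. This is an illustration only: `RunRegistrySketch`, its `park()` method and the state strings are hypothetical, and the 200 returned on a successful answer is an assumption (the commit only states the 202 and 409 cases).

```php
<?php
declare(strict_types=1);

/** Hypothetical registry of active runs; the real server tracks runs differently. */
final class RunRegistrySketch
{
    /** @var array<int, string> issue id => 'in_progress' | 'waiting_human' */
    private array $active = [];

    /** @var array<int, string> issue id => last delivered human reply */
    private array $replies = [];

    /** Returns the HTTP status the start endpoint would send. */
    public function start(int $issueId): int
    {
        if (isset($this->active[$issueId])) {
            return 409; // one active run per issue
        }
        $this->active[$issueId] = 'in_progress';

        return 202; // accepted; the run continues in the background
    }

    /** Called when the gate parks the run on a question. */
    public function park(int $issueId): void
    {
        $this->active[$issueId] = 'waiting_human';
    }

    /** Returns the HTTP status the answer endpoint would send. */
    public function answer(int $issueId, string $reply): int
    {
        if (($this->active[$issueId] ?? null) !== 'waiting_human') {
            return 409; // no run is parked at its gate
        }
        $this->replies[$issueId] = $reply;
        $this->active[$issueId] = 'in_progress';

        return 200; // assumed success status
    }

    public function lastReply(int $issueId): ?string
    {
        return $this->replies[$issueId] ?? null;
    }
}
```

The sketch keeps only the live state; in the design above the durable record lives in the question/answer trace rows, which is what lets a restart keep the gate visible.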
…g before any work

A generated solver put $this->tool('done') in its validate step (the 'no
conflict' branch), which ends the WHOLE run — so it finished having created
nothing, yet the issue went Done. The draft prompt actively invited this
('call done to finish early and skip the rest') without distinguishing 'step
finished' from 'task solved'.

Rewrite the done rule: done ends the whole run and means the deliverable exists
and is verified; never hardcode it in an early/validate step or a PHP branch —
only the model may reach it inside an ai() exchange after the work. Add the same
as a reviewer reject-criterion so a premature done is caught before the solver
runs.

The run stream replayed the journal then DB-tailed on a 250ms timer. Now a run's
TraceStore is followed by a LiveTraceSink that publishes each persisted record to
an in-process TraceBus (one topic per run id); the SSE handler subscribes and is
pushed to. recv blocks until an event or a ~10s heartbeat tick (which re-checks
for client disconnect), so an idle stream costs nothing and a live event arrives
at once instead of up to a tick late.

LiveTraceSink reads lastInsertId() of the row TraceStore just wrote on the same
connection — that's the db seq the dashboard resumes on — and formats the row
exactly as Server::trace() does, so a pushed row is indistinguishable from a
replayed one. publish() is non-blocking (sendAsync), so a slow subscriber can't
stall the run; a dropped event shows up as a seq gap the handler heals from the
db. Verified live: 66 pushed events == 66 db rows, arriving incrementally across
a real run.
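
The gap-healing step can be sketched as a pure function. This is a hedged illustration: `rowsToEmit` and the `$fetchRange` callable are hypothetical names, and `$fetchRange` stands in for a db read of this run's rows with seq in an inclusive range (it may return fewer rows than the range if seq values are not contiguous per run).

```php
<?php
declare(strict_types=1);

/**
 * Decide what to send for one pushed event, given the last seq already sent.
 *
 * @param callable(int, int): array<int, array> $fetchRange db read, seq => row
 * @return array<int, array> rows to emit, keyed and ordered by seq
 */
function rowsToEmit(int $lastSent, int $pushedSeq, array $pushedRow, callable $fetchRange): array
{
    if ($pushedSeq <= $lastSent) {
        return []; // already delivered, e.g. the replay overlapped the push
    }

    $rows = [];
    if ($pushedSeq > $lastSent + 1) {
        // At least one push may have been dropped: heal from the durable journal.
        $rows = $fetchRange($lastSent + 1, $pushedSeq - 1);
    }
    $rows[$pushedSeq] = $pushedRow;
    ksort($rows);

    return $rows;
}
```

Since the db is the source of truth, dropping a push under load only costs one extra read on the subscriber side; the run itself never waits.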

GET /api/projects/{key}/issues/stream emits an `issue` event per issue whose
snapshot changed: the full board on connect (the seen-set starts empty), then
diffs. Unlike the run stream this polls the issue snapshot on a ~2s tick rather
than subscribing to the bus — the board is low-frequency (a card changing column,
a token tick) and re-deriving a handful of issues is cheap; the hot per-record
path is the run stream, which is the one that needed push.
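
One poll tick of the board stream can be sketched as a diff against a seen-set. This is an assumption-laden illustration: `changedIssues` is a hypothetical helper, and fingerprinting the snapshot with md5 over its JSON is one possible way to detect a change, not necessarily what the server does.

```php
<?php
declare(strict_types=1);

/**
 * Return the issues whose snapshot changed since the last tick.
 * $seen maps issue id => fingerprint of the last snapshot sent and is
 * updated in place, so an empty $seen yields the full board.
 *
 * @param array<int, array> $snapshots issue id => current snapshot
 * @param array<int, string> $seen
 * @return array<int, array> issue id => snapshot to emit
 */
function changedIssues(array $snapshots, array &$seen): array
{
    $changed = [];
    foreach ($snapshots as $id => $snapshot) {
        $fingerprint = md5(json_encode($snapshot, JSON_THROW_ON_ERROR));
        if (($seen[$id] ?? null) !== $fingerprint) {
            $seen[$id] = $fingerprint;
            $changed[$id] = $snapshot;
        }
    }

    return $changed;
}
```

Starting with an empty seen-set gives the "full board on connect, then diffs" behaviour without a separate code path for the first tick.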

Verified live: 15 issues on connect, then a single issue#4=closed event within a
tick of its status flipping. Refresh the class docblock — the API now also writes
(start/answer) and the run stream pushes.

The old doc described the original Telegram chat bot. Rewrite it to the current
architecture, drawn from docs/ and verified against the code: claw as a per-issue
autonomous solver (a workflow that writes a workflow), the IssueRunner run engine,
workflows (steps, critics, the supervisor/human ask channel, snapshot resume,
handoffs), agents, the run-path tools, tracing, and the full dashboard server
(SSE push run stream + polled board stream, start/answer, the human gate). Keeps
the honest divergences: gemini unwired, the autonomous run executor has no
permission/timeout/audit middleware, the chat path is legacy.
…ories, \Exception

Apply review feedback across the run engine and dashboard server:

- catch \Exception, never \Throwable (a cancellation is not an Exception, so it
  propagates for free); the SSE heartbeat catches AsyncCancellation and rethrows a
  real \Cancellation. The run is spawned with Async\spawn (no managed scope) and
  cleaned up in a bare try/finally.
- IssueRunner takes one RunFrontendInterface (ConsoleRunFrontend / HttpRunFrontend)
  instead of three \Closure seams; tool wiring moves to ToolFactory; agent creation
  moves to Agent\AgentFactory; the default system prompt moves to Config — so neither
  the runner nor the server reaches into the CLI layer.
- Server uses the built-in HttpResponse::json(), drops the hand-rolled JSON helper,
  echoes its banner, and drops the now-needless parens around new.
- IssueStatus is string-backed (value = the dashboard's lowercase form), so the API
  serializes a status straight from the enum and the uiStatus() map is gone.
- Clearer variable names, comments and lines kept within 120 columns.

Verified live: json() flushes without end(), the spawned run survives the 202 and
writes trace, statuses serialize lowercase. PHPStan clean, 192 tests pass.
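
The string-backed status idea can be shown with a small enum. This is a sketch only: `IssueStatusSketch` is a hypothetical stand-in, and the case values below are assumed (the commit names InProgress, WaitingHuman and Done but does not list their wire values).

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for IssueStatus; the lowercase values are assumed. */
enum IssueStatusSketch: string
{
    case InProgress = 'in_progress';
    case WaitingHuman = 'waiting_human';
    case Done = 'done';
}

// json_encode() serializes a backed enum as its backing value,
// so no separate status-to-UI map is needed.
$wire = json_encode(['status' => IssueStatusSketch::InProgress], JSON_THROW_ON_ERROR);

// tryFrom() parses a wire value back, returning null for an unknown one.
$parsed = IssueStatusSketch::tryFrom('waiting_human');
$unknown = IssueStatusSketch::tryFrom('nope');
```

Making the backing value the wire form keeps one source of truth for the status vocabulary on both the write and read paths.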
…re-run

A critic re-run no longer cold-restarts the model: ai() now feeds the prior
attempt's conversation as $prior, so the model keeps everything it already did
and reacts to the critique instead of re-deriving the step from scratch.

Add back(string $toStep, string $reason): a step can send the run back to an
earlier step (e.g. a review sending work back to where it was produced — the
right model, not an inline rewrite). The default run() is back-aware: it re-runs
target..current, and the re-entered step CONTINUES its own conversation and reads the
reason as first-attempt guidance. Journaled via Tracer::back (event 'back' with
from/to/reason). Tested: a step back()s once, the driver re-runs the range and
the jump is in the journal.
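
A back-aware driver loop can be sketched as follows. This is a simplified illustration under stated assumptions: `BackSignal` and `runSteps` are hypothetical, signalling the jump with an exception is an assumption about the mechanism, and the sketch has no limit on how many times a step may be sent back.

```php
<?php
declare(strict_types=1);

/** Hypothetical signal a step throws to send the run back to an earlier step. */
final class BackSignal extends \Exception
{
    public function __construct(public readonly string $toStep, public readonly string $reason)
    {
        parent::__construct("back to {$toStep}: {$reason}");
    }
}

/**
 * Run named steps in order; on a BackSignal, re-run from the target step.
 * Each step receives the reason it was re-entered with, or null on first entry.
 *
 * @param array<string, callable(?string): void> $steps
 * @return list<string> journal of what happened
 */
function runSteps(array $steps): array
{
    $names = array_keys($steps);
    $journal = [];
    $guidance = [];
    $i = 0;
    while ($i < count($names)) {
        $name = $names[$i];
        try {
            $steps[$name]($guidance[$name] ?? null);
            unset($guidance[$name]);
            $journal[] = "ran {$name}";
            $i++;
        } catch (BackSignal $back) {
            $target = array_search($back->toStep, $names, true);
            if ($target === false || $target >= $i) {
                throw new \LogicException('back() must target an earlier step', 0, $back);
            }
            $journal[] = "back {$name} -> {$back->toStep}: {$back->reason}";
            $guidance[$back->toStep] = $back->reason;
            $i = $target; // re-run target..current
        }
    }

    return $journal;
}
```

In the real engine the re-entered step also continues its prior conversation; the sketch models only the control flow and the journal entry.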
…es Throwable

IssueRunner (and RunContext) are the shared run engine, not a CLI concern — move
them out of Claw\Cli into Claw\Run beside the run front-ends.

The repair boundary runs LLM-generated solver code, which throws Error (TypeError,
ParseError, unknown class) as readily as Exception — so generate/run/repair catch
\Throwable to repair-and-resume. But a cancellation is also a Throwable and must
never be 'repaired': each catches \Cancellation first and rethrows it, then
\Throwable.
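
The catch ordering at the repair boundary can be sketched in plain PHP. This is illustrative only: `SketchCancellation` is a stand-in for the runtime's \Cancellation (which stock PHP does not provide), and `runWithRepair` is a hypothetical helper.

```php
<?php
declare(strict_types=1);

/** Stand-in for the async runtime's \Cancellation; a Throwable that is not an Exception. */
final class SketchCancellation extends \Error
{
}

/**
 * Run generated code; repair any failure except a cancellation.
 *
 * @param callable(): mixed $solver
 * @param callable(\Throwable): mixed $repair
 */
function runWithRepair(callable $solver, callable $repair): mixed
{
    try {
        return $solver();
    } catch (SketchCancellation $cancelled) {
        throw $cancelled; // never "repair" a cancellation
    } catch (\Throwable $failure) {
        return $repair($failure); // Error (TypeError, ParseError) and Exception both land here
    }
}
```

The order of the catch blocks matters: PHP picks the first matching block, so the narrower cancellation type has to come before \Throwable.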
… pass

draft is now critic-gated (#[Step(critic: 'solverReview')]) and returns the code;
the separate review() step and its one-shot reviseCode() are gone. So a rejected
draft RE-RUNS draft — continuing the same conversation (the model keeps its prior
attempt) and reading the findings as guidance — and is RE-JUDGED each round. That
closes the gap where the worker's fix was saved unreviewed (how a 'done in
validate' solver slipped through). The solverReview rubric carries the same bar
(does real work, no premature done, recipe genuinely carried out).
… subscribers by spl_object_id

Make LiveTraceSink, HttpGateSpeaker, ConsoleRunFrontend, HttpRunFrontend and
IssueRunner `final readonly class` (all their properties are immutable), dropping
the per-property readonly. TraceBus drops its hand-rolled subscriber counter and
keys subscribers by spl_object_id($channel) — the channel is its own identity.
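
Keying subscribers by object id can be sketched like this. The sketch makes assumptions: `SketchTraceBus` is hypothetical, and an \SplQueue stands in for the Async\Channel each subscriber would really own.

```php
<?php
declare(strict_types=1);

/** Hypothetical bus: one topic per run id, subscribers keyed by spl_object_id. */
final class SketchTraceBus
{
    /** @var array<int, array<int, \SplQueue>> run id => object id => subscriber queue */
    private array $topics = [];

    public function subscribe(int $runId): \SplQueue
    {
        $queue = new \SplQueue();
        $this->topics[$runId][spl_object_id($queue)] = $queue;

        return $queue;
    }

    public function unsubscribe(int $runId, \SplQueue $queue): void
    {
        unset($this->topics[$runId][spl_object_id($queue)]);
        if (($this->topics[$runId] ?? []) === []) {
            unset($this->topics[$runId]); // drop the empty topic
        }
    }

    /** Deliver [record, seq] to every subscriber of the run. */
    public function publish(int $runId, object $record, int $seq): void
    {
        foreach ($this->topics[$runId] ?? [] as $queue) {
            $queue->enqueue([$record, $seq]);
        }
    }

    public function subscriberCount(int $runId): int
    {
        return count($this->topics[$runId] ?? []);
    }
}
```

An object id is only unique while the object is alive; that holds here because the bus keeps a reference to each queue for as long as it is subscribed.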
…he typed record

LiveTraceSink used to take a raw \PDO only to read lastInsertId(), reach past the
TraceStore for the seq, and hand the bus a pre-formatted wire array — three smells.

Now it COMPOSES the TraceStore: write() persists through it, then asks it for the
seq (new TraceStore::lastSeq()) and publishes the typed TraceRecordInterface plus
that seq. The bus carries [record, seq], not a wire array; the wire formatting
moves to the SSE edge (Server::liveRow), where Server::trace() already owns it. So
there is one persist-and-publish sink (no [TraceStore, LiveTraceSink] ordering to
get right): the run front-end now returns the whole sink list (RunFrontendInterface
::traceSinks) — console = [TraceStore, ConsoleTraceSink], http = [LiveTraceSink].

Verified live: 49 streamed events == 49 db rows (persists once), seq consistent.

Add blank_line_before_statement (if/for/foreach/while/switch/do/try/return/throw/
break/continue/yield) to .php-cs-fixer.dist.php, so there is always a blank line
after a block's closing } before the next statement. Apply it across the codebase
(whitespace only).
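
For reference, the rule could look roughly like this in a PHP-CS-Fixer config. This is a fragment, not the repository's actual .php-cs-fixer.dist.php: the real file presumably sets a finder and other rules that are omitted here.

```php
<?php

// Fragment only: a real config would also call setFinder() and list other rules.
return (new PhpCsFixer\Config())
    ->setRules([
        'blank_line_before_statement' => [
            'statements' => [
                'if', 'for', 'foreach', 'while', 'switch', 'do',
                'try', 'return', 'throw', 'break', 'continue', 'yield',
            ],
        ],
    ]);
```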
…use handles

The HTTP server opened a raw \PDO per request and ran raw SQL (issues/runs/trace/
artifacts), bypassing the data layer. Move those reads where they belong:
ProjectStore gains all()/openByKey()/allIssues()/runsFor() (and busy_timeout in
its opener), TraceReader gains tail()/tokens()/artifactRecords(). The Server now
caches one read handle + TraceReader per project key (opened once, reused — no
per-request connection churn) and only routes + assembles the wire shape; a run
still gets its own fresh writable handle. Also drop the cryptic 'hb' SSE comment
for the canonical empty-comment heartbeat.

Verified live: /projects, /issues (status/done/tokens/runs/artifacts), trace and
the run stream all return the same data; 49 streamed == 49 db rows.
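
The "opened once, reused" read handle can be sketched as a keyed cache. This is a minimal illustration: `HandleCacheSketch` is hypothetical, and the opener closure stands in for whatever ProjectStore::openByKey() returns.

```php
<?php
declare(strict_types=1);

/** Hypothetical per-project handle cache; the opener runs at most once per key. */
final class HandleCacheSketch
{
    /** @var array<string, object> project key => open handle */
    private array $handles = [];

    /** @param \Closure(string): object $open opens a read handle for a project key */
    public function __construct(private readonly \Closure $open)
    {
    }

    public function get(string $projectKey): object
    {
        // ??= evaluates the opener only when no handle is cached yet.
        return $this->handles[$projectKey] ??= ($this->open)($projectKey);
    }
}
```

A cache like this suits read handles only; as the commit notes, a run still gets its own fresh writable handle.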
@EdmondDantes EdmondDantes merged commit 6d086db into main Jun 27, 2026
2 checks passed