Feat/dashboard server #7
Merged
Add GET /api/projects/{key}/runs/{id}/stream — replays the trace journal
past Last-Event-ID/?since, then DB-tails for new records, emitting one
`trace` SSE event per seq (interim poll; an in-process trace bus replaces
it next). Bump true-async/ide-helper to ^0.7.3 for the HttpResponse SSE
stubs (sseStart/sseEvent/sseComment/sseRetry).
docs/dashboard-server-plan.md captures the full endpoint/DB/engine design.
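The replay rule described above can be sketched in plain PHP. This is only an illustration under stated assumptions: `resumeFrom`, `formatTraceEvent` and `replay` are hypothetical helper names, the journal is modelled as an in-memory array, and preferring the `Last-Event-ID` header over `?since` is an assumption, not something the commit confirms.

```php
<?php
declare(strict_types=1);

/** Hypothetical: pick the resume point, preferring Last-Event-ID over ?since (assumed order). */
function resumeFrom(?string $lastEventId, ?string $since): int
{
    foreach ([$lastEventId, $since] as $candidate) {
        if ($candidate !== null && ctype_digit($candidate)) {
            return (int) $candidate;
        }
    }

    return 0; // nothing to resume from: replay the whole journal
}

/** Format one journal row as a `trace` SSE event whose id is the row's seq. */
function formatTraceEvent(int $seq, array $data): string
{
    return "id: {$seq}\nevent: trace\ndata: " . json_encode($data) . "\n\n";
}

/**
 * Replay every row strictly past the resume point, in journal order.
 *
 * @param list<array{seq: int, data: array}> $journal stand-in for the trace journal
 * @return list<string> SSE frames ready to be written to the response
 */
function replay(array $journal, int $after): array
{
    $events = [];
    foreach ($journal as $row) {
        if ($row['seq'] > $after) {
            $events[] = formatTraceEvent($row['seq'], $row['data']);
        }
    }

    return $events;
}
```

Because each frame carries its seq as the SSE `id`, a reconnecting client sends that value back as `Last-Event-ID` and the replay starts just past it.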
Move the run pipeline (env wiring, solver generate/reuse, run loop, supervisor repair-and-resume) out of WorkflowMode into IssueRunner, so both `claw run` and the dashboard server can drive an issue.

The console-vs-server differences are injected, leaving the pipeline I/O-free:
- human: the ask channel's human tier behind the supervisor (console | HTTP gate);
- approve: the run-the-generated-solver decision (show + confirm | gate | auto);
- report: one progress/error line (console writes | server discards);
- liveSink: extra live trace sink (console tree | none, because the dashboard tails the db).

WorkflowMode::runIssue now resolves the issue/config/agent and delegates. No behaviour change for the CLI; also wires the `serve` subcommand to Server.
`start` spawns the issue's IssueRunner as a detached coroutine in a server-owned Async\Scope and returns 202 at once; the dashboard watches progress on the run stream. One active run per issue (concurrent start -> 409).

`answer` delivers the human's reply to a run parked at its gate (only while WaitingHuman, else 409).

The gate is HttpGateSpeaker, the ask channel's human tier: it records the question via the run's tracer (so the dashboard shows the gate and a chat row), flips the issue to WaitingHuman, and parks the run on an unbuffered Async\Channel until `answer` sends the reply, which it records and returns. The durable record (the question/answer trace rows) and the live wakeup (the channel) are split, so a restart keeps the gate visible and the run resumes from its snapshot.

Tracer gains question()/answer(); IssueRunner takes the human tier as a \Closure(Tracer): SpeakerInterface factory, since the gate records through the tracer the runner builds.

Verified live: start->202, double-start->409, the run spawns, flips InProgress and writes trace.
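The status rules for start and answer amount to a small state machine. The sketch below is a synchronous stand-in, not the server's code: `RunState`, `RunRegistry` and `park()` are hypothetical names, and the 200 returned by a successful answer is an assumption (the commit only states the 202 and 409 cases).

```php
<?php
declare(strict_types=1);

/** Hypothetical per-issue run state; the real issue status enum may differ. */
enum RunState
{
    case Idle;
    case InProgress;
    case WaitingHuman;
}

final class RunRegistry
{
    /** @var array<int, RunState> issue id => state of its active run */
    private array $states = [];

    /** One active run per issue: a second start while one is live is a conflict. */
    public function start(int $issueId): int
    {
        if ($this->stateOf($issueId) !== RunState::Idle) {
            return 409;
        }
        $this->states[$issueId] = RunState::InProgress;

        return 202; // accepted: the run continues in the background
    }

    /** Called by the gate when the run asks the human a question. */
    public function park(int $issueId): void
    {
        $this->states[$issueId] = RunState::WaitingHuman;
    }

    /** A reply is only deliverable while the run is parked at its gate. */
    public function answer(int $issueId): int
    {
        if ($this->stateOf($issueId) !== RunState::WaitingHuman) {
            return 409;
        }
        $this->states[$issueId] = RunState::InProgress;

        return 200; // assumed success code, not stated in the commit
    }

    private function stateOf(int $issueId): RunState
    {
        return $this->states[$issueId] ?? RunState::Idle;
    }
}
```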
…g before any work
A generated solver put $this->tool('done') in its validate step (the 'no
conflict' branch), which ends the WHOLE run — so it finished having created
nothing, yet the issue went Done. The draft prompt actively invited this
('call done to finish early and skip the rest') without distinguishing 'step
finished' from 'task solved'.
Rewrite the done rule: done ends the whole run and means the deliverable exists
and is verified; never hardcode it in an early/validate step or a PHP branch —
only the model may reach it inside an ai() exchange after the work. Add the same
as a reviewer reject-criterion so a premature done is caught before the solver
runs.
The run stream replayed the journal then DB-tailed on a 250ms timer. Now a run's TraceStore is followed by a LiveTraceSink that publishes each persisted record to an in-process TraceBus (one topic per run id); the SSE handler subscribes and is pushed to. recv blocks until an event or a ~10s heartbeat tick (which re-checks for client disconnect), so an idle stream costs nothing and a live event arrives at once instead of up to a tick late.

LiveTraceSink reads lastInsertId() of the row TraceStore just wrote on the same connection (that is the db seq the dashboard resumes on) and formats the row exactly as Server::trace() does, so a pushed row is indistinguishable from a replayed one.

publish() is non-blocking (sendAsync), so a slow subscriber can't stall the run; a dropped event shows up as a seq gap the handler heals from the db.

Verified live: 66 pushed events == 66 db rows, arriving incrementally across a real run.
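The gap-healing idea can be illustrated without the async runtime. This sketch assumes the handler tracks the last seq it emitted and that `$db` holds only this run's rows keyed by seq; `rowsToEmit` is a hypothetical name, and the array stands in for a real query against the trace table.

```php
<?php
declare(strict_types=1);

/**
 * Decide what to emit when the bus pushes a row.
 *
 * @param array<int, array> $db seq => row; stand-in for this run's rows in the trace table
 * @return array<int, array> seq => row to emit, in ascending seq order
 */
function rowsToEmit(int $lastSeq, int $pushedSeq, array $pushedRow, array $db): array
{
    if ($pushedSeq <= $lastSeq) {
        return []; // already delivered, e.g. during the initial replay
    }

    $rows = [];
    // A jump past lastSeq + 1 means the bus dropped events: read them back from the db.
    for ($seq = $lastSeq + 1; $seq < $pushedSeq; $seq++) {
        if (isset($db[$seq])) {
            $rows[$seq] = $db[$seq];
        }
    }
    $rows[$pushedSeq] = $pushedRow;

    return $rows;
}
```

The publisher never waits on a subscriber, so the cost of a dropped event is paid by the one slow reader that missed it rather than by the run.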
GET /api/projects/{key}/issues/stream emits an `issue` event per issue whose
snapshot changed: the full board on connect (the seen-set starts empty), then
diffs. Unlike the run stream this polls the issue snapshot on a ~2s tick rather
than subscribing to the bus — the board is low-frequency (a card changing column,
a token tick) and re-deriving a handful of issues is cheap; the hot per-record
path is the run stream, which is the one that needed push.
Verified live: 15 issues on connect, then a single issue#4=closed event within a
tick of its status flipping. Refresh the class docblock — the API now also writes
(start/answer) and the run stream pushes.
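The seen-set diff behind the board stream can be sketched as follows. `BoardDiff` is a hypothetical class, and comparing a hash of the serialized snapshot is an assumption about how "changed" is decided; the polling tick that would call it is left out.

```php
<?php
declare(strict_types=1);

final class BoardDiff
{
    /** @var array<int, string> issue id => hash of the last emitted snapshot */
    private array $seen = [];

    /**
     * Return the snapshots that differ from what was last emitted.
     * The seen-set starts empty, so the first call returns the full board.
     *
     * @param array<int, array> $snapshots issue id => current snapshot
     * @return array<int, array> issue id => snapshot to emit as an `issue` event
     */
    public function changed(array $snapshots): array
    {
        $changed = [];
        foreach ($snapshots as $id => $snapshot) {
            $hash = md5(serialize($snapshot));
            if (($this->seen[$id] ?? null) !== $hash) {
                $this->seen[$id] = $hash;
                $changed[$id] = $snapshot;
            }
        }

        return $changed;
    }
}
```

Calling `changed()` once per tick with the freshly derived board yields the full board on connect and only the changed cards afterwards.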
The old doc described the original Telegram chat bot. Rewrite it to the current architecture, drawn from docs/ and verified against the code: claw as a per-issue autonomous solver (a workflow that writes a workflow), the IssueRunner run engine, workflows (steps, critics, the supervisor/human ask channel, snapshot resume, handoffs), agents, the run-path tools, tracing, and the full dashboard server (SSE push run stream + polled board stream, start/answer, the human gate). Keeps the honest divergences: gemini unwired, the autonomous run executor has no permission/timeout/audit middleware, the chat path is legacy.
…ories, \Exception

Apply review feedback across the run engine and dashboard server:
- catch \Exception, never \Throwable (a cancellation is not an Exception, so it propagates for free); the SSE heartbeat catches AsyncCancellation and rethrows a real \Cancellation. The run is spawned with Async\spawn (no managed scope) and cleaned up in a bare try/finally.
- IssueRunner takes one RunFrontendInterface (ConsoleRunFrontend / HttpRunFrontend) instead of three \Closure seams; tool wiring moves to ToolFactory; agent creation moves to Agent\AgentFactory; the default system prompt moves to Config, so neither the runner nor the server reaches into the CLI layer.
- Server uses the built-in HttpResponse::json(), drops the hand-rolled JSON helper, echoes its banner, and drops the now-needless parens around new.
- IssueStatus is string-backed (value = the dashboard's lowercase form), so the API serializes a status straight from the enum and the uiStatus() map is gone.
- Clearer variable names, comments and lines kept within 120 columns.

Verified live: json() flushes without end(), the spawned run survives the 202 and writes trace, statuses serialize lowercase. PHPStan clean, 192 tests pass.
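The string-backed enum point relies on standard PHP behaviour: `json_encode()` serializes a backed enum as its value. In the sketch below the case names InProgress, WaitingHuman and Done come from the commit messages, but `Open` and every backing string are guesses for illustration, not the project's actual values.

```php
<?php
declare(strict_types=1);

/** Illustrative only: the backing values are assumed lowercase forms. */
enum IssueStatus: string
{
    case Open = 'open';
    case InProgress = 'in_progress';
    case WaitingHuman = 'waiting_human';
    case Done = 'done';
}

/** Build the wire shape straight from the enum; no separate status map is needed. */
function issueToWire(int $id, IssueStatus $status): string
{
    return json_encode(['id' => $id, 'status' => $status], JSON_THROW_ON_ERROR);
}
```

The same values round-trip through `IssueStatus::from()`, so a status read from the database or a request body maps back to the enum without a lookup table.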
…re-run

A critic re-run no longer cold-restarts the model: ai() now feeds the prior attempt's conversation as $prior, so the model keeps everything it already did and reacts to the critique instead of re-deriving the step from scratch.

Add back(string $toStep, string $reason): a step can send the run back to an earlier step (e.g. a review sending work back to where it was produced, which is the right model, rather than rewriting it inline). The default run() is back-aware: it re-runs target..current, and the re-entered step CONTINUES its own conversation and reads the reason as first-attempt guidance. Journaled via Tracer::back (event 'back' with from/to/reason).

Tested: a step back()s once, the driver re-runs the range and the jump is in the journal.
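A back-aware driver loop might look like the sketch below. It is a simplified model under assumptions: steps are plain callables that receive optional guidance, a step signals the jump by returning a hypothetical `Back` value object (the real API is a `back()` method), and the journal is a list of strings rather than tracer events. Conversation continuation is not modelled.

```php
<?php
declare(strict_types=1);

/** Hypothetical value a step returns to send the run back to an earlier step. */
final class Back
{
    public function __construct(
        public readonly string $toStep,
        public readonly string $reason,
    ) {
    }
}

/**
 * Run the steps in order; on a Back, re-run the range target..current.
 *
 * @param array<string, callable(?string): mixed> $steps step name => step, in run order
 * @return list<string> journal of what ran and where the run jumped
 */
function runSteps(array $steps): array
{
    $names = array_keys($steps);
    $journal = [];
    $guidance = null;

    for ($i = 0; $i < count($names); $i++) {
        $name = $names[$i];
        $journal[] = "run {$name}";
        $result = $steps[$name]($guidance);
        $guidance = null; // guidance is consumed by the step it was addressed to

        if ($result instanceof Back) {
            $target = array_search($result->toStep, $names, true);
            if ($target === false || $target > $i) {
                throw new \LogicException("cannot go back to {$result->toStep}");
            }
            $journal[] = "back {$name} -> {$result->toStep}: {$result->reason}";
            $guidance = $result->reason;
            $i = $target - 1; // the loop increment lands on the target step
        }
    }

    return $journal;
}
```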
…es Throwable

IssueRunner (and RunContext) are the shared run engine, not a CLI concern: move them out of Claw\Cli into Claw\Run beside the run front-ends.

The repair boundary runs LLM-generated solver code, which throws Error (TypeError, ParseError, unknown class) as readily as Exception, so generate/run/repair catch \Throwable to repair-and-resume. But a cancellation is also a Throwable and must never be 'repaired': each catches \Cancellation first and rethrows it, then \Throwable.
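The catch ordering can be shown with stock PHP. Since `\Cancellation` comes from the async runtime, the sketch substitutes a stand-in class, `DemoCancellation`, which extends `\Error` only so that it is a Throwable that is not an Exception; `runWithRepair` is likewise a hypothetical name.

```php
<?php
declare(strict_types=1);

/** Stand-in for the runtime's cancellation type (assumption: not the real class). */
final class DemoCancellation extends \Error
{
}

/**
 * Run generated code inside a repair boundary.
 * A cancellation passes straight through; any other Throwable is handed to $repair.
 */
function runWithRepair(callable $solver, callable $repair): mixed
{
    try {
        return $solver();
    } catch (DemoCancellation $cancelled) {
        throw $cancelled; // must come first: a cancellation is never "repaired"
    } catch (\Throwable $failure) {
        return $repair($failure); // covers Error as well as Exception
    }
}
```

PHP tries catch clauses top to bottom, so placing the cancellation clause after `\Throwable` would make it unreachable.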
… pass

draft is now critic-gated (#[Step(critic: 'solverReview')]) and returns the code; the separate review() step and its one-shot reviseCode() are gone. So a rejected draft RE-RUNS draft, continuing the same conversation (the model keeps its prior attempt) and reading the findings as guidance, and is RE-JUDGED each round. That closes the gap where the worker's fix was saved unreviewed (how a 'done in validate' solver slipped through). The solverReview rubric carries the same bar (does real work, no premature done, recipe genuinely carried out).
… subscribers by spl_object_id

Make LiveTraceSink, HttpGateSpeaker, ConsoleRunFrontend, HttpRunFrontend and IssueRunner `final readonly class` (all their properties are immutable), dropping the per-property readonly. TraceBus drops its hand-rolled subscriber counter and keys subscribers by spl_object_id($channel): the channel is its own identity.
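Keying subscribers by `spl_object_id()` can be sketched like this. `DemoBus` is a hypothetical stand-in for TraceBus, and any object plays the role of the channel; delivery is omitted.

```php
<?php
declare(strict_types=1);

final class DemoBus
{
    /** @var array<int, array<int, object>> run id => [object id => subscriber channel] */
    private array $topics = [];

    public function subscribe(int $runId, object $channel): void
    {
        // The object id is the key, so subscribing the same channel twice is idempotent.
        $this->topics[$runId][spl_object_id($channel)] = $channel;
    }

    public function unsubscribe(int $runId, object $channel): void
    {
        unset($this->topics[$runId][spl_object_id($channel)]);
        if (($this->topics[$runId] ?? []) === []) {
            unset($this->topics[$runId]); // drop empty topics
        }
    }

    public function subscriberCount(int $runId): int
    {
        return count($this->topics[$runId] ?? []);
    }
}
```

An object id is only unique while the object is alive; that holds here because the bus keeps a reference to every subscribed channel until it is unsubscribed.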
…he typed record

LiveTraceSink used to take a raw \PDO only to read lastInsertId(), reach past the TraceStore for the seq, and hand the bus a pre-formatted wire array: three smells. Now it COMPOSES the TraceStore: write() persists through it, then asks it for the seq (new TraceStore::lastSeq()) and publishes the typed TraceRecordInterface plus that seq. The bus carries [record, seq], not a wire array; the wire formatting moves to the SSE edge (Server::liveRow), where Server::trace() already owns it.

So there is one persist-and-publish sink (no [TraceStore, LiveTraceSink] ordering to get right): the run front-end now returns the whole sink list (RunFrontendInterface::traceSinks), where console = [TraceStore, ConsoleTraceSink] and http = [LiveTraceSink].

Verified live: 49 streamed events == 49 db rows (persists once), seq consistent.
Add blank_line_before_statement (if/for/foreach/while/switch/do/try/return/throw/break/continue/yield) to .php-cs-fixer.dist.php, so there is always a blank line after a block's closing } before the next statement. Apply it across the codebase (whitespace only).
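For reference, the rule could be configured roughly as below. This is a fragment, not the repository's actual `.php-cs-fixer.dist.php`: it shows only this one rule and omits the finder and any other rules the project sets.

```php
<?php
// Fragment of a .php-cs-fixer.dist.php; requires friendsofphp/php-cs-fixer to be installed.

return (new PhpCsFixer\Config())
    ->setRules([
        // Require an empty line before each of the listed statements.
        'blank_line_before_statement' => [
            'statements' => [
                'if', 'for', 'foreach', 'while', 'switch', 'do', 'try',
                'return', 'throw', 'break', 'continue', 'yield',
            ],
        ],
    ]);
```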
…use handles

The HTTP server opened a raw \PDO per request and ran raw SQL (issues/runs/trace/artifacts), bypassing the data layer. Move those reads where they belong: ProjectStore gains all()/openByKey()/allIssues()/runsFor() (and busy_timeout in its opener), TraceReader gains tail()/tokens()/artifactRecords(). The Server now caches one read handle + TraceReader per project key (opened once and reused, so no per-request connection churn) and only routes + assembles the wire shape; a run still gets its own fresh writable handle.

Also drop the cryptic 'hb' SSE comment for the canonical empty-comment heartbeat.

Verified live: /projects, /issues (status/done/tokens/runs/artifacts), trace and the run stream all return the same data; 49 streamed == 49 db rows.
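The per-key read-handle cache is a plain memoization. In this sketch `HandleCache` is a hypothetical name and the opener is an injected closure standing in for something like ProjectStore::openByKey(); the real Server may structure this differently.

```php
<?php
declare(strict_types=1);

final class HandleCache
{
    /** @var array<string, object> project key => cached read handle */
    private array $handles = [];

    /** @param \Closure(string): object $open opens a read handle for a project key */
    public function __construct(private readonly \Closure $open)
    {
    }

    /** Open the handle on first use, then reuse it for every later request. */
    public function get(string $projectKey): object
    {
        return $this->handles[$projectKey] ??= ($this->open)($projectKey);
    }
}
```

Only reads share a handle in this scheme; a run that writes would still open its own connection, as the commit describes.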
No description provided.