Merged
43 changes: 43 additions & 0 deletions BUGS.md
@@ -0,0 +1,43 @@
# Bugs

These are bugs (or missing features) I've observed while working with `multi checks`.

- [ ] No use of Cersei workflows to chain multiple prompts together.

- [ ] No support for Fireworks AI.

- [ ] No GitHub Action available.

- [ ] No loading of skills.

- [ ] No loading of RULES.md files from the .claude directory.

- [ ] `assemble_instructions` is hard-coded: defined at `src/checks/executor/mod.rs:98`, called from `src/checks/executor/cersei.rs:110`.

- [ ] No system prompt provided.

- [ ] Not sure if prompt caching is enabled at all.

- [ ] No trace capture. We need a way to record all session traces so that we can analyze why they failed.

- CERSEI: `append_system_prompt()` is dead code unless the prompt is routed through the separate `build_system_prompt()` composer.

## Fixes

- [x] No loading of CLAUDE.md files

- [x] Concurrency still not respected.

- [x] Full error text got cut off at the end of the terminal screen instead of wrapping. Turned out to
live in the *presenter* (`src/checks/presenter/inline.rs`), not the `Reporter` — the inline TUI is the
sole terminal writer for the whole run (see `owns_record` in `src/checks/mod.rs`), so `Reporter::report()`
never even runs in a TTY session. The presenter renders into a fixed-size `ratatui::Buffer` via
`insert_before`, which clips instead of wrapping; fixed by word-wrapping every flushed line to the
terminal width before building it.

- [x] A `tracing::info!`/`debug!` log line fired mid-run (e.g. the "retrying check whose agent did not
report" line) wrote raw bytes straight to stdout, corrupting the inline TUI's cursor-managed viewport.
Fixed by giving the presenter full ownership of log output: `PresenterActor` now registers itself as
`tracing`'s active sink (`src/terminal/logging.rs::route_logs`) for the run's duration and folds each
line in as a `UiEvent::Log`. The inline TUI flushes each line to permanent scrollback (same mechanism as
a completed requirement) and also keeps the last few in a small live pane, separate from the tree.
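
The word-wrapping fix described in the Fixes list can be sketched without any dependencies. This is only an illustration, not the presenter's actual code: `wrap_line` is a hypothetical helper, and it assumes one terminal column per `char` (no wide glyphs, no ANSI escapes). The real change presumably relies on the `textwrap` crate that this PR adds to `Cargo.toml`.

```rust
/// Greedy word-wrap: split `line` into rows of at most `width` columns.
/// Hypothetical helper for illustration; assumes one column per `char`.
fn wrap_line(line: &str, width: usize) -> Vec<String> {
    let width = width.max(1); // guard against a zero-width terminal
    let mut rows: Vec<String> = Vec::new();
    let mut row = String::new();
    let mut row_len = 0usize; // columns already used in `row`

    for word in line.split_whitespace() {
        let chars: Vec<char> = word.chars().collect();
        // Words longer than a whole row are hard-broken into width-sized pieces.
        for piece in chars.chunks(width) {
            let needed = if row_len == 0 {
                piece.len()
            } else {
                row_len + 1 + piece.len() // +1 for the separating space
            };
            if needed > width && row_len > 0 {
                // The piece does not fit: flush the current row and start a new one.
                rows.push(std::mem::take(&mut row));
                row_len = 0;
            }
            if row_len > 0 {
                row.push(' ');
                row_len += 1;
            }
            row.extend(piece.iter());
            row_len += piece.len();
        }
    }
    // Keep the final partial row; an empty input still yields one empty row.
    if row_len > 0 || rows.is_empty() {
        rows.push(row);
    }
    rows
}

fn main() {
    // A log line wider than an assumed 24-column terminal.
    let line = "retrying check whose agent did not report a verdict";
    for row in wrap_line(line, 24) {
        println!("{row}");
    }
}
```

Wrapping before the line is handed to a fixed-size buffer means the buffer never has to clip: every row it receives already fits the terminal width.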
9 changes: 9 additions & 0 deletions Cargo.lock

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -84,8 +84,10 @@ comrak = { version = "0.52", default-features = false }
libc = "0.2.186"
tempfile = "3"
cersei-agent = { git = "https://github.com/wack/cersei.git", branch = "trunk" }
cersei-memory = { git = "https://github.com/wack/cersei.git", branch = "trunk" }
cersei-tools = { git = "https://github.com/wack/cersei.git", branch = "trunk" }
cersei-types = { git = "https://github.com/wack/cersei.git", branch = "trunk" }
textwrap = "0.16"

[workspace.dependencies]
pretty_assertions = "1.4"
33 changes: 31 additions & 2 deletions src/checks/executor/cersei.rs
@@ -2,10 +2,12 @@
//! `cersei_agent::Agent` in its CoW sandbox, capturing the verdict through a
//! per-check judge tool — no `claude -p` subprocess, no MCP endpoints.

use std::path::Path;
use std::time::Duration;

use async_trait::async_trait;
use cersei_agent::Agent;
use cersei_memory::claudemd;
use cersei_tools::permissions::AllowReadOnly;
use cersei_tools::{Tool, clear_session_shell_state};
use cersei_types::CerseiError;
@@ -64,6 +66,21 @@ fn read_only_tools() -> Vec<Box<dyn Tool>> {
     ]
 }

+/// Load the project's `CLAUDE.md` hierarchy (managed `~/.claude/rules/*.md`,
+/// user `~/.claude/CLAUDE.md`, project `{root}/CLAUDE.md`, local
+/// `{root}/.claude/CLAUDE.md`, with `@include` expansion) as a single system
+/// prompt string, or `None` if no instruction files were found.
+///
+/// Calls `claudemd` directly rather than going through
+/// `cersei_memory::manager::MemoryManager` so this stays a stateless
+/// filesystem read — no session storage, no graph memory, no multi-session
+/// persistence gets pulled in as a side effect.
+fn project_instructions(project_root: &Path) -> Option<String> {
+    let files = claudemd::load_all_memory_files(project_root);
+    let prompt = claudemd::build_memory_prompt(&files);
+    (!prompt.trim().is_empty()).then_some(prompt)
+}
+
 /// Map our coarse [`Effort`] onto a sampling temperature.
 ///
 /// Extended thinking would be the natural effort vehicle, but cersei-provider
@@ -109,7 +126,7 @@ impl CheckExecutor for CerseiExecutor {

         let instructions = assemble_instructions(&req.check, &judge_tool_directive());

-        let agent = Agent::builder()
+        let mut agent_builder = Agent::builder()
             .provider_boxed(provider)
             .model(self.model.clone())
             .working_dir(req.working_dir.clone())
@@ -122,7 +139,19 @@ impl CheckExecutor for CerseiExecutor {
             // Thinking is intentionally left disabled (see `effort_temperature`).
             .temperature(effort_temperature(self.effort))
             .max_turns(MAX_TURNS)
-            .cancel_token(cancel.clone())
+            .cancel_token(cancel.clone());
+
+        // `.system_prompt()`, not `.append_system_prompt()`: cersei's agent
+        // runner only ever reads `Agent.system_prompt` when building each
+        // completion request (`append_system_prompt` is exclusively consumed
+        // by the separate `cersei_agent::system_prompt::build_system_prompt`
+        // composer, which this executor doesn't use), and we don't set a base
+        // system prompt anywhere else here.
+        if let Some(project_prompt) = project_instructions(&req.working_dir) {
+            agent_builder = agent_builder.system_prompt(project_prompt);
+        }
+
+        let agent = agent_builder
             .build()
             .map_err(|e| miette!("building check agent: {e}"))?;
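
The two pieces of this change, turning an empty prompt into `None` and applying a builder setter only when a value exists, can be sketched in isolation. `SketchBuilder`, `non_empty`, and `configure` are hypothetical stand-ins invented for this illustration; they are not cersei's `Agent::builder()` API, and the real builder may behave differently.

```rust
/// Hypothetical stand-in for an agent builder; NOT cersei's real API.
#[derive(Debug, Default)]
struct SketchBuilder {
    model: String,
    system_prompt: Option<String>,
}

impl SketchBuilder {
    /// By-value setter, so calls can be chained.
    fn model(mut self, model: &str) -> Self {
        self.model = model.to_string();
        self
    }

    /// By-value setter for the optional system prompt.
    fn system_prompt(mut self, prompt: String) -> Self {
        self.system_prompt = Some(prompt);
        self
    }
}

/// Same idiom as `project_instructions`: whitespace-only text becomes `None`.
fn non_empty(prompt: String) -> Option<String> {
    (!prompt.trim().is_empty()).then_some(prompt)
}

/// Configure a builder, setting the system prompt only when there is one.
fn configure(prompt: String) -> SketchBuilder {
    // `mut` is needed because the conditional step reassigns the builder.
    let mut builder = SketchBuilder::default().model("sketch-model");
    if let Some(p) = non_empty(prompt) {
        builder = builder.system_prompt(p);
    }
    builder
}

fn main() {
    println!("{:?}", configure("Follow the project rules.".to_string()));
    println!("{:?}", configure("   \n".to_string()));
}
```

Because the setters take `self` by value, the chain has to be interrupted and the binding reassigned to make one step conditional, which is why the diff renames `agent` to a mutable `agent_builder` and ends the first chain with a semicolon.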

14 changes: 12 additions & 2 deletions src/checks/presenter/heartbeat.rs
@@ -73,8 +73,18 @@ impl HeartbeatBackend {
 }

 impl RenderBackend for HeartbeatBackend {
-    fn apply(&mut self, _state: &PresenterState, _event: &UiEvent) {
-        // Heartbeat is purely time-driven; events only update shared state.
+    fn apply(&mut self, _state: &PresenterState, event: &UiEvent) {
+        // Heartbeat is otherwise purely time-driven; events only update shared
+        // state. Routed log lines are the exception: the presenter is now
+        // `tracing`'s sole sink for the run (see the module docs), so if we
+        // don't re-emit them here they simply vanish. Stderr, not stdout, to
+        // honor this backend's own invariant that stdout stays reserved for the
+        // reporting actor's byte-for-byte report.
+        if let UiEvent::Log(line) = event {
+            let mut err = std::io::stderr().lock();
+            let _ = writeln!(err, "{line}");
+            let _ = err.flush();
+        }
     }

     fn tick(&mut self, state: &PresenterState) {
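
The re-emit behavior in `apply` can be sketched as a free function that is generic over the output sink, which makes it easy to check that only log events produce output. This is an illustration under assumptions: the `UiEvent` below is a cut-down stand-in with a hypothetical `Tick` variant, and `emit_log` is not part of the codebase.

```rust
use std::io::Write;

/// Cut-down stand-in for the presenter's event type; `Tick` is hypothetical.
enum UiEvent {
    Log(String),
    Tick,
}

/// Write routed log lines to `sink`; every other event produces no output.
/// Write errors are deliberately ignored, as in the diff above.
fn emit_log<W: Write>(sink: &mut W, event: &UiEvent) {
    if let UiEvent::Log(line) = event {
        let _ = writeln!(sink, "{line}");
        let _ = sink.flush();
    }
}

fn main() {
    // Lock stderr once so stdout stays free for the report.
    let mut err = std::io::stderr().lock();
    emit_log(&mut err, &UiEvent::Log("retrying check".to_string()));
    emit_log(&mut err, &UiEvent::Tick); // ignored: not a log line
}
```

Taking the sink as a parameter is a choice made for this sketch; the real backend writes to `std::io::stderr()` directly.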