feat: session token breakdown in footer and /status #2152
Add accumulated session token tracking with input / cache-hit / output breakdown. Rebased from PR Hmbown#1666 onto post-rebrand main (v0.8.45).
Changes:
- SessionState: new total_input_tokens, total_cache_hit_tokens, total_cache_miss_tokens, total_output_tokens fields
- Turn outcome handler: accumulate per-turn token breakdown
- StatusItem::Tokens: new footer chip, enabled by default
- Footer chip: "12K in · 8.1K cch · 2.5K out" format
- /status: expanded with session input/cache/output rows
- /clear and /load: reset accumulated breakdown
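As an illustration of the footer format described above, here is a minimal sketch of compact token formatting. `compact_count` and `footer_label` are hypothetical stand-ins, not the PR's `format_token_count_compact`; the truncate-to-one-decimal rule and the lack of an "M" suffix are assumptions made for the example.

```rust
/// Hypothetical compact formatter: values below 1000 print as-is,
/// larger values print in thousands with at most one decimal digit.
fn compact_count(n: u64) -> String {
    if n < 1_000 {
        return n.to_string();
    }
    // Work in tenths of a thousand to avoid floating point.
    let tenths = n / 100;
    let whole = tenths / 10;
    let frac = tenths % 10;
    if frac == 0 {
        format!("{whole}K")
    } else {
        format!("{whole}.{frac}K")
    }
}

/// Builds the "in · cch · out" label from three accumulated counters.
fn footer_label(input: u64, cache_hit: u64, output: u64) -> String {
    format!(
        "{} in · {} cch · {} out",
        compact_count(input),
        compact_count(cache_hit),
        compact_count(output)
    )
}
```

With these assumptions, `footer_label(12_000, 8_100, 2_500)` yields the string quoted in the description.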
Code Review
This pull request introduces tracking and displaying of accumulated session token usage (input, cache hit, cache miss, and output tokens) across the TUI application. Feedback focuses on refactoring the repeated token-resetting logic into a helper method on SessionState to keep the code DRY. Additionally, improvements are suggested to handle models/providers that do not support cache telemetry, ensuring the UI (footer and status view) does not display misleading zero-values or redundant cache metrics when telemetry is unavailable.
pub total_input_tokens: u32,
pub total_cache_hit_tokens: u32,
pub total_cache_miss_tokens: u32,
pub total_output_tokens: u32,
To avoid duplicating the resetting of the accumulated token breakdown fields across multiple commands and UI event loops, consider adding a helper method on SessionState.
For example:
impl SessionState {
    /// Reset the accumulated token breakdown fields to zero.
    pub fn reset_token_breakdown(&mut self) {
        self.total_input_tokens = 0;
        self.total_cache_hit_tokens = 0;
        self.total_cache_miss_tokens = 0;
        self.total_output_tokens = 0;
    }
}

app.session.total_input_tokens = 0;
app.session.total_cache_hit_tokens = 0;
app.session.total_cache_miss_tokens = 0;
app.session.total_output_tokens = 0;
Use the newly introduced reset_token_breakdown helper method on SessionState to reduce boilerplate and keep the code DRY.
-app.session.total_input_tokens = 0;
-app.session.total_cache_hit_tokens = 0;
-app.session.total_cache_miss_tokens = 0;
-app.session.total_output_tokens = 0;
+app.session.reset_token_breakdown();
app.session.total_input_tokens = 0;
app.session.total_cache_hit_tokens = 0;
app.session.total_cache_miss_tokens = 0;
app.session.total_output_tokens = 0;
Use the reset_token_breakdown helper method on SessionState here as well to simplify the session loading logic.
-app.session.total_input_tokens = 0;
-app.session.total_cache_hit_tokens = 0;
-app.session.total_cache_miss_tokens = 0;
-app.session.total_output_tokens = 0;
+app.session.reset_token_breakdown();
app.session.total_input_tokens = 0;
app.session.total_cache_hit_tokens = 0;
app.session.total_cache_miss_tokens = 0;
app.session.total_output_tokens = 0;

app.session.total_cache_hit_tokens = app
    .session
    .total_cache_hit_tokens
    .saturating_add(usage.prompt_cache_hit_tokens.unwrap_or(0));
let cache_miss = usage.prompt_cache_miss_tokens.unwrap_or_else(|| {
    usage
        .input_tokens
        .saturating_sub(usage.prompt_cache_hit_tokens.unwrap_or(0))
});
app.session.total_cache_miss_tokens = app
    .session
    .total_cache_miss_tokens
    .saturating_add(cache_miss);
If the active model or provider does not support cache telemetry, prompt_cache_hit_tokens will be None. In this case, defaulting to 0 and subtracting it from input_tokens falsely attributes all input tokens as cache misses. This leads to misleading metrics (e.g., showing 100% cache misses for unsupported models).
Only accumulate cache hit and miss tokens if prompt_cache_hit_tokens is actually reported (Some).
if let Some(hit_tokens) = usage.prompt_cache_hit_tokens {
    app.session.total_cache_hit_tokens = app
        .session
        .total_cache_hit_tokens
        .saturating_add(hit_tokens);
    let cache_miss = usage.prompt_cache_miss_tokens.unwrap_or_else(|| {
        usage.input_tokens.saturating_sub(hit_tokens)
    });
    app.session.total_cache_miss_tokens = app
        .session
        .total_cache_miss_tokens
        .saturating_add(cache_miss);
}

if session.total_input_tokens == 0 && session.total_output_tokens == 0 {
    return Vec::new();
}
let in_str = format_token_count_compact(u64::from(session.total_input_tokens));
let cache_str = format_token_count_compact(u64::from(session.total_cache_hit_tokens));
let out_str = format_token_count_compact(u64::from(session.total_output_tokens));
vec![Span::styled(
    format!("{in_str} in · {cache_str} cch · {out_str} out"),
    Style::default().fg(palette::TEXT_MUTED),
)]
If the active model does not support cache telemetry, total_cache_hit_tokens and total_cache_miss_tokens will both be 0. Displaying 0 cch in the footer is redundant and clutters the UI for unsupported models. Consider omitting the cch chip entirely when no cache telemetry has been recorded.
if session.total_input_tokens == 0 && session.total_output_tokens == 0 {
    return Vec::new();
}
let in_str = format_token_count_compact(u64::from(session.total_input_tokens));
let out_str = format_token_count_compact(u64::from(session.total_output_tokens));
let text = if session.total_cache_hit_tokens == 0 && session.total_cache_miss_tokens == 0 {
    format!("{in_str} in · {out_str} out")
} else {
    let cache_str = format_token_count_compact(u64::from(session.total_cache_hit_tokens));
    format!("{in_str} in · {cache_str} cch · {out_str} out")
};
vec![Span::styled(
    text,
    Style::default().fg(palette::TEXT_MUTED),
)]

push_row(
    &mut out,
    "Session cache:",
    &format!(
        "{} hit / {} miss",
        app.session.total_cache_hit_tokens, app.session.total_cache_miss_tokens
    ),
);
For models/providers that do not support cache telemetry, showing 0 hit / 0 miss is less clear than explicitly stating not reported (similar to how cache_summary handles it). Consider showing not reported when both hit and miss counts are zero.
-push_row(
-    &mut out,
-    "Session cache:",
-    &format!(
-        "{} hit / {} miss",
-        app.session.total_cache_hit_tokens, app.session.total_cache_miss_tokens
-    ),
-);
+let session_cache = if app.session.total_cache_hit_tokens == 0 && app.session.total_cache_miss_tokens == 0 {
+    "not reported".to_string()
+} else {
+    format!(
+        "{} hit / {} miss",
+        app.session.total_cache_hit_tokens, app.session.total_cache_miss_tokens
+    )
+};
+push_row(
+    &mut out,
+    "Session cache:",
+    &session_cache,
+);
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6cde1c74f3
// Accumulated token breakdown is per-runtime-session; reset on load.
app.session.total_input_tokens = 0;
app.session.total_cache_hit_tokens = 0;
app.session.total_cache_miss_tokens = 0;
app.session.total_output_tokens = 0;
app.session.turn_cache_history.clear();
Restore current session ID when applying a loaded session
apply_loaded_session no longer assigns app.current_session_id, so after resuming/loading through this path the app keeps None or a stale previous ID. That breaks session scoping: startup queue restore compares against current_session_id, and later saves/syncs use this field to decide which session to update, so loading session B after working in session A can cause updates to be written to A (or a new ID) instead of B.
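To make the failure mode concrete, here is a minimal sketch of the assignment this comment asks to restore. The `App` and `LoadedSession` structs and the `save_target` helper are hypothetical simplifications, not the project's actual types; only the field name `current_session_id` and the function name `apply_loaded_session` come from the review.

```rust
/// Hypothetical, heavily reduced application state.
struct App {
    current_session_id: Option<String>,
    total_input_tokens: u32,
}

/// Hypothetical loaded-session record.
struct LoadedSession {
    id: String,
}

fn apply_loaded_session(app: &mut App, loaded: &LoadedSession) {
    // Without this assignment the app keeps None or the previous session's ID.
    app.current_session_id = Some(loaded.id.clone());
    // Accumulated breakdown is per-runtime-session, so it is reset on load.
    app.total_input_tokens = 0;
}

/// Hypothetical stand-in for "which session does a later save update?".
fn save_target(app: &App) -> Option<&str> {
    app.current_session_id.as_deref()
}
```

In this sketch, loading session B after working in session A makes `save_target` return B; dropping the assignment would leave it pointing at A.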
…Y helper
- Restore app.current_session_id assignment accidentally dropped in apply_loaded_session during rebase (P1: breaks startup-resume and session-sync paths)
- Guard cache-hit/miss accumulation behind is_some() so providers that omit cache telemetry don't inflate miss totals
- Extract SessionState::reset_token_breakdown() to avoid duplicating the four-field reset in core/session/ui call sites
- Hide the "cch" segment from the footer token chip when no cache data has been recorded
- Show "not reported" in /status session-cache row instead of "0 hit / 0 miss" when no cache telemetry is available
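The guarded accumulation and the reset helper from this commit can be sketched as a standalone example. `Usage` and `SessionTotals` below are hypothetical reductions of the real structs; the `Option<u32>` field types are an assumption inferred from the `unwrap_or` and `saturating_add` calls quoted in the review.

```rust
/// Hypothetical per-turn usage report; cache fields are optional because
/// some providers omit cache telemetry.
struct Usage {
    input_tokens: u32,
    output_tokens: u32,
    prompt_cache_hit_tokens: Option<u32>,
    prompt_cache_miss_tokens: Option<u32>,
}

/// Hypothetical reduction of the four accumulator fields on SessionState.
#[derive(Debug, Default, PartialEq)]
struct SessionTotals {
    input: u32,
    cache_hit: u32,
    cache_miss: u32,
    output: u32,
}

impl SessionTotals {
    fn accumulate(&mut self, usage: &Usage) {
        // Input and output are always reported, so add them unconditionally.
        self.input = self.input.saturating_add(usage.input_tokens);
        self.output = self.output.saturating_add(usage.output_tokens);
        // Only touch the cache counters when the provider reported a hit count.
        if let Some(hit) = usage.prompt_cache_hit_tokens {
            self.cache_hit = self.cache_hit.saturating_add(hit);
            let miss = usage
                .prompt_cache_miss_tokens
                .unwrap_or_else(|| usage.input_tokens.saturating_sub(hit));
            self.cache_miss = self.cache_miss.saturating_add(miss);
        }
    }

    /// Single reset point, mirroring the role of reset_token_breakdown.
    fn reset(&mut self) {
        *self = Self::default();
    }
}
```

A turn with no cache telemetry leaves both cache counters at zero, which is what lets the footer and /status distinguish "not reported" from a genuine 100% miss rate.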
Summary
Re-submit of #1666, rebased onto current main (v0.8.45) after the CodeWhale rebrand.
Add accumulated session token tracking with input / cache-hit / output breakdown:
- SessionState: new total_input_tokens, total_cache_hit_tokens, total_cache_miss_tokens, total_output_tokens fields
- Turn outcome handler: accumulate the per-turn token breakdown from the usage struct
- Footer chip: 12K in · 8.1K cch · 2.5K out
Verified
- cargo fmt --check ✓
- cargo check ✓
- cargo clippy ✓ (no new warnings)
- cargo test --workspace --all-features: 3329 passed; 2 pre-existing failures unrelated to this change
Greptile Summary
This PR adds accumulated session token tracking (input / cache-hit / cache-miss / output) to the TUI footer chip and /status command. It is a rebase of #1666 onto the CodeWhale-branded main.
- SessionState gains four u32 accumulator fields and a reset_token_breakdown helper that is called consistently from /clear, /load, and apply_loaded_session, keeping all token-reset paths in sync.
- The TurnOutcome handler accumulates per-turn input and output tokens unconditionally; cache telemetry is gated on prompt_cache_hit_tokens being Some, which avoids inflating cache-miss totals when the provider doesn't report cache data.
- A new footer chip (StatusItem::Tokens, enabled by default) renders a compact N in · N cch · N out label; the chip is hidden until the first turn completes. The /status command gets three new rows showing the same breakdown in raw counts.
Confidence Score: 5/5
Safe to merge; all accumulation, reset, and display paths are correctly wired and symmetric.
The accumulation logic is straightforward: four counters incremented per turn, reset in all three load/clear paths, displayed in the footer chip and /status. The cache-telemetry guard (if let Some(hit_tokens)) keeps cache-miss totals from being inflated when the provider omits cache fields. Config enum and UI mappings are complete with no missing arms. The only gap is the missing test assertions for the new /status rows.
The status_report_includes_runtime_fields test in crates/tui/src/commands/status.rs would benefit from asserting the new Session input/cache/output rows.
Important Files Changed
- Adds four u32 accumulator fields and a reset_token_breakdown helper to SessionState; defaults are zero and the Default impl is updated correctly.
- Accumulates per-turn tokens in the TurnOutcome handler; cache telemetry is gated on prompt_cache_hit_tokens being Some, avoiding inflated cache-miss totals. Also calls reset_token_breakdown inside apply_loaded_session.
- Adds footer_session_tokens_spans and wires it into the Tokens chip slot; chip is hidden when both input and output are zero, consistent with other optional chips.
- Adds three new rows to /status output (Session input/cache/output); existing test does not assert the new rows, so a regression there would go undetected.
- Calls reset_token_breakdown in the /clear handler alongside the existing token-counter resets; correctly symmetric with the load and session-select paths.
- Calls reset_token_breakdown in the /load handler; comment correctly explains that the breakdown is per-runtime-session, not persisted.
- Adds StatusItem::Tokens to the enum and all five required match arms (key, label, description, defaults, all); no arms are missing.
- Maps StatusItem::Tokens into StatusItemValue with both From impls updated symmetrically.
Sequence Diagram
sequenceDiagram
    participant Engine
    participant ui.rs as run_event_loop
    participant SessionState
    participant FooterUI as footer_ui.rs
    participant StatusCmd as /status
    Engine->>ui.rs: "TurnOutcome { usage }"
    ui.rs->>SessionState: "total_input_tokens += usage.input_tokens"
    ui.rs->>SessionState: "total_output_tokens += usage.output_tokens"
    alt prompt_cache_hit_tokens is Some(hit)
        ui.rs->>SessionState: "total_cache_hit_tokens += hit"
        ui.rs->>SessionState: "total_cache_miss_tokens += miss (or input-hit)"
    end
    Note over ui.rs,SessionState: /clear or /load or apply_loaded_session
    ui.rs->>SessionState: reset_token_breakdown()
    SessionState-->>FooterUI: total_input/cache_hit/output_tokens
    FooterUI-->>FooterUI: render N in · N cch · N out chip
    SessionState-->>StatusCmd: total_input/cache_hit/cache_miss/output_tokens
    StatusCmd-->>StatusCmd: render Session input/cache/output rows
Reviews (2): Last reviewed commit: "fix: address review feedback — current_s..."
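The summary above notes that the /status test does not yet assert the new rows. One way to make that easy to test is to build the rows in a pure function, sketched here under assumptions: `session_status_rows` is a hypothetical helper, and the "Session input:" and "Session output:" labels are guesses (only "Session cache:" appears in the reviewed code).

```rust
/// Hypothetical pure helper returning (label, value) pairs for the
/// session token rows of a status report.
fn session_status_rows(
    input: u32,
    cache_hit: u32,
    cache_miss: u32,
    output: u32,
) -> Vec<(String, String)> {
    // No cache telemetry recorded: say so instead of printing "0 hit / 0 miss".
    let cache = if cache_hit == 0 && cache_miss == 0 {
        "not reported".to_string()
    } else {
        format!("{cache_hit} hit / {cache_miss} miss")
    };
    vec![
        ("Session input:".to_string(), input.to_string()),
        ("Session cache:".to_string(), cache),
        ("Session output:".to_string(), output.to_string()),
    ]
}
```

A test can then assert on the returned pairs directly, covering both the reported and the not-reported case without rendering the full report.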