Problem
`Agent::estimate_tokens()` calls `message_char_count()`, which uses `serde_json::to_string()` on tool input `Value` objects to measure their character length. This re-serialization happens on every iteration of the chat loop, before each provider call.
Impact
For conversations with many tool calls (each containing JSON arguments and results), this creates unnecessary allocation and serialization overhead that grows with conversation length.
Proposed Fix
Options (in order of preference):
- Cache the char count per message: store `cached_char_count: Option<usize>` on each message. Only compute once; messages are immutable after creation.
- Track a running total: maintain a running character count that is incremented when messages are added. Reset on compaction.
- Use `serde_json::to_writer` with a counting writer: avoids allocating the string entirely — just counts bytes.
Option 2 is the simplest and most efficient.
Related
PR #261 — Wire config options, context compaction, and auto-index on startup