## feat: optional session archive compaction (rag-doll) #63
Open
giveen wants to merge 19 commits into
Implements all 8 phases of the archive compaction plan from rag_md.txt:

Phase 1 - Config feature gating:
- Add ArchiveCompactionConfig struct with full validation
- IsArchiveCompactionEnabled(), ArchiveCompactionSettings(), ArchiveCompactionDefaultsApplied(), ValidateArchiveCompaction()

Phase 2 - Archive persistence (internal/archive):
- SessionArchive, ArchiveChunk, ArchivedMessage types
- Atomic write (temp→chmod 0600→rename), Load/Save/DeleteFiles
- Reconstruct() for lossless active-history recovery

Phase 3 - Compaction strategy:
- Compact(): threshold check, keepRecent trim, dedup by hash, chunk accumulation, two-file atomic commit
- ReconcileOnStartup(): hash-dedup merge of in-flight archives
- Lock files with stale detection (ModTime + PID liveness)

Phase 4 - Search engine:
- Lazy in-memory keyword index, dirty-flag rebuild after compaction
- Scoring: +10 exact match, +3/token content, +2 tool meta, +1 role
- ReasoningContent deliberately excluded from index

Phase 5 - Archive tools:
- search_session_archive and retrieve_archived_message tools
- ArchiveSubsystem, RegisterArchiveTools()
- Safety header always prepended to retrieved content
- Max 32 KiB payload, max 20 refs per call

Phase 6 - Orchestrator pre-run hook:
- runArchivePreHook() wired into Execute() and run() (fail-open)
- Compacts history, persists result, updates session meta counters
- Lazily registers archive tools into session registry

Phase 7 - CLI bootstrap:
- Log archive compaction config at startup when enabled

Phase 8 - Observability & safety:
- Permission check on archive file (warn if not 0600)
- Search index warm latency logged
- Session meta: CompactionCount, ArchivedMessageCount, LastCompactionAt

Bug fix - Unix safe command whitelist:
- Add whitelistedUnixCommands in ast_bridge.go (ls, grep, cat, etc.)
- PolicyEngine.allCommandsAllowlisted: nil flag set = all flags OK
- Fixes TestBashTool_RequiresConfirmation (all 23 sub-tests now pass)

Tests: 40 archive, 14 tool, 8 config, 3 orchestrator hook tests added
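The "temp→chmod 0600→rename" write from Phase 2 can be sketched roughly as follows. This is a minimal illustration and not the code in internal/archive; the names writeFileAtomic and roundTrip, and the file name used in the demo, are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic (hypothetical name) writes data to a temp file in the same
// directory as path, restricts it to 0600, then renames it over path so that
// readers see either the old file or the complete new one.
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	// Best-effort cleanup. After a successful rename the temp name no longer
	// exists, so this Remove fails and the error is ignored.
	defer os.Remove(tmpName)

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	if err := os.Chmod(tmpName, 0o600); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// roundTrip writes data into a fresh temp directory and reads it back.
func roundTrip(data []byte) (string, error) {
	dir, err := os.MkdirTemp("", "archive-demo-*")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "session.archive.json")
	if err := writeFileAtomic(path, data); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := roundTrip([]byte(`{"chunks":[]}`))
	fmt.Println(got, err)
}
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is what makes the final step atomic on POSIX systems.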
- Redirect log output to ~/.local/share/late/late.log to prevent archive diagnostics from bleeding into the TUI
- Add consecutive tool-call repetition guard (max 4) to RunLoop to abort infinite subagent loops early
- Inject synthetic user notice into active history after compaction so the model knows to use search_session_archive
- Add Session Archive section to both subagent system prompts (instruction-coding.md, instruction-planning.md) so agents know the archive tools exist and when to use them
When finish_reason=="length" (context window full), RunLoop now
calls an optional onContextOverflow callback before returning an error.
The BaseOrchestrator wires this to forceCompact(), which:
- Ignores the normal threshold and force-archives all but the
most recent keep_recent_messages messages
- Injects a synthetic notice so the model knows to use
search_session_archive to recover older context
- Updates o.sess.History and persists to disk
- Registers archive tools if not already registered
- Returns true to retry the failed turn
If archive compaction is disabled or fails, the original
'exceeds the available context size' error is still returned.
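The overflow recovery path, including the retry cap added in a later audit pass (maxOverflowRetries=3), might look roughly like the sketch below. It is a simplified stand-in for RunLoop: runTurn and its signature are assumptions, and the model call is reduced to a function that returns only the finish reason.

```go
package main

import (
	"errors"
	"fmt"
)

// maxOverflowRetries mirrors the cap described in the PR.
const maxOverflowRetries = 3

var errContextOverflow = errors.New("exceeds the available context size")

// runTurn (hypothetical) repeats one turn while the model reports
// finish_reason == "length" and the overflow callback says compaction worked.
// It returns how many times the model was called.
func runTurn(call func() string, onContextOverflow func() bool) (calls int, err error) {
	retries := 0
	for {
		calls++
		if call() != "length" {
			return calls, nil
		}
		// No callback wired up: archive compaction is disabled.
		if onContextOverflow == nil {
			return calls, errContextOverflow
		}
		// Cap reached: the kept messages are themselves too large.
		if retries >= maxOverflowRetries {
			return calls, fmt.Errorf("context still full after %d compaction retries: %w",
				retries, errContextOverflow)
		}
		// Callback returned false: compaction failed or archived nothing.
		if !onContextOverflow() {
			return calls, errContextOverflow
		}
		retries++
	}
}

func main() {
	n := 0
	call := func() string {
		n++
		if n == 1 {
			return "length" // first attempt overflows
		}
		return "stop"
	}
	calls, err := runTurn(call, func() bool { return true })
	fmt.Println(calls, err)
}
```

Checking the cap before invoking the callback avoids one pointless compaction when the loop is about to give up anyway.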
Keep the last 8 tool-call signatures in a sliding window. If any single signature appears 3+ times within that window the run loop is terminated with a diagnostic message. This catches alternating-pair and short-cycle loops that the existing consecutive-repeat guard misses (which only fires when the exact same signature repeats back-to-back 4+ times).
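A minimal sketch of the sliding-window check described above, assuming signatures are plain strings; cycleGuard, observe, and firstTrip are illustrative names and not the identifiers used in executor.go.

```go
package main

import "fmt"

const (
	windowSize = 8 // last N tool-call signatures kept
	maxRepeats = 3 // a signature seen this often in the window trips the guard
)

// cycleGuard (hypothetical) remembers the most recent tool-call signatures.
type cycleGuard struct {
	window []string
}

// observe records sig and reports whether it now appears maxRepeats or more
// times within the window. Only the new signature needs counting: counts of
// the others cannot have grown since they were last checked.
func (g *cycleGuard) observe(sig string) bool {
	g.window = append(g.window, sig)
	if len(g.window) > windowSize {
		g.window = g.window[1:]
	}
	count := 0
	for _, s := range g.window {
		if s == sig {
			count++
		}
	}
	return count >= maxRepeats
}

// firstTrip returns the index of the call that trips the guard, or -1.
func firstTrip(sigs []string) int {
	var g cycleGuard
	for i, s := range sigs {
		if g.observe(s) {
			return i
		}
	}
	return -1
}

func main() {
	// An alternating pair never repeats back-to-back, so a consecutive-repeat
	// guard would not fire, but the window check does.
	fmt.Println(firstTrip([]string{"read:a", "grep:b", "read:a", "grep:b", "read:a"}))
}
```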
Add GetArchiveSubsystem() and GetArchiveSearchSettings() accessors to BaseOrchestrator so the parent's loaded archive can be passed to child agents without re-loading from disk. In NewSubagentOrchestrator, after the child orchestrator is built, if the parent is a *BaseOrchestrator with a live archive subsystem, register search_session_archive and retrieve_archived_message on the child's session registry pointing at the parent's archive. This means a spawned subagent can search the parent's compacted history to recover earlier decisions, file contents, or instructions that were archived before the subagent was spawned.
1. forceCompact: add missing ChunkSize to CompactionConfig.
Without it cfg.ChunkSize==0 caused an infinite loop in Compact's
'for start += ChunkSize' loop when context overflow recovery fired.
2. runArchivePreHook warmup: replace svc.Search("") no-op with a
proper svc.WarmUp() method that actually builds the index.
Empty-query guard in Search() meant the index was never built at
startup, making the 'search index ready in Xns' log meaningless.
3. ReconcileOnStartup: call archive.ReconcileOnStartup on first
archive load inside runArchivePreHook. Detects and removes
messages duplicated between the archive and active history —
which can occur after a crash between the two atomic renames
in Compact. Previously this function existed and had tests but
was never actually called.
4. planner subagent type: wire up instruction-planning.md as the
system prompt for a new 'planner' agent_type in
NewSubagentOrchestrator. spawn_subagent now accepts
enum=["coder","planner"]. Planner inherits the read-only tool
subset (isPlanning=true). The planning prompt was already
updated with session archive awareness in a prior commit.
1. session delete: call archive.DeleteFiles() so .archive.json and .archive.lock are cleaned up when a session is deleted. Previously these orphaned files accumulated silently.
2. ReconcileOnStartup: corrected the log messages to be accurate: duplicates are kept in active history (they are deduplicated on the next compaction pass, not removed immediately). The original 'removed N duplicate messages' log was incorrect.
3. session list -v: display archive stats (compaction count, archived message count, last compaction time) in verbose mode when the session has been compacted at least once.
4. forceCompact: reuse already-loaded archive and config from o.archiveSub when available, falling back to disk reads only when the archiveSub hasn't been initialised yet. Avoids redundant config and archive file reads on every context overflow recovery.
5. search case-sensitive toolMeta: store rawToolMeta (original casing) alongside the lowercased toolMeta in indexedEntry, and use it when caseSensitive=true. Previously case-sensitive search over tool call names and results always matched against lowercase text.
1. forceCompact: update ArchiveSubsystem in-place rather than replacing the pointer. Registered tools (search_session_archive, retrieve_archived_message) hold a pointer to *ArchiveSubsystem. Replacing o.archiveSub with a new struct left the tools pointing at the stale pre-compaction archive, meaning any archive search immediately after emergency compaction would miss the newly archived messages entirely.
2. RunLoop: add maxOverflowRetries=3 cap on consecutive overflow/compact cycles. Previously, if the keep-recent messages were themselves too large for the context window, forceCompact kept returning true, i-- kept preventing the turn counter from advancing, and the loop ran forever. After 3 failed retries the loop now returns a descriptive error.
3. Compact: set ArchiveGeneration before the initial atomic write rather than in a second Save() call after both renames. Eliminates the crash window where the archive had new chunk IDs (stamped with newGeneration) but the archive_generation field on disk still showed the old value. The second Save call is removed.
4. search_session_archive: cap the model-supplied max_results at the configured maximum. Previously the model could request an unbounded number of results, bypassing the configured cap and potentially producing an oversized response payload.
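The stale-pointer problem in item 1 is easy to reproduce in isolation. The sketch below uses stand-in types (archiveSubsystem, searchTool, orchestrator) with a single counter field; the real structs are larger and these names are assumptions.

```go
package main

import "fmt"

// archiveSubsystem is a stand-in for the real subsystem state.
type archiveSubsystem struct {
	archivedCount int
}

// searchTool captures the pointer once, at registration time.
type searchTool struct {
	sub *archiveSubsystem
}

type orchestrator struct {
	archiveSub *archiveSubsystem
}

// replacePointer is the buggy pattern: the orchestrator now points at a new
// struct, but tools registered earlier still hold the old one.
func (o *orchestrator) replacePointer(n int) {
	o.archiveSub = &archiveSubsystem{archivedCount: n}
}

// updateInPlace overwrites the pointed-to value, so every holder of the
// original pointer observes the new state.
func (o *orchestrator) updateInPlace(n int) {
	*o.archiveSub = archiveSubsystem{archivedCount: n}
}

func main() {
	o := &orchestrator{archiveSub: &archiveSubsystem{archivedCount: 10}}
	tool := searchTool{sub: o.archiveSub}

	o.updateInPlace(25)
	fmt.Println("after in-place update:", tool.sub.archivedCount)

	o.replacePointer(40)
	fmt.Println("after pointer swap:", tool.sub.archivedCount)
}
```

If the real struct embeds a mutex, copying the whole value as above would be unsafe; updating individual fields while holding the lock achieves the same effect.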
1. runArchivePreHook: apply same in-place update fix as forceCompact (pass 3). Every invocation was replacing o.archiveSub with a new struct, causing the already-registered tools (search_session_archive, retrieve_archived_message) to hold a pointer to the first-ever ArchiveSubsystem. After the second threshold compaction fired via the pre-hook, all archive searches continued searching the archive from compaction #1 while the actual session archive had grown to compaction #2 and beyond. Also avoids replacing the search index on NoOp turns: the existing warmed index stays in place when nothing was compacted.
2. forceCompact: update session meta counters after emergency compaction so that 'late session list -v' archive stats (CompactionCount, ArchivedMessageCount, LastCompactionAt) reflect emergency compactions, not just pre-hook ones.
3. Compact: add defensive ChunkSize <= 0 guard. ChunkSize=0 caused an infinite loop in the chunk iteration (start += 0 never advances). Config defaults are applied at the call sites, but the function itself was unguarded.
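The ChunkSize guard in item 3 can be illustrated with a small chunking helper. This is a sketch only; chunkBounds is a hypothetical function, and whether Compact returns an error or falls back to a default when the size is invalid is not stated in the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkBounds (hypothetical) splits n messages into [start, end) ranges of at
// most chunkSize items each.
func chunkBounds(n, chunkSize int) ([][2]int, error) {
	// Without this guard, chunkSize == 0 makes "start += chunkSize" a no-op
	// and the loop below never terminates.
	if chunkSize <= 0 {
		return nil, errors.New("chunk size must be positive")
	}
	var out [][2]int
	for start := 0; start < n; start += chunkSize {
		end := start + chunkSize
		if end > n {
			end = n
		}
		out = append(out, [2]int{start, end})
	}
	return out, nil
}

func main() {
	fmt.Println(chunkBounds(5, 2))
	fmt.Println(chunkBounds(5, 0))
}
```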
1. forceCompact: check res.LockHeld before injecting compaction notice. When another process holds the archive lock, Compact returns LockHeld=true with NoOp=false, which is not caught by the existing 'res.NoOp' guard. Code was falling through: appending a false '[System] N messages were moved' notice to history (with N=0 since nothing was archived), saving that spurious history, and returning true to tell RunLoop to retry the turn. Each overflow retry cycle consumed one overflowRetries slot with nothing actually compacted.
2. runArchivePreHook: same LockHeld fix. When the lock was held, the '!res.NoOp' guard evaluated to true (LockHeld!=NoOp), so the pre-hook was injecting the compaction notice and saving history even though zero messages had been archived. Also prevents pointlessly calling MarkDirty and swapping the search index when the archive was not modified.
3. runArchivePreHook: use RLock/RUnlock instead of Lock/Unlock when reading o.archiveSub at the start of the function. Write lock was used for a read-only access, unnecessarily blocking concurrent readers.
1. Compact: track NextSequence in local var; do not mutate input *SessionArchive.
Previously 'archive.NextSequence++' mutated the passed-in pointer inside the
chunk loop. 'newArchive := *archive' was only made after the loop completed,
so if either os.Rename call failed, the caller's in-memory Archive.NextSequence
was already advanced past the value on disk. Any subsequent compaction run
would assign duplicate sequence numbers to new messages.
Fix: introduce 'nextSeq := archive.NextSequence', increment that local var,
and set 'newArchive.NextSequence = nextSeq' on the copy only.
2. Load sessionID: use archive.BaseSessionID(histPath) not o.id in both
forceCompact and runArchivePreHook. The second argument to archive.Load is
only used when the archive file does not exist yet and a fresh archive must
be created. Using o.id ('main' for the default orchestrator) stored a wrong
session_id ('main') in every newly-created archive file instead of the actual
session token (e.g. 'session-20250501-abc123').
3. forceCompact: add svc.WarmUp() after emergency compaction. runArchivePreHook
already called WarmUp; forceCompact was inconsistent — the first archive search
after an emergency compaction would always incur a cold index-build penalty.
4. forceCompact: add retrieve_archived_message to the compaction notice injected
into history. The pre-hook notice already mentioned both tools; forceCompact
only mentioned search_session_archive, leaving the model unaware it could
fetch a specific message by reference after an emergency compaction.
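Item 1 above (track NextSequence in a local variable) can be sketched as follows. The types are reduced to the fields needed for the illustration, and the commit callback stands in for the two os.Rename calls; all names here are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// sessionArchive is a reduced stand-in for the real archive type.
type sessionArchive struct {
	NextSequence int
	Sequences    []int
}

// compact assigns sequence numbers to n new messages. It works on a copy and
// a local counter, so the caller's archive is untouched if commit fails.
func compact(archive *sessionArchive, n int, commit func(sessionArchive) error) (*sessionArchive, error) {
	nextSeq := archive.NextSequence // local counter, input is never mutated
	newArchive := *archive
	// Copy the slice so appends cannot write into the caller's backing array.
	newArchive.Sequences = append([]int(nil), archive.Sequences...)

	for i := 0; i < n; i++ {
		newArchive.Sequences = append(newArchive.Sequences, nextSeq)
		nextSeq++
	}
	newArchive.NextSequence = nextSeq

	if err := commit(newArchive); err != nil {
		return nil, err
	}
	return &newArchive, nil
}

// failingCommit simulates a rename that fails; okCommit simulates success.
func failingCommit(sessionArchive) error { return errors.New("rename failed") }
func okCommit(sessionArchive) error      { return nil }

func main() {
	a := &sessionArchive{NextSequence: 7}
	_, err := compact(a, 3, failingCommit)
	fmt.Println(a.NextSequence, err) // the in-memory counter still matches disk
}
```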
Messages with Role=user and Content starting with '[System]' are internal notices injected by the archive pre-hook and forceCompact to inform the model that history was compacted. There is no value in showing these to the user — they clutter the chat view with implementation details. Skip them during viewport rendering by appending an empty string to the render cache.
1. GenerateSessionMeta: skip [System] compaction notices for title/lastPrompt. After compaction the injected '[System] N messages were moved...' notice is the last user message in the active window. GenerateSessionMeta was picking it up as LastUserPrompt (and potentially as Title if it was the first user message in the kept window), corrupting 'late session list' output. Fix: skip user messages whose Content starts with '[System]' in both the forward (title) and backward (lastPrompt) scan loops.
2. UpdateSessionMetadata: preserve CreatedAt and archive counter fields. GenerateSessionMeta always returned CreatedAt=time.Now(), so every call to saveAndNotify (which is every message) overwrote the real session creation time with the current time. More critically, the archive counter fields (CompactionCount, ArchivedMessageCount, LastCompactionAt) set by the orchestrator's post-compaction block were being zeroed on the very next saveAndNotify call because GenerateSessionMeta has no access to those values. Fix: UpdateSessionMetadata now loads the existing on-disk meta and merges CreatedAt + archive counter fields before saving, so orchestrator-managed fields are preserved across message writes.
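The merge in item 2 amounts to carrying a few fields over from the on-disk metadata. A minimal sketch, assuming a reduced sessionMeta struct; mergeMeta and demoMerge are hypothetical names and the dates are made up for the demo.

```go
package main

import (
	"fmt"
	"time"
)

// sessionMeta is a reduced stand-in for the real SessionMeta.
type sessionMeta struct {
	Title                string
	CreatedAt            time.Time
	CompactionCount      int
	ArchivedMessageCount int
	LastCompactionAt     time.Time
}

// mergeMeta keeps freshly generated fields (such as Title) but preserves the
// creation time and the orchestrator-managed archive counters from disk.
func mergeMeta(existing *sessionMeta, fresh sessionMeta) sessionMeta {
	if existing == nil {
		return fresh // first save: nothing to preserve
	}
	if !existing.CreatedAt.IsZero() {
		fresh.CreatedAt = existing.CreatedAt
	}
	fresh.CompactionCount = existing.CompactionCount
	fresh.ArchivedMessageCount = existing.ArchivedMessageCount
	fresh.LastCompactionAt = existing.LastCompactionAt
	return fresh
}

// demoMerge builds an "on disk" record and a regenerated one, then merges.
func demoMerge() sessionMeta {
	existing := &sessionMeta{
		Title:                "old title",
		CreatedAt:            time.Date(2025, 5, 1, 9, 0, 0, 0, time.UTC),
		CompactionCount:      2,
		ArchivedMessageCount: 120,
		LastCompactionAt:     time.Date(2025, 5, 2, 9, 0, 0, 0, time.UTC),
	}
	fresh := sessionMeta{
		Title:     "new title",
		CreatedAt: time.Date(2025, 6, 1, 9, 0, 0, 0, time.UTC),
	}
	return mergeMeta(existing, fresh)
}

func main() {
	fmt.Printf("%+v\n", demoMerge())
}
```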
Adds 'late session prune' to clean up accumulated session files:

  late session prune --older-than <days>   delete sessions last updated > N days ago
  late session prune --keep-last <n>       keep only the N most recent sessions
  late session prune --dry-run             preview what would be deleted

Flags can be combined: --older-than runs first, then --keep-last trims whatever remains. Both the active history file and the associated archive / lock files are removed (via archive.DeleteFiles, same path as 'session delete').

Also updates quickstart doc with prune examples.
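The combined-flag ordering (--older-than first, then --keep-last on the remainder) could be implemented along these lines. This is a sketch of the selection step only; sessionInfo, selectForPrune, and the convention that 0 or a negative value means "flag not given" are assumptions, and the session IDs in the demo are invented.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// sessionInfo is a reduced stand-in for a listed session.
type sessionInfo struct {
	ID        string
	UpdatedAt time.Time
}

// selectForPrune returns the IDs that would be deleted. olderThanDays <= 0
// and keepLast < 0 mean the corresponding flag was not given (an assumption
// of this sketch). A dry run would print the result instead of deleting.
func selectForPrune(sessions []sessionInfo, now time.Time, olderThanDays, keepLast int) []string {
	// Work on a copy sorted newest first.
	sorted := append([]sessionInfo(nil), sessions...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].UpdatedAt.After(sorted[j].UpdatedAt)
	})

	var doomed []string
	remaining := sorted
	if olderThanDays > 0 {
		cutoff := now.AddDate(0, 0, -olderThanDays)
		remaining = nil
		for _, s := range sorted {
			if s.UpdatedAt.Before(cutoff) {
				doomed = append(doomed, s.ID)
			} else {
				remaining = append(remaining, s)
			}
		}
	}
	// keep-last trims whatever survived the age filter.
	if keepLast >= 0 && len(remaining) > keepLast {
		for _, s := range remaining[keepLast:] {
			doomed = append(doomed, s.ID)
		}
	}
	return doomed
}

// demoPrune runs the selection over four sessions aged 1, 5, 40 and 90 days.
func demoPrune(olderThanDays, keepLast int) []string {
	now := time.Date(2025, 6, 1, 0, 0, 0, 0, time.UTC)
	day := 24 * time.Hour
	sessions := []sessionInfo{
		{ID: "session-a", UpdatedAt: now.Add(-1 * day)},
		{ID: "session-b", UpdatedAt: now.Add(-5 * day)},
		{ID: "session-c", UpdatedAt: now.Add(-40 * day)},
		{ID: "session-d", UpdatedAt: now.Add(-90 * day)},
	}
	return selectForPrune(sessions, now, olderThanDays, keepLast)
}

func main() {
	fmt.Println(demoPrune(30, 1))
}
```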
Other tools sharing the same sessions directory (e.g. sast-*) write .meta.json files with different naming conventions. ListSessions was picking them all up. Now only entries whose ID starts with 'session-' are included, matching the naming format Late uses when creating new sessions (session-YYYYMMDD-HHMMSS).
…age stub

runArchivePreHook previously created a new SearchService and called WarmUp() unconditionally, including on every no-op pass (threshold not hit). On a no-op with firstInit=false the svc was never assigned, so the full archive re-index was thrown away. In a long session with hundreds of archived messages this caused a measurable full re-scan on every user message.

Fix: move svc creation and WarmUp into the branches where svc is actually assigned:
- firstInit=true → always build (first time tools are registered)
- firstInit=false && real compaction → rebuild and assign
- firstInit=false && no-op/lock-held → skip entirely

Also remove the dead 'tool.RegisterArchiveTools' reference in main.go. Go does not require function references for linker retention; the comment was misleading and the expression had no effect.
It's a pseudo-RAG system, basically.
Pull request overview
Adds an opt-in session archive compaction subsystem that trims long-running sessions into an on-disk archive and exposes archive search/retrieval tools to agents (including subagents), plus related CLI/TUI/metadata updates.
Changes:
- Introduces `internal/archive` (archive persistence, compaction, keyword search) and registers new tools `search_session_archive` / `retrieve_archived_message` when enabled.
- Hooks archive compaction into the orchestrator run loop (threshold-based pre-hook + emergency compaction on context overflow) and records compaction counters in session metadata/UI.
- Adds `late session prune` and expands subagent support with a new `planner` type + updated prompts.
Reviewed changes
Copilot reviewed 35 out of 36 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| internal/tui/view.go | Hides system-injected compaction notices in the TUI history rendering. |
| internal/tool/subagent.go | Allows spawning planner subagents via tool schema update. |
| internal/tool/powershell_analyzer.go | Removes legacy PowerShell command analyzer. |
| internal/tool/permissions.go | Removes legacy allow-list normalization helper. |
| internal/tool/implementations.go | Switches shell analyzer selection to always use AST analyzer. |
| internal/tool/implementations_cmd_test.go | Updates command tests after analyzer refactor/removals. |
| internal/tool/bash_analyzer.go | Removes legacy Bash analyzer. |
| internal/tool/bash_analyzer_test.go | Removes legacy Bash analyzer tests. |
| internal/tool/bash_analyzer_project_test.go | Removes legacy Bash allow-list parsing tests. |
| internal/tool/ast/shadow.go | Removes AST “shadow mode” implementation. |
| internal/tool/ast/policy.go | Updates allow-list semantics (nil flag-set means “allow all flags”). |
| internal/tool/ast/feature_flag.go | Removes AST rollout feature flags (shadow/enforcement). |
| internal/tool/ast_mode_test.go | Updates AST-mode tests to reflect always-AST behavior. |
| internal/tool/ast_bridge.go | Seeds built-in safe command allow-list into AST policy engine. |
| internal/tool/archive_tools.go | Adds archive search/retrieval tools and registration helper. |
| internal/tool/archive_tools_test.go | Adds unit tests for archive tools (availability, payload caps, safety header). |
| internal/tool/allowlist_parse.go | Adds new allow-list key/flag extraction using shell AST parsing. |
| internal/tool/allowlist_parse_test.go | Adds tests for allow-list parsing behavior. |
| internal/session/ttystyle.go | Displays archive compaction metadata in verbose session output. |
| internal/session/session.go | Skips compaction notices when generating session title/last prompt; preserves archive counters when updating meta. |
| internal/session/models.go | Adds compaction fields to SessionMeta; filters ListSessions to session- IDs only. |
| internal/orchestrator/base.go | Adds archive pre-hook + emergency compaction and registers archive tools; exposes archive subsystem to subagents. |
| internal/orchestrator/base_archive_test.go | Adds tests for archive pre-hook enable/disable and tool registration behavior. |
| internal/executor/executor.go | Adds context overflow retry hook + tool-call cycle detection; fixes indentation in tool-call execution gate. |
| internal/config/config.go | Adds archive compaction config block with defaults + validation helpers. |
| internal/config/config_test.go | Adds config tests for archive compaction defaults, parsing, and validation. |
| internal/assets/prompts/instruction-planning.md | Documents archive tools availability for planner subagents. |
| internal/assets/prompts/instruction-coding.md | Documents archive tools availability for coder subagents. |
| internal/archive/search.go | Implements in-memory keyword search index + scoring/sorting. |
| internal/archive/compaction.go | Implements compaction, locking, deduplication, atomic writes, and startup reconciliation. |
| internal/archive/archive.go | Implements archive schema, load/save, helpers, and reconstruction utilities. |
| internal/archive/archive_test.go | Adds extensive tests for archive persistence, compaction, locking, and search. |
| internal/agent/agent.go | Implements planner subagent prompts/toolsets; inherits parent archive tools. |
| docs/quickstart.md | Documents session pruning and archive compaction configuration. |
| cmd/late/main.go | Adds session prune command; deletes archive files on session delete; redirects logs to file. |
| .gitignore | Ignores rag_md.txt. |
- Use SystemNotice flag on ChatMessage instead of [System] string prefix
for filtering compaction notices in tui/view.go and session/session.go
- Remove 'find' and 'env' from Unix command whitelist in ast_bridge.go
('find' allows dangerous -exec flag; 'env' wraps arbitrary commands)
- Fix Windows whitelisted command seeding to use nil flag set (allow all
flags) instead of empty map[string]bool{} which denied everything
- Move processAlive to platform-specific files: process_unix.go uses
kill(pid,0) via syscall; process_windows.go stubs to true (relies on
StaleAfterSeconds for recovery)
- Replace O(n²) insertion sort in search.go with sort.Slice (O(n log n))
- Hash toolCallSig arguments with SHA-256 to avoid large allocations for
tools with kilobyte-scale payloads (write_file, etc.)
- Use pointer map for ArchivedMessage lookup in archive_tools.go to avoid
copying large Message.Content on every retrieval call
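Hashing the arguments gives every tool call a fixed-size signature regardless of payload size. A minimal sketch, assuming the signature is built from a tool name and a raw argument string; the "name:hex" layout is an illustration and may differ from what toolCallSig actually produces.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// toolCallSig (sketch) returns the tool name plus a SHA-256 digest of the
// arguments, so the repetition guards compare and store 64 hex characters
// instead of a potentially kilobyte-scale argument string.
func toolCallSig(name, args string) string {
	sum := sha256.Sum256([]byte(args))
	return name + ":" + hex.EncodeToString(sum[:])
}

func main() {
	a := toolCallSig("write_file", `{"path":"a.go","content":"package main"}`)
	b := toolCallSig("write_file", `{"path":"b.go","content":"package main"}`)
	fmt.Println(len(a), a == b)
}
```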
I will go over this after #62 is merged
feat: optional session archive compaction (rag-doll)
Adds a rolling session archive system that automatically moves older messages out of the active context window into a compressed on-disk archive, with keyword search and stable message retrieval. Fully opt-in — zero behaviour change without the config block.
What this does
When a session grows beyond a configured message threshold, Late moves the oldest messages into a `.archive.json` file sitting alongside the session history. The active history is trimmed to a recent window. The model is notified and given two new tools to work with historical context:
- `search_session_archive`: keyword search over all archived messages, returns ranked results with stable reference handles
- `retrieve_archived_message`: fetches full archived messages by reference handle

If the context window overflows mid-turn (the API returns a context-length error), an emergency compaction fires automatically regardless of the threshold, retries the turn, and keeps the session alive.
Subagents inherit the parent orchestrator's archive so they can search the full session history too.
Enabling it
Add to `~/.config/late/config.json` (Linux) / `~/Library/Application Support/late/config.json` (macOS):

Recommended presets:
New CLI commands
What is NOT changed
- … `archive_compaction.enabled: true` behave exactly as before

Correctness work
Seven audit passes were done after the initial implementation. Notable fixes:
- `Compact()`: sequence counter was mutating the caller's `*SessionArchive` in-place, causing duplicate sequence numbers on subsequent compactions
- `Load()` call sites were using the orchestrator ID (`"main"`) instead of the actual session ID derived from the history file path; new archives were being created with `session_id: "main"`
- `forceCompact` and `runArchivePreHook`: both injected compaction notices and saved history even when the lock was held by another process (`res.LockHeld` not checked)
- `UpdateSessionMetadata`: archive counters (`CompactionCount`, `ArchivedMessageCount`, `LastCompactionAt`) were being zeroed on every `saveAndNotify` call because `GenerateSessionMeta` has no access to those values; `CreatedAt` was also reset to `time.Now()` on every message
- `GenerateSessionMeta`: `[System]` compaction notices were being picked up as session title and last prompt in `session list` output
- `ListSessions` was picking up `.meta.json` files from other tools sharing the sessions directory; now filters to `session-` prefix only
- … `[System]` compaction notices in the chat view; now hidden

Contributor License Agreement (CLA)
To accept your code, we legally need you to agree to our CLA so we can maintain the project's Business Source License (BSL) and future open-source transitions.
… `x` between the brackets like this: `[x]`)