
## feat: optional session archive compaction (rag-doll) #63

Open
giveen wants to merge 19 commits into mlhher:main from giveen:rag-doll

Contributor

@giveen giveen commented May 6, 2026

⚠️ ONLY AFTER PR 62

feat: optional session archive compaction (rag-doll)

Adds a rolling session archive system that automatically moves older messages out of the active context window into a compressed on-disk archive, with keyword search and stable message retrieval. Fully opt-in — zero behaviour change without the config block.


What this does

When a session grows beyond a configured message threshold, Late moves the oldest messages into a .archive.json file sitting alongside the session history. The active history is trimmed to a recent window. The model is notified and given two new tools to work with historical context:

  • search_session_archive — keyword search over all archived messages, returns ranked results with stable reference handles
  • retrieve_archived_message — fetches full archived messages by reference handle

If the context window overflows mid-turn (the API returns a context-length error), an emergency compaction fires automatically regardless of the threshold, retries the turn, and keeps the session alive.

Subagents inherit the parent orchestrator's archive so they can search the full session history too.
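
For reviewers, the threshold-and-trim behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `Message` and `splitForCompaction` are hypothetical names, and the real `Compact()` also handles dedup, chunking, and locking.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for a chat history entry.
type Message struct {
	Role    string
	Content string
}

// splitForCompaction returns the messages to archive and the messages to keep.
// Nothing is archived until len(history) exceeds threshold. With force=true
// (emergency compaction) the threshold is ignored.
func splitForCompaction(history []Message, threshold, keepRecent int, force bool) (archived, kept []Message) {
	if keepRecent < 0 {
		keepRecent = 0
	}
	if (!force && len(history) <= threshold) || len(history) <= keepRecent {
		return nil, history
	}
	cut := len(history) - keepRecent
	return history[:cut], history[cut:]
}

func main() {
	history := make([]Message, 105)
	for i := range history {
		history[i] = Message{Role: "user", Content: fmt.Sprintf("message %d", i)}
	}
	archived, kept := splitForCompaction(history, 100, 20, false)
	fmt.Println(len(archived), len(kept)) // 85 20
}
```

With `force` set, the same split runs even when the history is below the threshold, which matches the emergency path in spirit.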


Enabling it

Add to ~/.config/late/config.json (Linux) / ~/Library/Application Support/late/config.json (macOS):

"archive_compaction": {
  "enabled": true,
  "compaction_threshold_messages": 100,
  "keep_recent_messages": 20
}
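
A minimal sketch of how a block like this might be parsed, assuming it sits at the top level of config.json. `ArchiveCompactionConfig` is named in the PR, but the field layout, the `fileConfig` wrapper, `parseArchiveConfig`, and the validation rule shown here are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ArchiveCompactionConfig mirrors the JSON keys shown above.
// The field layout is an assumption, not copied from the PR.
type ArchiveCompactionConfig struct {
	Enabled                     bool `json:"enabled"`
	CompactionThresholdMessages int  `json:"compaction_threshold_messages"`
	KeepRecentMessages          int  `json:"keep_recent_messages"`
}

// fileConfig is a hypothetical wrapper for the whole config file.
type fileConfig struct {
	ArchiveCompaction *ArchiveCompactionConfig `json:"archive_compaction"`
}

// parseArchiveConfig returns a disabled config when the block is absent,
// so installs without the block see no behaviour change.
func parseArchiveConfig(raw []byte) (ArchiveCompactionConfig, error) {
	var fc fileConfig
	if err := json.Unmarshal(raw, &fc); err != nil {
		return ArchiveCompactionConfig{}, err
	}
	if fc.ArchiveCompaction == nil {
		return ArchiveCompactionConfig{}, nil
	}
	c := *fc.ArchiveCompaction
	// Assumed rule: keeping more messages than the threshold makes no sense.
	if c.Enabled && c.KeepRecentMessages >= c.CompactionThresholdMessages {
		return c, errors.New("keep_recent_messages must be smaller than compaction_threshold_messages")
	}
	return c, nil
}

func main() {
	raw := []byte(`{"archive_compaction": {"enabled": true, "compaction_threshold_messages": 100, "keep_recent_messages": 20}}`)
	cfg, err := parseArchiveConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```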

Recommended presets:

| Context window | threshold | keep_recent | chunk_size | max_results |
| --- | --- | --- | --- | --- |
| 64k | 80 | 20 | 40 | 15 |
| 128k | 160 | 30 | 60 | 20 |

New CLI commands

# Clean up old sessions
late session prune --older-than 30          # delete sessions not touched in 30+ days
late session prune --keep-last 20           # keep only the 20 most recent
late session prune --older-than 14 --keep-last 10 --dry-run  # preview

What is NOT changed

  • Sessions without archive_compaction.enabled: true behave exactly as before
  • No embeddings, no vector DB, no external dependencies — pure stdlib keyword search
  • Archive tools are not registered and not visible to the model when disabled

Correctness work

Seven audit passes were done after the initial implementation. Notable fixes:

  • Compact(): sequence counter was mutating the caller's *SessionArchive in-place, causing duplicate sequence numbers on subsequent compactions
  • Both Load() call sites were using the orchestrator ID ("main") instead of the actual session ID derived from the history file path — new archives were being created with session_id: "main"
  • forceCompact and runArchivePreHook: both injected compaction notices and saved history even when the lock was held by another process (res.LockHeld not checked)
  • UpdateSessionMetadata: archive counters (CompactionCount, ArchivedMessageCount, LastCompactionAt) were being zeroed on every saveAndNotify call because GenerateSessionMeta has no access to those values; CreatedAt was also reset to time.Now() on every message
  • GenerateSessionMeta: [System] compaction notices were being picked up as session title and last prompt in session list output
  • Search index was being rebuilt on every pre-hook call, including no-ops — now only rebuilds when compaction actually ran
  • ListSessions was picking up .meta.json files from other tools sharing the sessions directory; now filters to session- prefix only
  • TUI was rendering [System] compaction notices in the chat view; now hidden

Contributor License Agreement (CLA)

To accept your code, we legally need you to agree to our CLA so we can maintain the project's Business Source License (BSL) and future open-source transitions.

  • By checking this box, I confirm that I have read and agree to the terms of the CLA.md in this repository. (To check the box, put an x between the brackets like this: [x])

giveen added 18 commits May 6, 2026 15:10
Implements all 8 phases of the archive compaction plan from rag_md.txt:

Phase 1 - Config feature gating:
- Add ArchiveCompactionConfig struct with full validation
- IsArchiveCompactionEnabled(), ArchiveCompactionSettings(),
  ArchiveCompactionDefaultsApplied(), ValidateArchiveCompaction()

Phase 2 - Archive persistence (internal/archive):
- SessionArchive, ArchiveChunk, ArchivedMessage types
- Atomic write (temp→chmod 0600→rename), Load/Save/DeleteFiles
- Reconstruct() for lossless active-history recovery

Phase 3 - Compaction strategy:
- Compact(): threshold check, keepRecent trim, dedup by hash,
  chunk accumulation, two-file atomic commit
- ReconcileOnStartup(): hash-dedup merge of in-flight archives
- Lock files with stale detection (ModTime + PID liveness)

Phase 4 - Search engine:
- Lazy in-memory keyword index, dirty-flag rebuild after compaction
- Scoring: +10 exact match, +3/token content, +2 tool meta, +1 role
- ReasoningContent deliberately excluded from index
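
A rough sketch of the Phase 4 scoring weights. The commit message does not say how the query is tokenized or what counts as an "exact match", so this sketch assumes whitespace tokens, substring matching, and "exact" meaning the whole lowercased query appears in the content. `entry` and `score` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical indexed message. Reasoning content is left out,
// matching the note that it is excluded from the index.
type entry struct {
	Role, Content, ToolMeta string
}

// score applies the weights +10 exact, +3 per content token,
// +2 per tool-meta token, +1 for a role match.
func score(e entry, query string) int {
	q := strings.ToLower(strings.TrimSpace(query))
	if q == "" {
		return 0
	}
	content := strings.ToLower(e.Content)
	meta := strings.ToLower(e.ToolMeta)
	role := strings.ToLower(e.Role)

	s := 0
	if strings.Contains(content, q) {
		s += 10
	}
	for _, tok := range strings.Fields(q) {
		if strings.Contains(content, tok) {
			s += 3
		}
		if strings.Contains(meta, tok) {
			s += 2
		}
		if role == tok {
			s++
		}
	}
	return s
}

func main() {
	e := entry{Role: "assistant", Content: "Switched the cache to an LRU policy", ToolMeta: "write_file cache.go"}
	fmt.Println(score(e, "lru policy")) // 16: exact phrase (10) + two content tokens (3+3)
	fmt.Println(score(e, "cache"))      // 15: exact (10) + content token (3) + tool meta (2)
}
```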

Phase 5 - Archive tools:
- search_session_archive and retrieve_archived_message tools
- ArchiveSubsystem, RegisterArchiveTools()
- Safety header always prepended to retrieved content
- Max 32 KiB payload, max 20 refs per call

Phase 6 - Orchestrator pre-run hook:
- runArchivePreHook() wired into Execute() and run() (fail-open)
- Compacts history, persists result, updates session meta counters
- Lazily registers archive tools into session registry

Phase 7 - CLI bootstrap:
- Log archive compaction config at startup when enabled

Phase 8 - Observability & safety:
- Permission check on archive file (warn if not 0600)
- Search index warm latency logged
- Session meta: CompactionCount, ArchivedMessageCount, LastCompactionAt

Bug fix - Unix safe command whitelist:
- Add whitelistedUnixCommands in ast_bridge.go (ls, grep, cat, etc.)
- PolicyEngine.allCommandsAllowlisted: nil flag set = all flags OK
- Fixes TestBashTool_RequiresConfirmation (all 23 sub-tests now pass)

Tests: 40 archive, 14 tool, 8 config, 3 orchestrator hook tests added
- Redirect log output to ~/.local/share/late/late.log to prevent
  archive diagnostics from bleeding into the TUI
- Add consecutive tool-call repetition guard (max 4) to RunLoop
  to abort infinite subagent loops early
- Inject synthetic user notice into active history after compaction
  so the model knows to use search_session_archive
- Add Session Archive section to both subagent system prompts
  (instruction-coding.md, instruction-planning.md) so agents know
  the archive tools exist and when to use them
When finish_reason=="length" (context window full), RunLoop now
calls an optional onContextOverflow callback before returning an error.
The BaseOrchestrator wires this to forceCompact(), which:
  - Ignores the normal threshold and force-archives all but the
    most recent keep_recent_messages messages
  - Injects a synthetic notice so the model knows to use
    search_session_archive to recover older context
  - Updates o.sess.History and persists to disk
  - Registers archive tools if not already registered
  - Returns true to retry the failed turn

If archive compaction is disabled or fails, the original
'exceeds the available context size' error is still returned.
Keep the last 8 tool-call signatures in a sliding window.
If any single signature appears 3+ times within that window the
run loop is terminated with a diagnostic message.

This catches alternating-pair and short-cycle loops that the
existing consecutive-repeat guard misses (which only fires when
the exact same signature repeats back-to-back 4+ times).
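
The sliding-window guard described above might look roughly like this. `loopGuard` and `observe` are hypothetical names; in the PR the check lives inside RunLoop. Only the newly added signature needs counting, since no other signature's count can rise on this step.

```go
package main

import "fmt"

const (
	windowSize = 8 // number of recent tool-call signatures remembered
	maxRepeats = 3 // abort once a signature reaches this count within the window
)

// loopGuard keeps the most recent tool-call signatures.
type loopGuard struct {
	window []string
}

// observe records sig and reports whether the run should be aborted.
func (g *loopGuard) observe(sig string) bool {
	g.window = append(g.window, sig)
	if len(g.window) > windowSize {
		g.window = g.window[len(g.window)-windowSize:]
	}
	count := 0
	for _, s := range g.window {
		if s == sig {
			count++
		}
	}
	return count >= maxRepeats
}

func main() {
	g := &loopGuard{}
	calls := []string{"read:a", "read:b", "read:a", "read:b", "read:a"}
	for i, c := range calls {
		if g.observe(c) {
			fmt.Printf("loop detected at call %d (%s)\n", i+1, c)
			return
		}
	}
}
```

The alternating a/b pattern in `main` never repeats back-to-back, so a consecutive-repeat guard alone would not fire on it.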
Add GetArchiveSubsystem() and GetArchiveSearchSettings() accessors
to BaseOrchestrator so the parent's loaded archive can be passed to
child agents without re-loading from disk.

In NewSubagentOrchestrator, after the child orchestrator is built,
if the parent is a *BaseOrchestrator with a live archive subsystem,
register search_session_archive and retrieve_archived_message on the
child's session registry pointing at the parent's archive.

This means a spawned subagent can search the parent's compacted
history to recover earlier decisions, file contents, or instructions
that were archived before the subagent was spawned.
1. forceCompact: add missing ChunkSize to CompactionConfig.
   Without it cfg.ChunkSize==0 caused an infinite loop in Compact's
   'for start += ChunkSize' loop when context overflow recovery fired.

2. runArchivePreHook warmup: replace svc.Search("") no-op with a
   proper svc.WarmUp() method that actually builds the index.
   Empty-query guard in Search() meant the index was never built at
   startup, making the 'search index ready in Xns' log meaningless.

3. ReconcileOnStartup: call archive.ReconcileOnStartup on first
   archive load inside runArchivePreHook. Detects and removes
   messages duplicated between the archive and active history —
   which can occur after a crash between the two atomic renames
   in Compact. Previously this function existed and had tests but
   was never actually called.

4. planner subagent type: wire up instruction-planning.md as the
   system prompt for a new 'planner' agent_type in
   NewSubagentOrchestrator. spawn_subagent now accepts
   enum=["coder","planner"]. Planner inherits the read-only tool
   subset (isPlanning=true). The planning prompt was already
   updated with session archive awareness in a prior commit.
1. session delete: call archive.DeleteFiles() so .archive.json and
   .archive.lock are cleaned up when a session is deleted. Previously
   these orphaned files accumulated silently.

2. ReconcileOnStartup: corrected the log messages to be accurate —
   duplicates are kept in active history (they are deduplicated on the
   next compaction pass, not removed immediately). The original 'removed
   N duplicate messages' log was incorrect.

3. session list -v: display archive stats (compaction count, archived
   message count, last compaction time) in verbose mode when the session
   has been compacted at least once.

4. forceCompact: reuse already-loaded archive and config from
   o.archiveSub when available, falling back to disk reads only when
   the archiveSub hasn't been initialised yet. Avoids redundant config
   and archive file reads on every context overflow recovery.

5. search case-sensitive toolMeta: store rawToolMeta (original casing)
   alongside the lowercased toolMeta in indexedEntry, and use it when
   caseSensitive=true. Previously case-sensitive search over tool call
   names and results always matched against lowercase text.
1. forceCompact: update ArchiveSubsystem in-place rather than replacing the
   pointer. Registered tools (search_session_archive, retrieve_archived_message)
   hold a pointer to *ArchiveSubsystem — replacing o.archiveSub with a new
   struct left the tools pointing at the stale pre-compaction archive, meaning
   any archive search immediately after emergency compaction would miss the
   newly archived messages entirely.

2. RunLoop: add maxOverflowRetries=3 cap on consecutive overflow/compact
   cycles. Previously, if the keep-recent messages were themselves too large
   for the context window, forceCompact kept returning true, i-- kept
   preventing the turn counter from advancing, and the loop ran forever.
   After 3 failed retries the loop now returns a descriptive error.

3. Compact: set ArchiveGeneration before the initial atomic write rather than
   in a second Save() call after both renames. Eliminates the crash window
   where the archive had new chunk IDs (stamped with newGeneration) but the
   archive_generation field on disk still showed the old value. The second
   Save call is removed.

4. search_session_archive: cap the model-supplied max_results at the
   configured maximum. Previously the model could request an unbounded number
   of results, bypassing the configured cap and potentially producing an
   oversized response payload.
1. runArchivePreHook: apply same in-place update fix as forceCompact (pass 3).
   Every invocation was replacing o.archiveSub with a new struct, causing the
   already-registered tools (search_session_archive, retrieve_archived_message)
   to hold a pointer to the first-ever ArchiveSubsystem. After the second
   threshold compaction fired via the pre-hook, all archive searches continued
   searching the archive from compaction #1 while the actual session archive had
   grown to compaction #2 and beyond.
   Also avoids replacing the search index on NoOp turns — the existing warmed
   index stays in place when nothing was compacted.

2. forceCompact: update session meta counters after emergency compaction so that
   'late session list -v' archive stats (CompactionCount, ArchivedMessageCount,
   LastCompactionAt) reflect emergency compactions, not just pre-hook ones.

3. Compact: add defensive ChunkSize <= 0 guard. ChunkSize=0 caused an infinite
   loop in the chunk iteration (start += 0 never advances). Config defaults are
   applied at the call sites, but the function itself was unguarded.
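
To illustrate item 3 above, a guarded chunk loop could be written like this. `chunkRanges` is a hypothetical helper. The real Compact() works on messages rather than index pairs, and whether it returns an error or falls back to a default is not stated in the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkRanges returns [start, end) index pairs covering n messages.
func chunkRanges(n, chunkSize int) ([][2]int, error) {
	if chunkSize <= 0 {
		// Without this guard, start += chunkSize never advances.
		return nil, errors.New("chunk size must be positive")
	}
	var out [][2]int
	for start := 0; start < n; start += chunkSize {
		end := start + chunkSize
		if end > n {
			end = n
		}
		out = append(out, [2]int{start, end})
	}
	return out, nil
}

func main() {
	ranges, err := chunkRanges(100, 40)
	fmt.Println(ranges, err) // [[0 40] [40 80] [80 100]] <nil>

	_, err = chunkRanges(100, 0)
	fmt.Println(err) // chunk size must be positive
}
```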
1. forceCompact: check res.LockHeld before injecting compaction notice.
   When another process holds the archive lock, Compact returns LockHeld=true
   with NoOp=false — not caught by the existing 'res.NoOp' guard. Code was
   falling through: appending a false '[System] N messages were moved' notice
   to history (with N=0 since nothing was archived), saving that spurious
   history, and returning true to tell RunLoop to retry the turn. Each overflow
   retry cycle consumed one overflowRetries slot with nothing actually compacted.

2. runArchivePreHook: same LockHeld fix. When the lock was held, the
   '!res.NoOp' guard evaluated to true (LockHeld!=NoOp), so the pre-hook
   was injecting the compaction notice and saving history even though zero
   messages had been archived. Also prevents pointlessly calling MarkDirty
   and swapping the search index when the archive was not modified.

3. runArchivePreHook: use RLock/RUnlock instead of Lock/Unlock when reading
   o.archiveSub at the start of the function. Write lock was used for a
   read-only access, unnecessarily blocking concurrent readers.
1. Compact: track NextSequence in local var; do not mutate input *SessionArchive.
   Previously 'archive.NextSequence++' mutated the passed-in pointer inside the
   chunk loop. 'newArchive := *archive' was only made after the loop completed,
   so if either os.Rename call failed, the caller's in-memory Archive.NextSequence
   was already advanced past the value on disk. Any subsequent compaction run
   would assign duplicate sequence numbers to new messages.
   Fix: introduce 'nextSeq := archive.NextSequence', increment that local var,
   and set 'newArchive.NextSequence = nextSeq' on the copy only.

2. Load sessionID: use archive.BaseSessionID(histPath) not o.id in both
   forceCompact and runArchivePreHook. The second argument to archive.Load is
   only used when the archive file does not exist yet and a fresh archive must
   be created. Using o.id ('main' for the default orchestrator) stored a wrong
   session_id ('main') in every newly-created archive file instead of the actual
   session token (e.g. 'session-20250501-abc123').

3. forceCompact: add svc.WarmUp() after emergency compaction. runArchivePreHook
   already called WarmUp; forceCompact was inconsistent — the first archive search
   after an emergency compaction would always incur a cold index-build penalty.

4. forceCompact: add retrieve_archived_message to the compaction notice injected
   into history. The pre-hook notice already mentioned both tools; forceCompact
   only mentioned search_session_archive, leaving the model unaware it could
   fetch a specific message by reference after an emergency compaction.
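
The pattern behind item 1 above (count in a local variable, write the result only to the copy) can be shown in isolation. `SessionArchive` is reduced to one field here and `assignSequences` is a hypothetical helper, not a function from the PR.

```go
package main

import "fmt"

// SessionArchive is reduced to the one field relevant to this fix.
type SessionArchive struct {
	NextSequence int
}

// assignSequences returns an updated copy plus the sequence numbers handed out.
// The caller's archive is never modified, so a failed commit leaves it in sync
// with what is on disk.
func assignSequences(archive *SessionArchive, n int) (SessionArchive, []int) {
	nextSeq := archive.NextSequence // local counter instead of archive.NextSequence++
	seqs := make([]int, 0, n)
	for i := 0; i < n; i++ {
		seqs = append(seqs, nextSeq)
		nextSeq++
	}
	newArchive := *archive
	newArchive.NextSequence = nextSeq
	return newArchive, seqs
}

func main() {
	onDisk := &SessionArchive{NextSequence: 7}
	updated, seqs := assignSequences(onDisk, 3)
	fmt.Println(seqs, updated.NextSequence, onDisk.NextSequence) // [7 8 9] 10 7
}
```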
Messages with Role=user and Content starting with '[System]' are internal
notices injected by the archive pre-hook and forceCompact to inform the model
that history was compacted. There is no value in showing these to the user —
they clutter the chat view with implementation details. Skip them during
viewport rendering by appending an empty string to the render cache.
1. GenerateSessionMeta: skip [System] compaction notices for title/lastPrompt.
   After compaction the injected '[System] N messages were moved...' notice is
   the last user message in the active window. GenerateSessionMeta was picking
   it up as LastUserPrompt (and potentially as Title if it was the first user
   message in the kept window), corrupting 'late session list' output.
   Fix: skip user messages whose Content starts with '[System]' in both the
   forward (title) and backward (lastPrompt) scan loops.

2. UpdateSessionMetadata: preserve CreatedAt and archive counter fields.
   GenerateSessionMeta always returned CreatedAt=time.Now(), so every call to
   saveAndNotify (which is every message) overwrote the real session creation
   time with the current time. More critically, the archive counter fields
   (CompactionCount, ArchivedMessageCount, LastCompactionAt) set by the
   orchestrator's post-compaction block were being zeroed on the very next
   saveAndNotify call because GenerateSessionMeta has no access to those values.
   Fix: UpdateSessionMetadata now loads the existing on-disk meta and merges
   CreatedAt + archive counter fields before saving, so orchestrator-managed
   fields are preserved across message writes.
Adds 'late session prune' to clean up accumulated session files:

  late session prune --older-than <days>   delete sessions last updated > N days ago
  late session prune --keep-last <n>       keep only the N most recent sessions
  late session prune --dry-run             preview what would be deleted

Flags can be combined: --older-than runs first, then --keep-last trims
whatever remains. Both the active history file and the associated
archive / lock files are removed (via archive.DeleteFiles, same path as
'session delete'). Also updates quickstart doc with prune examples.
Other tools sharing the same sessions directory (e.g. sast-*) write
.meta.json files with different naming conventions. ListSessions was
picking them all up. Now only entries whose ID starts with 'session-'
are included, matching the naming format Late uses when creating new
sessions (session-YYYYMMDD-HHMMSS).
…age stub

runArchivePreHook previously created a new SearchService and called WarmUp()
unconditionally, including on every no-op pass (threshold not hit). On a
no-op with firstInit=false the svc was never assigned, so the full archive
re-index was thrown away. In a long session with hundreds of archived messages
this caused a measurable full re-scan on every user message.

Fix: move svc creation and WarmUp into the branches where svc is actually
assigned:
  - firstInit=true  → always build (first time tools are registered)
  - firstInit=false && real compaction → rebuild and assign
  - firstInit=false && no-op/lock-held → skip entirely

Also remove the dead 'tool.RegisterArchiveTools' reference in main.go.
Go does not require function references for linker retention; the comment
was misleading and the expression had no effect.
Copilot AI review requested due to automatic review settings May 6, 2026 23:53
Contributor Author

giveen commented May 6, 2026

It's a pseudo-RAG system, basically.


Copilot AI left a comment


Pull request overview

Adds an opt-in session archive compaction subsystem that trims long-running sessions into an on-disk archive and exposes archive search/retrieval tools to agents (including subagents), plus related CLI/TUI/metadata updates.

Changes:

  • Introduces internal/archive (archive persistence, compaction, keyword search) and registers new tools search_session_archive / retrieve_archived_message when enabled.
  • Hooks archive compaction into the orchestrator run loop (threshold-based pre-hook + emergency compaction on context overflow) and records compaction counters in session metadata/UI.
  • Adds late session prune and expands subagent support with a new planner type + updated prompts.

Reviewed changes

Copilot reviewed 35 out of 36 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| internal/tui/view.go | Hides system-injected compaction notices in the TUI history rendering. |
| internal/tool/subagent.go | Allows spawning planner subagents via tool schema update. |
| internal/tool/powershell_analyzer.go | Removes legacy PowerShell command analyzer. |
| internal/tool/permissions.go | Removes legacy allow-list normalization helper. |
| internal/tool/implementations.go | Switches shell analyzer selection to always use AST analyzer. |
| internal/tool/implementations_cmd_test.go | Updates command tests after analyzer refactor/removals. |
| internal/tool/bash_analyzer.go | Removes legacy Bash analyzer. |
| internal/tool/bash_analyzer_test.go | Removes legacy Bash analyzer tests. |
| internal/tool/bash_analyzer_project_test.go | Removes legacy Bash allow-list parsing tests. |
| internal/tool/ast/shadow.go | Removes AST “shadow mode” implementation. |
| internal/tool/ast/policy.go | Updates allow-list semantics (nil flag-set means “allow all flags”). |
| internal/tool/ast/feature_flag.go | Removes AST rollout feature flags (shadow/enforcement). |
| internal/tool/ast_mode_test.go | Updates AST-mode tests to reflect always-AST behavior. |
| internal/tool/ast_bridge.go | Seeds built-in safe command allow-list into AST policy engine. |
| internal/tool/archive_tools.go | Adds archive search/retrieval tools and registration helper. |
| internal/tool/archive_tools_test.go | Adds unit tests for archive tools (availability, payload caps, safety header). |
| internal/tool/allowlist_parse.go | Adds new allow-list key/flag extraction using shell AST parsing. |
| internal/tool/allowlist_parse_test.go | Adds tests for allow-list parsing behavior. |
| internal/session/ttystyle.go | Displays archive compaction metadata in verbose session output. |
| internal/session/session.go | Skips compaction notices when generating session title/last prompt; preserves archive counters when updating meta. |
| internal/session/models.go | Adds compaction fields to SessionMeta; filters ListSessions to session- IDs only. |
| internal/orchestrator/base.go | Adds archive pre-hook + emergency compaction and registers archive tools; exposes archive subsystem to subagents. |
| internal/orchestrator/base_archive_test.go | Adds tests for archive pre-hook enable/disable and tool registration behavior. |
| internal/executor/executor.go | Adds context overflow retry hook + tool-call cycle detection; fixes indentation in tool-call execution gate. |
| internal/config/config.go | Adds archive compaction config block with defaults + validation helpers. |
| internal/config/config_test.go | Adds config tests for archive compaction defaults, parsing, and validation. |
| internal/assets/prompts/instruction-planning.md | Documents archive tools availability for planner subagents. |
| internal/assets/prompts/instruction-coding.md | Documents archive tools availability for coder subagents. |
| internal/archive/search.go | Implements in-memory keyword search index + scoring/sorting. |
| internal/archive/compaction.go | Implements compaction, locking, deduplication, atomic writes, and startup reconciliation. |
| internal/archive/archive.go | Implements archive schema, load/save, helpers, and reconstruction utilities. |
| internal/archive/archive_test.go | Adds extensive tests for archive persistence, compaction, locking, and search. |
| internal/agent/agent.go | Implements planner subagent prompts/toolsets; inherits parent archive tools. |
| docs/quickstart.md | Documents session pruning and archive compaction configuration. |
| cmd/late/main.go | Adds session prune command; deletes archive files on session delete; redirects logs to file. |
| .gitignore | Ignores rag_md.txt. |


Comment thread internal/tui/view.go
Comment thread internal/tool/ast_bridge.go Outdated
Comment thread internal/tool/ast_bridge.go Outdated
Comment thread internal/archive/compaction.go Outdated
Comment thread internal/archive/search.go
Comment thread internal/executor/executor.go Outdated
Comment thread internal/session/session.go
Comment thread internal/tool/ast_bridge.go Outdated
Comment thread internal/tool/archive_tools.go Outdated
- Use SystemNotice flag on ChatMessage instead of [System] string prefix
  for filtering compaction notices in tui/view.go and session/session.go
- Remove 'find' and 'env' from Unix command whitelist in ast_bridge.go
  ('find' allows dangerous -exec flag; 'env' wraps arbitrary commands)
- Fix Windows whitelisted command seeding to use nil flag set (allow all
  flags) instead of empty map[string]bool{} which denied everything
- Move processAlive to platform-specific files: process_unix.go uses
  kill(pid,0) via syscall; process_windows.go stubs to true (relies on
  StaleAfterSeconds for recovery)
- Replace O(n²) insertion sort in search.go with sort.Slice (O(n log n))
- Hash toolCallSig arguments with SHA-256 to avoid large allocations for
  tools with kilobyte-scale payloads (write_file, etc.)
- Use pointer map for ArchivedMessage lookup in archive_tools.go to avoid
  copying large Message.Content on every retrieval call
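
As an illustration of the signature-hashing change in the list above, a fixed-size tool-call signature could be built like this. The exact format used in the PR is not shown, so the `name:hexdigest` layout is an assumption. `toolCallSig` is named in the commit message, but its signature here is a guess.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// toolCallSig builds a fixed-size signature from a tool name and its raw
// JSON arguments, so kilobyte-scale payloads are not kept in the window.
func toolCallSig(name, args string) string {
	sum := sha256.Sum256([]byte(args))
	return name + ":" + hex.EncodeToString(sum[:])
}

func main() {
	// A large write_file payload still yields a short signature.
	big := `{"path":"main.go","content":"` + strings.Repeat("x", 64*1024) + `"}`
	sig := toolCallSig("write_file", big)
	fmt.Println(len(big), len(sig))
}
```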
Owner

mlhher commented May 9, 2026

I will go over this after #62 is merged

3 participants