feat(gateway): persist durable prompt deltas #747
Merged
Codecov Results 📊
✅ Patch coverage is 86.88%. Project has 5707 uncovered lines.
Files with missing lines (6)

Coverage diff
@@ Coverage Diff @@
## main #PR +/-##
==========================================
+ Coverage 64.28% 64.43% +0.15%
==========================================
Files 103 103 —
Lines 15857 16045 +188
Branches 10967 11093 +126
==========================================
+ Hits 10193 10338 +145
- Misses 5664 5707 +43
- Partials 1247 1254 +7

Generated by Codecov Action
# Conflicts: # packages/gateway/test/cache-stability.e2e.test.ts
Problem
Knowledge/LTM can change materially while a session is already relying on a warm prompt-cache prefix. Rewriting `system[2]` to surface those changes destroys cache stability, but transient one-turn delta messages are also incorrect: the agent is stateless, so once those bytes disappear they are as if they never existed, and their disappearance changes the cached prefix.

Closes #740. This implements the knowledge-delta channel proposed in #740, but with a durable/persisted model instead of a volatile one-turn delta (a volatile delta would itself bust the cache when it disappears).
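A minimal sketch of the cache argument above, assuming the cache is keyed on an exact message prefix. The `Message` shape, the `extendsPrefix` helper, and the sample contents are hypothetical and are not taken from the gateway code; the sketch only contrasts a delta that vanishes after one turn with one that stays in place.

```typescript
// Hypothetical message shape; the real gateway payload is not shown in this PR description.
type Message = { role: "system" | "user" | "assistant"; content: string };

// True when `next` begins with exactly the messages of `prev`,
// i.e. a cache warmed by `prev` could still be reused for `next`.
function extendsPrefix(prev: Message[], next: Message[]): boolean {
  if (next.length < prev.length) return false;
  return prev.every((m, i) => {
    const n = next[i];
    return n !== undefined && n.role === m.role && n.content === m.content;
  });
}

const pinned: Message = { role: "system", content: "pinned knowledge" };
const u1: Message = { role: "user", content: "first question" };
const a1: Message = { role: "assistant", content: "first answer" };
const delta: Message = { role: "user", content: "knowledge changed (illustrative text)" };
const u2: Message = { role: "user", content: "second question" };
const a2: Message = { role: "assistant", content: "second answer" };
const u3: Message = { role: "user", content: "third question" };

// Turn 2 carries the delta near the conversation tail.
const turn2 = [pinned, u1, a1, delta, u2];
// Volatile model: the delta is gone on the next turn, so the old prefix no longer matches.
const turn3Volatile = [pinned, u1, a1, u2, a2, u3];
// Durable model: the delta stays where it was, so the next turn only appends.
const turn3Durable = [pinned, u1, a1, delta, u2, a2, u3];

console.log(extendsPrefix(turn2, turn3Volatile)); // false
console.log(extendsPrefix(turn2, turn3Durable)); // true
```

In this toy model the durable variant is the only one where each request is a pure extension of the previous one, which is the property the PR is trying to preserve.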
Ground rules
Design
Durable prompt deltas:
- […] `session_prompt_deltas` with the minimal persisted shape:
  - `session_id`
  - `seq`
  - `project_id`
  - `selector` (JSON)
  - `content` (exact JSON message payload)
- […] `system[2]` is pinned, keeps the old pinned `system[2]` bytes and appends a durable user-message prompt delta near the conversation tail.
- […] `system[2]` bytes and appends a durable `Superseded Long-term Knowledge` delta listing removed short IDs.
- […] `Additional Changed Knowledge (truncated)` rather than silently dropped.
- […] `ltm_cache_text`/`ltm_pin_text` bytes while advancing `ltm_pin_keys`, so restart does not rewrite `system[2]`.
- `moveSessions()` carries `session_prompt_deltas` to the target project.
- […] `ltm_delta_json` / pending-queue path from the gateway.

Design questions from #740

- […] `selector {target:"messages", insertAt}`).
- […] `Lore knowledge update`; removals listed under `Superseded Long-term Knowledge` so the model ignores stale pinned IDs.
- […] `[019eb...]`).

Verification

- […] `122 files`, `2999 passed | 6 skipped`.
- […] `script/build-binary-sea.ts` `ignoreNodeOptions` type error remains outside this PR.
- […] `system[1]`/`system[2]` bytes, and that durable deltas replay across a simulated gateway restart.
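For reviewers, the persisted shape listed under Design and the restart replay checked under Verification can be sketched roughly as below. This is an illustration under stated assumptions, not the gateway implementation: the column types, the `replayDeltas` helper, and the reading of `insertAt` as an index into the base conversation (clamped to its bounds) are all hypothetical. Only the field names and the `{target:"messages", insertAt}` selector come from this PR description.

```typescript
// Hypothetical message shape for illustration only.
type Message = { role: "system" | "user" | "assistant"; content: string };

// Selector fields as named in the PR description; the semantics of insertAt are assumed.
interface DeltaSelector {
  target: "messages";
  insertAt: number;
}

// Field names from the PR description; the TypeScript types are assumptions.
interface SessionPromptDeltaRow {
  session_id: string;
  seq: number;
  project_id: string;
  selector: string; // JSON-encoded DeltaSelector
  content: string; // exact JSON message payload, stored verbatim
}

// Rebuilds the message list from the base conversation plus persisted rows.
// Rows are applied in seq order, so the result does not depend on load order.
function replayDeltas(base: Message[], rows: SessionPromptDeltaRow[]): Message[] {
  const ordered = [...rows].sort((a, b) => a.seq - b.seq);
  const out: Message[] = [];
  for (let i = 0; i <= base.length; i++) {
    for (const row of ordered) {
      const sel = JSON.parse(row.selector) as DeltaSelector;
      // Assumption: out-of-range positions are clamped to the conversation bounds.
      const at = Math.min(Math.max(sel.insertAt, 0), base.length);
      if (sel.target === "messages" && at === i) {
        out.push(JSON.parse(row.content) as Message);
      }
    }
    const m = base[i];
    if (m !== undefined) out.push(m);
  }
  return out;
}

const base: Message[] = [
  { role: "user", content: "q1" },
  { role: "assistant", content: "a1" },
  { role: "user", content: "q2" },
];

const makeRow = (seq: number, text: string): SessionPromptDeltaRow => ({
  session_id: "s1",
  seq,
  project_id: "p1",
  selector: JSON.stringify({ target: "messages", insertAt: 2 }),
  content: JSON.stringify({ role: "user", content: text }),
});

// Rows arrive out of order, as they might after a restart.
const rows = [makeRow(2, "delta B"), makeRow(1, "delta A")];

const firstBoot = replayDeltas(base, rows);
const afterRestart = replayDeltas(base, [...rows].reverse());

console.log(firstBoot.map((m) => m.content).join(" | "));
console.log(JSON.stringify(firstBoot) === JSON.stringify(afterRestart));
```

Storing `content` as the exact serialized message, rather than re-rendering it from current knowledge, is what lets a replay after restart reproduce the same request bytes in this sketch.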