v0.3.1: SSE streaming fix, token capture, output compression pipeline #8
Merged
Split the 'tool | toolResult if age > 2' arm into two:
- Dedup (unchanged): file reads with .rs/.py/.js patterns
- Zone compress (NEW): all other tool results at age 3-4 keep first 300 + last 100 chars with '[compressed N chars]' marker
Benchmark: 2,357 chars saved on 6-turn conversation (36% of body). API tokens unaffected (API returned 0 tokens, probably due to the max_tokens limit).
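A minimal sketch of the zone-compress step described above, assuming the marker sits on its own line between the kept head and tail. The function name `zone_compress` and its parameters are hypothetical, not taken from proxy.rs:

```rust
/// Keep the first `head` and last `tail` characters of `text` and replace the
/// middle with a `[compressed N chars]` marker. Hypothetical helper.
fn zone_compress(text: &str, head: usize, tail: usize) -> String {
    let total = text.chars().count();
    // Nothing to drop when the head and tail zones already cover the text.
    if total <= head + tail {
        return text.to_string();
    }
    let dropped = total - head - tail;
    let head_part: String = text.chars().take(head).collect();
    let tail_part: String = text.chars().skip(total - tail).collect();
    let out = format!("{}\n[compressed {} chars]\n{}", head_part, dropped, tail_part);
    // If the marker costs more than the dropped middle saved, keep the original.
    if out.chars().count() >= total {
        text.to_string()
    } else {
        out
    }
}
```

Counting `char`s rather than bytes keeps the cut points on valid UTF-8 boundaries; whether the real code counts bytes or chars is not stated in the commit message.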
…lt compression
sift_compress_tool_result() classifies each line of the tool result using reliary_sift::classify_line(), then:
- Keeps first 25 and last 15 lines unconditionally (structural context)
- Preserves Error/Summary lines from the middle zone
- Preserves lines containing FAILED (test diagnostics)
- Drops Code/Blank lines from the middle zone
- Collapses adjacent blank lines
Verification: 1,120 chars cargo test output → 813 chars (27% savings), all 8 FAILED lines preserved in compressed output.
Current vs prior approach:
- Zone truncation (prior): blind middle drop, kills any error in middle
- Sift classification (this): keeps errors/summaries wherever they appear
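The rules above can be sketched roughly as follows. The `classify_line` here is a simplified stand-in written for this example (the real reliary_sift::classify_line() is not shown in this PR), and `sift_compress` is a hypothetical name:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum LineKind {
    Error,
    Summary,
    Blank,
    Code,
}

/// Simplified stand-in classifier; the heuristics are assumptions.
fn classify_line(line: &str) -> LineKind {
    let t = line.trim();
    if t.is_empty() {
        LineKind::Blank
    } else if t.starts_with("error") || t.contains("panicked at") {
        LineKind::Error
    } else if t.starts_with("test result:") {
        LineKind::Summary
    } else {
        LineKind::Code
    }
}

/// Keep the first `head` and last `tail` lines unconditionally; in the middle
/// zone keep only Error/Summary lines and lines containing "FAILED".
fn sift_compress(text: &str, head: usize, tail: usize) -> String {
    let lines: Vec<&str> = text.lines().collect();
    let n = lines.len();
    let mut kept: Vec<&str> = Vec::new();
    for (i, &line) in lines.iter().enumerate() {
        let in_edge = i < head || i + tail >= n;
        let kind = classify_line(line);
        let keep = in_edge
            || matches!(kind, LineKind::Error | LineKind::Summary)
            || line.contains("FAILED");
        if !keep {
            continue;
        }
        // Collapse adjacent blank lines into one.
        if kind == LineKind::Blank && kept.last().map_or(false, |p| p.trim().is_empty()) {
            continue;
        }
        kept.push(line);
    }
    kept.join("\n")
}
```

Calling `sift_compress(output, 25, 15)` mirrors the 25/15 split mentioned above. Unlike zone truncation, an error at line 30 of a 50-line result survives because it is selected by class rather than by position.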
New crate: reliary-output (~350 lines) porting sift classify.rs + filter.rs:
- classify_output_line(): Error/Warning/Summary/Progress/PrefixLine/Code detection
- strip_ansi(): comprehensive ANSI escape sequence stripping
- skeleton(): UUID→{uuid}, hex→{hash}, version→{ver}, timestamp→{time} normalization
- merge_error_blocks(): preserve first error line, summarize helper lines
- collapse_prefix_runs(): 30 Compiling crateX → [30 Compiling ...] (30 lines)
- collapse_ok_lines(): 20 test ... ok → [20 ok]
- compress_output(): combined pipeline, auto-detects patterns
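As an illustration of the run-collapsing idea, here is a small sketch in the spirit of collapse_prefix_runs(). It assumes a "prefix" is the first whitespace-delimited word of a line and that the `min_run` threshold is configurable; neither detail is confirmed by the crate:

```rust
/// Collapse runs of `min_run` or more consecutive lines that share the same
/// first word (e.g. "Compiling") into a single `[N <prefix> ...]` line.
/// Sketch only; the real crate may detect prefixes differently.
fn collapse_prefix_runs(text: &str, min_run: usize) -> String {
    let lines: Vec<&str> = text.lines().collect();
    let mut out: Vec<String> = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        let prefix = lines[i].split_whitespace().next();
        // Extend the run while following lines start with the same word.
        let mut j = i + 1;
        while j < lines.len()
            && prefix.is_some()
            && lines[j].split_whitespace().next() == prefix
        {
            j += 1;
        }
        let run = j - i;
        match prefix {
            Some(p) if run >= min_run => out.push(format!("[{} {} ...]", run, p)),
            _ => out.extend(lines[i..j].iter().map(|l| l.to_string())),
        }
        i = j;
    }
    out.join("\n")
}
```

Short runs are left untouched so that isolated lines, such as a single error header, are never rewritten.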
Wired into proxy.rs tool result compression (ages 3-4):
- Replaced line-based zone truncation with full structural collapse
- Falls back to the original text if compression yields no savings
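The fallback rule can be expressed as a thin wrapper around any compressor. `compress_or_original` is a hypothetical name, and comparing byte lengths is an assumption about how "savings" is measured:

```rust
/// Run `compress` and keep its result only when it is strictly shorter than
/// the input; otherwise return the original text unchanged.
fn compress_or_original<F>(original: &str, compress: F) -> String
where
    F: Fn(&str) -> String,
{
    let candidate = compress(original);
    if candidate.len() < original.len() {
        candidate
    } else {
        original.to_string()
    }
}
```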
Benchmark (6-turn cargo-build session):
- Body: 11,897 chars → compressed: 4,521 chars (62% savings, 7,376 chars)
- All error/summary lines preserved (verified via unit test)
- Compilation run collapsed, ok lines collapsed, error blocks preserved
New module: crates/reliary-agent/src/guard.rs (~250 lines) porting stria's
grammar-free structural edit guards with two functions:
1. check_diff(index_path, file_path, new_content) -> JSON
   - Extracts identifiers from proposed new content via scan_identifiers()
   - Queries phrase_occ table for old identifiers in the target file
   - Tier 1: Detects new uppercase identifiers not in file → checks if they're defined elsewhere → MISSING IMPORT warning
   - Tier 2: Detects removed identifiers still referenced by other files → ORPHANED REFERENCE warning
2. read_validated(index_path, file_path, content) -> JSON
   - Finds identifiers DEFINED (is_def >= 1) in this file
   - Counts cross-file references for each
   - Warns about 5 most-referenced: 'process_order' referenced by 7 files
   - Prevents delete/rename mistakes before the edit starts
HTTP endpoints wired:
GET /check-diff?file=...&content=... — pre-edit structural guard
GET /read-validated?file=... — pre-read dependency guard
Uses existing reliary-search primitives: scan_identifiers, schema::unpack_is_def,
schema::open_existing_db, phrase_occ table lookup. No new schema or deps.
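To make the two tiers concrete, here is a sketch that replaces the SQLite index with in-memory maps. Everything in it is an assumption made for illustration: the `Index` struct, the simplified `scan_identifiers`, the warning strings, and the name `check_diff_sketch`, which takes the old content directly instead of querying phrase_occ:

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Simplified stand-in for scan_identifiers(): split on non-identifier chars.
fn scan_identifiers(src: &str) -> BTreeSet<String> {
    src.split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|t| !t.is_empty() && !t.chars().next().unwrap().is_ascii_digit())
        .map(str::to_string)
        .collect()
}

/// In-memory stand-in for the index (hypothetical shape).
struct Index {
    /// identifier -> file that defines it
    defined_in: HashMap<String, String>,
    /// identifier -> files that reference it
    referenced_by: HashMap<String, HashSet<String>>,
}

fn check_diff_sketch(index: &Index, file: &str, old: &str, new: &str) -> Vec<String> {
    let old_ids = scan_identifiers(old);
    let new_ids = scan_identifiers(new);
    let mut warnings = Vec::new();

    // Tier 1: new uppercase identifiers that are defined in another file.
    for id in new_ids.difference(&old_ids) {
        let upper = id.chars().next().map_or(false, |c| c.is_uppercase());
        if let Some(def_file) = index.defined_in.get(id) {
            if upper && def_file != file {
                warnings.push(format!("MISSING IMPORT: '{}' is defined in {}", id, def_file));
            }
        }
    }

    // Tier 2: removed identifiers that other files still reference.
    for id in old_ids.difference(&new_ids) {
        if let Some(files) = index.referenced_by.get(id) {
            let others = files.iter().filter(|f| f.as_str() != file).count();
            if others > 0 {
                warnings.push(format!(
                    "ORPHANED REFERENCE: '{}' still referenced by {} other file(s)",
                    id, others
                ));
            }
        }
    }
    warnings
}
```

Because the check works on identifier sets rather than a parse tree, it needs no language grammar, which matches the "grammar-free" description; the trade-off is that it cannot tell a real use from a mention in a comment or string.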
Adds resolve_index_paths() to try multiple relative path forms against the index. Uses porter_stem() on both new and old identifiers for proper comparison.
Unit tests verify:
- 379 orphan warnings detected on real index
- Correct identification of cross-file references
Common identifiers (unwrap, usize, as_ref, clone, etc.) appear in 10+ files; these are library/std symbols, not project-specific symbols someone would 'orphan.' The document frequency query checks the phrase_occ table before generating each warning.
Before filter: 379 warnings (noisy, includes 'unwrap' in 12 files)
After filter: 269 warnings (project-specific, DF < 10)
Clean edits: 0 false positives
Unit tests: both pass against real index
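The document-frequency filter amounts to dropping warnings for identifiers that occur in too many files. A sketch under the assumption that DF counts are already available as a map (the real code queries phrase_occ per warning); the `Warning` struct and function name are hypothetical:

```rust
use std::collections::HashMap;

/// Hypothetical warning record for this example.
struct Warning {
    identifier: String,
    message: String,
}

/// Keep only warnings whose identifier appears in fewer than `max_df` files.
/// Identifiers missing from the map are treated as DF = 0 and kept.
fn filter_by_document_frequency(
    warnings: Vec<Warning>,
    df: &HashMap<String, usize>,
    max_df: usize,
) -> Vec<Warning> {
    warnings
        .into_iter()
        .filter(|w| df.get(&w.identifier).copied().unwrap_or(0) < max_df)
        .collect()
}
```

With `max_df = 10`, a warning about 'unwrap' (12 files) is dropped while one about 'process_order' (7 files) is kept.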
…n support, request logging
- routes.rs: scan_proxy_routes() checks ~/.reliary/proxy-routes.json before Pi configs/env vars. Fixed DEEPSEEK_API_KEY default upstream from DeepInfra to api.deepseek.com.
- proxy.rs: role normalization translates 'developer'/'latest_reminder' to 'system' before forwarding. Fixes Pi Agent compatibility through proxy.
- proxy.rs: logs per-request compression metrics to /tmp/reliary_proxy.jsonl for benchmarking.
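The role normalization is a simple mapping. A sketch, assuming roles are handled as plain strings (`normalize_role` is a hypothetical name, not necessarily the one in proxy.rs):

```rust
/// Map roles some upstreams reject onto 'system'; pass all others through.
fn normalize_role(role: &str) -> &str {
    match role {
        "developer" | "latest_reminder" => "system",
        other => other,
    }
}
```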
- Streaming responses now pass through as raw SSE bytes instead of being wrapped in Sse<>, which double-prefixed 'data: data:'
- Captures prompt_tokens/completion_tokens from the final usage chunk and logs to /tmp/reliary_proxy.jsonl for benchmarking
- Streaming path logs: stream_usage events with billed tokens
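One way to capture usage while forwarding the stream untouched is to scan each `data:` line for the token counters. The sketch below uses a naive string scan instead of a JSON parser to stay dependency-free; both function names are hypothetical, and the chunk shape (a `usage` object carrying `prompt_tokens` and `completion_tokens`) is assumed from the OpenAI-style streaming format:

```rust
/// Find `"key": <integer>` in a JSON fragment with a naive string scan.
/// Illustration only; a real implementation would likely parse the JSON.
fn find_u64(json: &str, key: &str) -> Option<u64> {
    let needle = format!("\"{}\"", key);
    let start = json.find(&needle)? + needle.len();
    let rest = json[start..].trim_start().strip_prefix(':')?.trim_start();
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    digits.parse().ok()
}

/// Return (prompt_tokens, completion_tokens) from the last `data:` line that
/// carries both counters. The SSE text itself is only read, never modified.
fn capture_usage(sse: &str) -> Option<(u64, u64)> {
    let mut usage = None;
    for line in sse.lines() {
        if let Some(payload) = line.strip_prefix("data:") {
            let payload = payload.trim();
            if payload == "[DONE]" {
                continue;
            }
            if let (Some(p), Some(c)) = (
                find_u64(payload, "prompt_tokens"),
                find_u64(payload, "completion_tokens"),
            ) {
                usage = Some((p, c));
            }
        }
    }
    usage
}
```

Since the upstream bytes already contain the `data: ` framing, forwarding them as-is avoids the double prefix that appears when already-framed lines are wrapped in a second SSE event layer.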
8 commits:
- Zone/tool-result compression at ages 3-4
- Sift-based output compression (71-93% per result)
- Stria guard.rs port (check_diff + read_validated)
- Grammar-free DF noise filter
- Proxy-routes.json support, role normalization
- SSE streaming passthrough, token capture from usage chunk