v0.3.1: SSE streaming fix, token capture, output compression pipeline #8

Merged: alderpath merged 8 commits into master from output-compression on Jun 15, 2026
@alderpath (Contributor):

8 commits:
- Zone/tool-result compression at ages 3-4
- Sift-based output compression (71-93% per result)
- Stria guard.rs port (check_diff + read_validated)
- Grammar-free DF noise filter
- Proxy-routes.json support, role normalization
- SSE streaming passthrough, token capture from usage chunk

Split the 'tool | toolResult if age > 2' arm into two:
- Dedup (unchanged): file reads with .rs/.py/.js patterns
- Zone compress (NEW): all other tool results at age 3-4
  keep first 300 + last 100 chars with '[compressed N chars]' marker
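
The zone-compress rule above can be sketched as follows. This is a minimal illustration, not the code in proxy.rs: the function names, the exact marker layout, and the "skip if nothing is saved" check are assumptions made for the example.

```rust
/// Keep the first `head` and last `tail` characters of a tool result and
/// replace the middle with a `[compressed N chars]` marker.
/// Returns the input unchanged when the marker would not save anything.
/// (Hypothetical helper; name and signature are assumptions.)
fn zone_compress(text: &str, head: usize, tail: usize) -> String {
    let total = text.chars().count();
    if total <= head + tail {
        return text.to_string();
    }
    let dropped = total - head - tail;
    let marker = format!("\n[compressed {dropped} chars]\n");
    if marker.chars().count() >= dropped {
        return text.to_string();
    }
    // Walk by `char` so multi-byte UTF-8 sequences are never split.
    let head_part: String = text.chars().take(head).collect();
    let tail_part: String = text.chars().skip(total - tail).collect();
    format!("{head_part}{marker}{tail_part}")
}

/// Hypothetical age gate: zone compression applies at ages 3-4 only.
fn should_zone_compress(age: u32) -> bool {
    (3..=4).contains(&age)
}
```

With `head = 300` and `tail = 100`, a 1,000-character result keeps its first 300 and last 100 characters and reports 600 compressed characters.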

Benchmark: 2,357 chars saved on 6-turn conversation (36% of body).
API token counts were unaffected (the API returned 0 tokens, probably because of the max_tokens limit).
…lt compression

sift_compress_tool_result() classifies each line of the tool result using
reliary_sift::classify_line(), then:
- Keeps first 25 and last 15 lines unconditionally (structural context)
- Preserves Error/Summary lines from the middle zone
- Preserves lines containing FAILED (test diagnostics)
- Drops Code/Blank lines from the middle zone
- Collapses adjacent blank lines
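
A rough sketch of the rules above. The `LineClass` enum and the `classify_line` heuristics here are stand-ins invented for the example (the real reliary_sift::classify_line() signature is not shown in this PR), and keeping unclassified middle lines is also an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum LineClass {
    Error,
    Summary,
    Code,
    Blank,
    Other,
}

/// Stand-in classifier with toy heuristics (assumption, not the sift logic).
fn classify_line(line: &str) -> LineClass {
    let t = line.trim();
    if t.is_empty() {
        LineClass::Blank
    } else if t.starts_with("error") || t.contains("panicked at") {
        LineClass::Error
    } else if t.starts_with("test result:") {
        LineClass::Summary
    } else if t.ends_with('{') || t.ends_with(';') || t == "}" {
        LineClass::Code
    } else {
        LineClass::Other
    }
}

/// Keep the first `head` and last `tail` lines unconditionally; in the middle
/// zone keep FAILED lines and anything not classified as Code or Blank.
/// Adjacent blank lines in the result are collapsed to one.
fn sift_compress(text: &str, head: usize, tail: usize) -> String {
    let lines: Vec<&str> = text.lines().collect();
    let n = lines.len();
    let mut kept: Vec<&str> = Vec::new();
    for (i, line) in lines.iter().enumerate() {
        let in_edge = i < head || i + tail >= n;
        let keep = in_edge
            || line.contains("FAILED")
            || !matches!(classify_line(line), LineClass::Code | LineClass::Blank);
        if !keep {
            continue;
        }
        let blank = line.trim().is_empty();
        let prev_blank = kept.last().map_or(false, |p| p.trim().is_empty());
        if blank && prev_blank {
            continue;
        }
        kept.push(*line);
    }
    kept.join("\n")
}
```

The commit uses `head = 25` and `tail = 15`; smaller values are convenient when experimenting with short inputs.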

Verification: 1,120 chars cargo test output → 813 chars (27% savings),
all 8 FAILED lines preserved in compressed output.

Current vs prior approach:
- Zone truncation (prior): blind middle drop — kills any error in middle
- Sift classification (this): keeps errors/summaries wherever they appear

New crate: reliary-output (~350 lines), porting sift's classify.rs and filter.rs:
- classify_output_line(): Error/Warning/Summary/Progress/PrefixLine/Code detection
- strip_ansi(): comprehensive ANSI escape sequence stripping
- skeleton(): UUID→{uuid}, hex→{hash}, version→{ver}, timestamp→{time} normalization
- merge_error_blocks(): preserve first error line, summarize helper lines
- collapse_prefix_runs(): 30 Compiling crateX → [30 Compiling ...] (30 lines)
- collapse_ok_lines(): 20 test ... ok → [20 ok]
- compress_output(): combined pipeline, auto-detects patterns
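
To illustrate the run-collapsing steps, here is a simplified sketch. It is not the reliary-output code: the shared `collapse_runs` helper, the prefix list, the minimum run length of 3, and the signatures are all assumptions for the example.

```rust
/// Collapse each run of at least `min_run` consecutive lines that share the
/// same label into a single `[N label]` line. Lines without a label pass
/// through unchanged. (Hypothetical helper.)
fn collapse_runs(
    lines: &[String],
    min_run: usize,
    label: impl Fn(&str) -> Option<String>,
) -> Vec<String> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        let Some(tag) = label(lines[i].as_str()) else {
            out.push(lines[i].clone());
            i += 1;
            continue;
        };
        let mut j = i + 1;
        while j < lines.len() && label(lines[j].as_str()).as_deref() == Some(tag.as_str()) {
            j += 1;
        }
        if j - i >= min_run {
            out.push(format!("[{} {}]", j - i, tag));
        } else {
            out.extend_from_slice(&lines[i..j]);
        }
        i = j;
    }
    out
}

/// Label for cargo progress lines such as "   Compiling foo v0.1.0".
fn prefix_label(line: &str) -> Option<String> {
    ["Compiling", "Downloading", "Checking"]
        .iter()
        .find(|p| line.trim_start().starts_with(**p))
        .map(|p| format!("{p} ..."))
}

/// Label for passing test lines such as "test foo ... ok".
fn ok_label(line: &str) -> Option<String> {
    let t = line.trim();
    (t.starts_with("test ") && t.ends_with("... ok")).then(|| "ok".to_string())
}

/// Combined pipeline: prefix runs first, then ok lines.
fn compress_output(text: &str) -> String {
    let lines: Vec<String> = text.lines().map(str::to_string).collect();
    let lines = collapse_runs(&lines, 3, prefix_label);
    let lines = collapse_runs(&lines, 3, ok_label);
    lines.join("\n")
}
```

Error and FAILED lines carry no label in this sketch, so they always pass through untouched.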

Wired into proxy.rs tool result compression (ages 3-4):
- Replaced line-based zone truncation with full structural collapse
- Falls back to the original text if compression yields no savings

Benchmark (6-turn cargo-build session):
- Body: 11,897 chars → compressed: 4,521 chars (62% savings, 7,376 chars)
- All error/summary lines preserved (verified via unit test)
- Compilation run collapsed, ok lines collapsed, error blocks preserved

New module: crates/reliary-agent/src/guard.rs (~250 lines), porting stria's
grammar-free structural edit guards with two functions:

1. check_diff(index_path, file_path, new_content) -> JSON
   - Extracts identifiers from proposed new content via scan_identifiers()
   - Queries phrase_occ table for old identifiers in the target file
   - Tier 1: Detects new uppercase identifiers not in file → checks if
     they're defined elsewhere → MISSING IMPORT warning
   - Tier 2: Detects removed identifiers still referenced by other files
     → ORPHANED REFERENCE warning

2. read_validated(index_path, file_path, content) -> JSON
   - Finds identifiers DEFINED (is_def >= 1) in this file
   - Counts cross-file references for each
   - Warns about 5 most-referenced: 'process_order' referenced by 7 files
   - Prevents delete/rename mistakes before the edit starts
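
The two check_diff tiers reduce to set differences over identifiers. The sketch below is an approximation under stated assumptions: an in-memory `Index` struct stands in for the phrase_occ table, the old file text is passed in directly instead of being looked up by path, the tokenizer is a toy version of scan_identifiers(), and warnings are plain strings, not JSON.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical in-memory stand-in for the phrase_occ table.
struct Index {
    /// identifier -> files that define it
    defs: HashMap<String, BTreeSet<String>>,
    /// identifier -> files that reference it
    refs: HashMap<String, BTreeSet<String>>,
}

/// Toy tokenizer: runs of alphanumerics/underscores that do not start with a digit.
fn scan_identifiers(src: &str) -> BTreeSet<String> {
    src.split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|t| t.chars().next().map_or(false, |c| c.is_alphabetic() || c == '_'))
        .map(str::to_string)
        .collect()
}

fn check_diff(index: &Index, file: &str, old: &str, new: &str) -> Vec<String> {
    let old_ids = scan_identifiers(old);
    let new_ids = scan_identifiers(new);
    let elsewhere = |files: Option<&BTreeSet<String>>| {
        files.map_or(false, |fs| fs.iter().any(|f| f.as_str() != file))
    };
    let mut warnings = Vec::new();
    // Tier 1: newly introduced uppercase identifiers that are defined in another file.
    for id in new_ids.difference(&old_ids) {
        let uppercase = id.chars().next().map_or(false, char::is_uppercase);
        if uppercase && elsewhere(index.defs.get(id)) {
            warnings.push(format!("MISSING IMPORT: {id}"));
        }
    }
    // Tier 2: removed identifiers that other files still reference.
    for id in old_ids.difference(&new_ids) {
        if elsewhere(index.refs.get(id)) {
            warnings.push(format!("ORPHANED REFERENCE: {id}"));
        }
    }
    warnings
}
```

Because `BTreeSet` iterates in sorted order, the warnings come out in a deterministic order, which keeps unit tests stable.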

HTTP endpoints wired:
  GET /check-diff?file=...&content=... — pre-edit structural guard
  GET /read-validated?file=... — pre-read dependency guard

Uses existing reliary-search primitives: scan_identifiers, schema::unpack_is_def,
schema::open_existing_db, phrase_occ table lookup. No new schema or deps.

Adds resolve_index_paths() to try multiple relative path forms
against the index. Uses porter_stem() on both new and old
identifiers for proper comparison. Unit tests verify:
- 379 orphan warnings detected on real index
- Correct identification of cross-file references

Common identifiers (unwrap, usize, as_ref, clone, etc.) appear in
10+ files; these are library/std symbols, not project-specific
symbols someone would 'orphan.' The guard now queries each identifier's
document frequency (DF) in the phrase_occ table before generating a
warning and skips identifiers with DF >= 10.

Before filter: 379 warnings (noisy, includes 'unwrap' in 12 files)
After filter:  269 warnings (project-specific, DF < 10)
Clean edits:   0 false positives
Unit tests:    both pass against real index
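
The document-frequency cut-off can be sketched as a simple filter. The `HashMap` stands in for the phrase_occ lookup, and the function name and signature are hypothetical.

```rust
use std::collections::HashMap;

/// Keep only identifiers that appear in fewer than `max_df` files.
/// `df` maps identifier -> number of files containing it; identifiers
/// missing from the map are treated as DF 0. (Hypothetical helper.)
fn filter_by_df<'a>(
    candidates: &[&'a str],
    df: &HashMap<String, usize>,
    max_df: usize,
) -> Vec<&'a str> {
    candidates
        .iter()
        .copied()
        .filter(|id| df.get(*id).copied().unwrap_or(0) < max_df)
        .collect()
}
```

With `max_df = 10`, an identifier like `unwrap` seen in 12 files is dropped, while `process_order` seen in 7 files still produces a warning.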
…n support, request logging

- routes.rs: scan_proxy_routes() checks ~/.reliary/proxy-routes.json before
  Pi configs/env vars. Fixed DEEPSEEK_API_KEY default upstream from DeepInfra
  to api.deepseek.com.
- proxy.rs: role normalization translates 'developer'/'latest_reminder' to
  'system' before forwarding. Fixes Pi Agent compatibility through proxy.
- proxy.rs: logs per-request compression metrics to /tmp/reliary_proxy.jsonl
  for benchmarking.
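
Role normalization amounts to a small mapping applied to each message before forwarding. A minimal sketch (the function name is an assumption; the role strings are the ones listed above):

```rust
/// Map roles that the upstream API does not accept onto "system";
/// every other role is passed through unchanged.
fn normalize_role(role: &str) -> &str {
    match role {
        "developer" | "latest_reminder" => "system",
        other => other,
    }
}
```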

- Streaming responses now pass through as raw SSE bytes instead of being
  wrapped in Sse<>, which double-prefixed events as 'data: data:'
- Captures prompt_tokens/completion_tokens from the final usage chunk
  and logs to /tmp/reliary_proxy.jsonl for benchmarking
- Streaming path logs stream_usage events with billed tokens
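
Token capture from a passthrough stream could look roughly like this. It is a std-only sketch that assumes the upstream sends an OpenAI-style final chunk carrying a `usage` object; the `Usage` struct, the helper names, and the hand-rolled integer extraction (in place of a real JSON parser) are assumptions for illustration.

```rust
#[derive(Debug, PartialEq)]
struct Usage {
    prompt_tokens: u64,
    completion_tokens: u64,
}

/// Read the unsigned integer that follows `"key":` in a JSON fragment.
/// Deliberately naive: enough for flat usage objects, not a JSON parser.
fn json_u64(json: &str, key: &str) -> Option<u64> {
    let needle = format!("\"{key}\"");
    let rest = &json[json.find(&needle)? + needle.len()..];
    let rest = rest.trim_start().strip_prefix(':')?.trim_start();
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    digits.parse().ok()
}

/// Scan raw SSE text and return the usage from the last `data:` event that
/// carries both token counts. The bytes themselves are forwarded untouched
/// elsewhere; this only inspects them.
fn capture_usage(sse_body: &str) -> Option<Usage> {
    sse_body
        .lines()
        .filter_map(|l| l.strip_prefix("data:"))
        .map(str::trim)
        .filter(|p| *p != "[DONE]" && p.contains("\"usage\""))
        .filter_map(|p| {
            Some(Usage {
                prompt_tokens: json_u64(p, "prompt_tokens")?,
                completion_tokens: json_u64(p, "completion_tokens")?,
            })
        })
        .last()
}
```

Chunks whose `usage` field is null contain neither token key, so they are skipped and only the final usage chunk is reported.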
@alderpath alderpath merged commit 4eb4f9e into master Jun 15, 2026
3 of 4 checks passed