Response compression: compress API response before returning to agent by alderpath · Pull Request #7 · Reliary/reliary-agent

alderpath · 2026-06-14T11:05:26Z

Compresses the assistant message content in API responses before returning them to the agent. Each response's reasoning blocks are compressed by the code-block-aware compressor, saving 2-10% on wire size per turn. The savings compound across turns because the compressed responses are reused in conversation history.

New compress_response_body() function:

- Deserializes the API JSON response
- Runs compress_assistant_text() on each choice's message content
- Code-block-aware: prose sections compressed, code blocks verbatim
- Accepts any positive savings (no minimum threshold)
- Returns modified body + chars saved in x-reliaty-response-saved header

Also fixed compress_prose_inline threshold: accepts any 10+ char savings (was requiring 85% of original threshold).

Verified: fires on real LLM responses (~2% savings on reasoning output).
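For readers without the diff at hand, the flow described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code from this PR: the function names compress_response_body and compress_assistant_text come from the description, but their signatures here are assumptions, and compress_prose is a hypothetical stand-in (whitespace collapsing only) for the real prose compressor. The response shape (choices[].message.content) is assumed from the "each choice's message content" wording.

```python
import json
import re

# Splits text into prose and fenced code blocks. Because the pattern has one
# capture group, re.split keeps the fences, at the odd indices of the result.
FENCE_RE = re.compile(r"(```.*?```)", re.DOTALL)


def compress_prose(text: str) -> str:
    """Hypothetical stand-in for the real prose compressor (assumption):
    collapses runs of spaces/tabs and runs of three or more newlines."""
    text = re.sub(r"[ \t]+", " ", text)
    return re.sub(r"\n{3,}", "\n\n", text)


def compress_assistant_text(text: str) -> str:
    """Compress prose sections and leave fenced code blocks verbatim."""
    parts = FENCE_RE.split(text)
    return "".join(
        part if i % 2 == 1 else compress_prose(part)
        for i, part in enumerate(parts)
    )


def compress_response_body(body: str) -> tuple[str, int]:
    """Return (possibly modified body, characters saved).

    Any positive saving is accepted; there is no minimum threshold.
    """
    data = json.loads(body)
    saved = 0
    for choice in data.get("choices", []):
        message = choice.get("message") or {}
        content = message.get("content")
        if not isinstance(content, str):
            continue  # skip null or structured content
        compressed = compress_assistant_text(content)
        gain = len(content) - len(compressed)
        if gain > 0:
            message["content"] = compressed
            saved += gain
    if saved == 0:
        return body, 0  # nothing gained: pass the original bytes through
    return json.dumps(data), saved
```

In this sketch the caller would be responsible for writing the returned count into the x-reliaty-response-saved header; returning the original body untouched when nothing is saved avoids re-serializing the JSON for no benefit.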

alderpath added 2 commits June 14, 2026 04:21

fix: code-block-aware compression

c3f533f

alderpath merged commit 52c12f7 into master Jun 14, 2026
2 of 4 checks passed

alderpath deleted the response-compression branch June 14, 2026 11:05

alderpath merged 2 commits into master from response-compression
