feat(execute): wire /execute into the substitution engine + pinned-param honesty (#39)
Merged
…ram honesty (#39)

POST /execute (managed path) now uses the same multi-hop run_with_failover engine as /proxy instead of its single-hop SERVICE_ALTERNATIVES fallback. Stays on 0.9.1.

- Managed-key calls: classify (settlement-aware _try_execute_managed_ex) -> engine (retry-first -> idempotency gate -> chain -> bill-served-only -> log). BYOK calls are NOT substituted (user's own key); they are refunded + surfaced as before.
- Pinned-param honesty (decided policy): when a cross-provider substitute cannot honor a pinned provider-specific param (an explicit LLM model), the engine SURFACES a clean error (primary refunded). It does NOT strip the param and does NOT silently fall back to the substitute's default model. Unpinned calls keep self-healing. Implemented in the shared engine, so /proxy gets it too. (A curated model-equivalence map for tier-equivalent substitution is a separate follow-up.)
- Self-heal surface: X-Wayforth-Served-By + X-Wayforth-Fallback headers + served_by/fallback in the response body. Group exhaustion -> clean 502 with providers tried.
- Charges only the served provider; failed hops net to zero in the ledger.

26 engine unit tests green (incl. pinned-model surfaces-not-substituted + unpinned-self-heals). No version bump (0.9.2 reserved for A2A).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
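The pinned-param policy and the bill-served-only ledger described in the commit message can be illustrated with a minimal sketch. This is not the actual run_with_failover implementation: `Provider`, `SurfacedError`, `failover_sketch`, and the ledger shape are hypothetical names invented for illustration, the retry-first and idempotency-gate stages are omitted, and surfacing at the first substitute that cannot honor the pin (rather than skipping ahead in the chain) is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Provider:
    name: str
    price: int                    # cost per call, arbitrary units (hypothetical)
    models: frozenset             # model names this provider can honor
    call: Callable[[dict], str]   # raises on failure


class SurfacedError(Exception):
    """Clean error surfaced to the caller; carries what was tried and the ledger."""

    def __init__(self, message, tried, ledger):
        super().__init__(message)
        self.tried = tried
        self.ledger = ledger


def failover_sketch(chain, params):
    ledger = []                   # (provider name, signed amount) entries
    tried = []
    pinned = params.get("model")  # assumption: an explicit model counts as pinned
    for hop, provider in enumerate(chain):
        is_substitute = hop > 0
        if is_substitute and pinned is not None and pinned not in provider.models:
            # Do not strip the param and do not use the substitute's default model.
            raise SurfacedError(
                f"{provider.name} cannot honor pinned model {pinned!r}", tried, ledger
            )
        tried.append(provider.name)
        ledger.append((provider.name, provider.price))       # charge this hop
        try:
            body = provider.call(params)
        except Exception:
            ledger.append((provider.name, -provider.price))  # refund the failed hop
            continue
        return {
            "body": body,
            "served_by": provider.name,
            "fallback": is_substitute,
            "ledger": ledger,
        }
    raise SurfacedError("all providers failed", tried, ledger)
```

In this sketch every failed hop is charged and then refunded, so the ledger sums to the price of the provider that actually served the call, or to zero when an error is surfaced.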
Wires POST /execute (managed path) into the same multi-hop run_with_failover engine as /proxy. Stays on 0.9.1 (0.9.2 reserved for A2A).

X-Wayforth-Served-By + X-Wayforth-Fallback headers + served_by/fallback body. Exhaustion → clean 502.

26 engine unit tests green (incl. pinned_model_surfaces_not_substituted, unpinned_llm_self_heals). Live failover check on /execute to follow post-deploy.

Curated model-equivalence map (tier-equivalent pinned-model self-heal + X-Wayforth-Substituted-Model) is the queued near-term follow-up.

🤖 Generated with Claude Code
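As a rough illustration of the self-heal surface, the sketch below shows one way the headers, body fields, and the 502 on exhaustion could be assembled. The header names and the served_by/fallback fields come from this PR; the function name `build_response_sketch`, the "true"/"false" header encoding, and the `result`, `error`, and `providers_tried` body keys are assumptions, not the service's confirmed wire format.

```python
import json


def build_response_sketch(served_by=None, fallback=False, payload=None, tried=None):
    """Return (status, headers, body) for a served call or an exhausted group."""
    if served_by is None:
        # Group exhaustion: clean 502 listing the providers that were tried.
        body = {"error": "all providers failed", "providers_tried": list(tried or [])}
        return 502, {}, json.dumps(body)
    headers = {
        "X-Wayforth-Served-By": served_by,
        # Assumption: the flag is encoded as the strings "true" / "false".
        "X-Wayforth-Fallback": "true" if fallback else "false",
    }
    body = {"result": payload, "served_by": served_by, "fallback": fallback}
    return 200, headers, json.dumps(body)
```

A client can then check the fallback flag in either the headers or the body to tell whether a substitute served the call.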