Detect equivalent models between providers#305
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6a0b40854b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b0d9bdddf7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f46669b5fc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5c9b327b96
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
…d to have seperate ones" This reverts commit f46669b.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f3e2c0ed0c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9817f7e122
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0248a518f1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 73fc1c0993
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
Ken Jiang (knjiang)
left a comment
There was a problem hiding this comment.
ideally there's less code in mod.rs so we don't have a mega file but lgtm
we should take a second pass at the memory issue to see if it's real.
| impl OverlayModelCatalog { | ||
| pub fn new(base: Arc<ModelCatalog>, custom: ModelCatalog) -> Self { | ||
| let custom_model_names: HashSet<String> = custom.models.keys().cloned().collect(); | ||
| let visible_models = base | ||
| .models | ||
| .keys() | ||
| .filter(|name| !custom_model_names.contains(*name)) | ||
| .chain(custom.models.keys()) | ||
| .cloned() | ||
| .collect(); | ||
| let mut equivalence_edges: HashMap<String, Vec<String>> = base | ||
| .equivalent_models | ||
| .iter() | ||
| .filter(|(name, _)| !custom_model_names.contains(*name)) | ||
| .map(|(name, equivalents)| { | ||
| ( | ||
| name.clone(), | ||
| equivalents | ||
| .iter() | ||
| .filter(|equivalent| !custom_model_names.contains(*equivalent)) | ||
| .cloned() | ||
| .collect(), | ||
| ) | ||
| }) | ||
| .collect(); | ||
| equivalence_edges.extend(custom.equivalent_models.clone()); | ||
| let equivalence_index = build_equivalence_index(visible_models, &equivalence_edges); | ||
| Self { | ||
| base, | ||
| custom, | ||
| equivalence_index, | ||
| } | ||
| } |
There was a problem hiding this comment.
codex tells me this may increase memory:
Gateway builds a router from the resolved secrets on the request path. For any request with custom model metadata, this constructor clones every base catalog model name and rebuilds a merged equivalence index even though the base catalog already has a shared index. This is small today, but it makes custom-model traffic O(base catalog size) in allocations. Prefer delegating base lookups to `base.equivalent_model_names()` and only indexing custom overlay edges plus touched base targets.
There was a problem hiding this comment.
yeah... this is real upon second glance, I refactored here so that we only ever rebuild the equivalnce index for overlay models
| catalog.equivalent_models = equivalent_models; | ||
| catalog.validate_equivalent_models()?; | ||
| catalog.rebuild_equivalence_index(); |
There was a problem hiding this comment.
can we factor tihs stuff out to like fallback.rs? this feels like a smell to where we can just have one function that lets us resolve fallback models or smth.
There was a problem hiding this comment.
pulled this out into a single function on the ModelCatalog + moved helpers to a separate file
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ba087683dc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
| self.custom.get(name).or_else(|| self.base.get(name)) | ||
| } | ||
|
|
||
| pub fn equivalent_model_names(&self, name: &str) -> Vec<String> { |
There was a problem hiding this comment.
is this more like find_fallback_models?
There was a problem hiding this comment.
ah yeah that is a better name
fixing rn
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9a2743f316
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
| Err(err) => return Err(err.into()), | ||
| }; | ||
|
|
||
| let transformed = override_payload_model(transformed, actual_format, &spec.model); |
There was a problem hiding this comment.
Apply model rewrites before Bedrock preparation
When an equivalent fallback routes to Bedrock Converse, this rewrite runs after the Bedrock early-return path and is never applied. prepare_bedrock_request preserves the source request model whenever it is already present, so a normal Chat Completions request for the primary model transformed to Converse can still contain modelId for the primary while the Bedrock URL is built from the fallback spec.model, producing an inconsistent request for failover routes.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
ugh. we have special logic for bedrock that bypasses the transform request logic :( I'm refactoring that instead of working around this.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8674f412ce
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0eb3148d27
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8189e2d0b4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 199be5b192
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
| if seen.contains(fallback_alias) { | ||
| continue; |
There was a problem hiding this comment.
Allow same-provider fallback model routes
When fallback_models is used to degrade to another model on the same credential, e.g. gpt-4o → gpt-4o-mini with both catalog entries available on OPENAI_API_KEY, the primary route added above records that provider alias in seen; this check then skips the alias before any equivalent model specs are considered. The returned route list contains only the primary model, so retry/failover can never use the configured fallback model on the same provider alias.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
not worried about this for now, we can fix if a customer asks for it
…equivalent-model-endpoints-btwn-providers
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ce6b753ecc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 41e2a9f84f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0985268e48
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
| ProviderFormat::Google | ||
| | ProviderFormat::Converse | ||
| | ProviderFormat::BedrockAnthropic | ||
| | ProviderFormat::VertexAnthropic | ||
| | ProviderFormat::Unknown => None, |
There was a problem hiding this comment.
Rewrite Google body models on passthrough failover
When a failover route selects an equivalent Google/Vertex GenerateContent model and the incoming request is already ProviderFormat::Google, transform_request can pass the payload through and the router relies on this helper to patch the body model. Fresh evidence after the previous thread is that the current body_model_field still puts ProviderFormat::Google in the no-rewrite arm, so the upstream URL is built from the fallback spec.model while the serialized payload can still contain the primary top-level model that GoogleAdapter::request_from_universal emits, producing an inconsistent request for same-format Google failovers.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
I don't think this is a problem, valid vertex requests can look like this.
…equivalent-model-endpoints-btwn-providers
…ithub.com/braintrustdata/lingua into equivalent-model-endpoints-btwn-providers
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5e9689bf94
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bbd19500ee
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
| if request.model.is_none() { | ||
| request.model = Some(spec.model.clone()); | ||
| } | ||
| request.model = Some(spec.model.clone()); |
There was a problem hiding this comment.
Honor preserve_body_model for Bedrock transforms
When callers pass preserve_body_model=true and the selected route format is Converse or BedrockAnthropic, prepare_provider_request returns through the Bedrock helper before applying RequestPreparationOptions, and this line unconditionally replaces the source model with spec.model. A Chat Completions or Anthropic body converted to Bedrock therefore still serializes modelId for the route model, so the new opt-out behaves differently for Bedrock than for the other provider formats.
Useful? React with 👍 / 👎.
see https://github.com/braintrustdata/braintrust/pull/15721