Feature/multi credential failover 0.3.x by fengluodb · Pull Request #2554 · volcengine/OpenViking

fengluodb · 2026-06-10T13:45:45Z

Description

Related Issue

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactoring (no functional changes)
Performance improvement
Test update

Changes Made

Testing

I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have tested this on the following platforms:
- Linux
- macOS
- Windows

Checklist

My code follows the project's coding style
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Screenshots (if applicable)

Additional Notes

…KnowledgeParser.

Add OrderedCredentialSwitcher for N-credential failover, MultiCredentialVLM and FailoverEmbedder for credential switching, support automatic migration from legacy backup config

…itcher

Separate the monolithic PERMANENT error class so the switcher reacts to the actual root cause: - 400 (request-level parameter error) -> PERMANENT, fail-fast: same request fails on every credential of the same model. - 401/403/unauthorized/accountoverdue -> new AUTH class: credential-level, advances to the next credential in multi-credential mode. - new CONTENT_SAFETY class (moderation rejections) and INPUT_TOO_LARGE -> fail-fast: switching credentials cannot help. classify_api_error now checks CONTENT_SAFETY before PERMANENT so a moderation message containing "400" is not misclassified. On fail-fast the failover wrappers re-raise the original exception (preserving type/info) instead of wrapping it, so callers can react (e.g. truncate on input_too_large). AllCredentialsFailedError is reserved for chain exhaustion. The legacy PrimaryBackupSwitcher also switches on AUTH to keep existing backup behavior.

get_active_index() previously mutated state: when a failback threshold was met it would decrement the active index. Because observability properties (active_credential_index / active_credential_id) call it, merely reading the current credential for logging or metrics could accidentally advance the failback state machine. Split the concern: get_active_index() is now a pure read, and a new public maybe_failback() performs the one-step failback (and logs an info line when the active credential index changes). The request loops in MultiCredentialVLM and FailoverEmbedder call maybe_failback() at the top of each attempt, so the failback behavior is unchanged while pure reads no longer have side effects.

The failover loops exited on `idx >= n OR total_attempts >= total_max_retries` (default 10). With more than 10 credentials, or when failback churn inflated the attempt count, this could raise AllCredentialsFailedError before every credential had actually been tried, leaving lower-priority credentials unused. Remove total_max_retries entirely from MultiCredentialVLM and FailoverEmbedder: credential exhaustion is now decided solely by reaching the end of the chain (idx >= n), and per-credential retries remain the responsibility of each underlying instance via its own max_retries. The aggregated error tuple now records the failing credential index instead of the attempt counter. Adds a regression test covering more than 10 credentials all being tried.

encoding_format, model_path, cache_dir, enable_fusion, res_level and max_video_frames describe how a model runs, not which credential is used. All credentials of a single embedding model share the same model, so these belong on the parent EmbeddingModelConfig, not on each credential. Keeping them on the credential forced a `cred.X or config.X` merge in _create_failover_embedder, which silently dropped explicit falsy values (enable_fusion=False, res_level=0, max_video_frames=0) and fell back to the parent value. Remove these six fields from EmbeddingCredential and read them directly from the parent config when building per-credential embedders. id/provider/model/ api_key/api_base/api_version/ak/sk/region/host/extra_headers remain credential-level.

FailoverEmbedder.get_dimension() returned a hardcoded 2048 when the first embedder had no get_dimension(). That path is reached only when wrapping sparse embedders, which have no fixed dense dimension; returning a fabricated 2048 silently feeds a wrong dimension to callers (e.g. schema creation). Delegate to the first embedder and raise AttributeError when it has no get_dimension(), surfacing the misuse instead of hiding it.

Two issues in the cross-instance token usage merge: 1. Encapsulation: FailoverVLM / MultiCredentialVLM / FailoverEmbedder reached into other instances' private _token_tracker. Add a public token_tracker accessor on VLMBase and use it in the VLM mergers. 2. Double counting in FailoverEmbedder: embedders share a process-wide singleton token tracker (_get_token_tracker), so all wrapped embedders point at the same object. Merging N identical trackers inflated usage N-fold. Return a single instance's usage directly instead of merging.

Splitting 401/403/unauthorized/accountoverdue out of PERMANENT into the new AUTH class (commit b477b9f) regressed the circuit breaker: it only tripped immediately on PERMANENT/QUOTA_EXCEEDED, so auth errors no longer opened the breaker right away. For a single embedding instance an auth failure (key invalid / no permission / overdue) is persistent and retrying is pointless, so the breaker should still trip immediately. Add ERROR_CLASS_AUTH to the immediate-trip set and update the classification tests to assert the new AUTH class (403 still trips the breaker).

Without setting _last_switch_time, maybe_failback() retreats to idx 0 immediately because the default 0 timestamp is always older than the 600s timeout, so the unavailable last credential never actually gets exercised.

github-actions · 2026-06-10T13:47:15Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🏅 Score: 88
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes Sub-PR theme: Add multi-credential failover for embeddings Relevant files: openviking_cli/utils/config/embedding_config.py openviking/models/embedder/init.py openviking/models/embedder/base.py tests/unit/test_embedding_credential_model_override.py tests/unit/test_failover_embedder_permanent_advance.py Sub-PR theme: Add multi-credential failover for VLM Relevant files: openviking_cli/utils/config/vlm_config.py openviking/models/vlm/base.py openviking/models/vlm/init.py tests/unit/test_vlm_failover.py Sub-PR theme: Add ordered credential switcher and error classification improvements Relevant files: openviking/utils/model_retry.py openviking/utils/exceptions.py tests/unit/test_ordered_credential_switcher.py tests/unit/test_model_retry.py tests/unit/test_circuit_breaker.py
⚡ No major issues detected

github-actions · 2026-06-10T13:48:59Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
General	Use project-standard get_logger for MultiCredentialVLM Use the project's standard `get_logger` utility instead of the raw `logging.getLogger` to ensure consistent logging behavior (structured logging, log levels, etc.) across the codebase. openviking/models/vlm/base.py [619-622] self._vlm_instances = vlm_instances self._credential_ids = credential_ids -self._logger = logging.getLogger(__name__) +self._logger = get_logger(__name__) self._switcher = OrderedCredentialSwitcher( Suggestion importance[1-10]: 5 __ Why: The suggestion replaces raw `logging.getLogger` with the project-standard `get_logger` utility (already imported in the file), ensuring consistent logging behavior across the codebase. This is a valid, moderate-impact code quality improvement.	Low

chenxiaobin-lang and others added 20 commits June 3, 2026 10:49

feat: arserRouter: Route parsing requests between ParserRegistry and …

9a4ca36

…KnowledgeParser.

feat: parse_api insert ov_conf

85cf503

feat: drop chunks resp

4fecf4e

feat: implement multi-credential priority call design

40cb319

Add OrderedCredentialSwitcher for N-credential failover, MultiCredentialVLM and FailoverEmbedder for credential switching, support automatic migration from legacy backup config

refactor: deduplicate AllCredentialsFailedError and fix logging in sw…

24a3164

…itcher

chore: fix code formatting for lint compliance

a038d2e

chore: fix remaining code formatting

ae87af6

fix

4d5cd53

fix

e55ef96

fix

81fcb81

test: bump _last_switch_time when forcing active_idx in ring tests

400fa2e

Without setting _last_switch_time, maybe_failback() retreats to idx 0 immediately because the default 0 timestamp is always older than the 600s timeout, so the unavailable last credential never actually gets exercised.

format

9bd1995

fix

0e76bb9

github-project-automation Bot added this to OpenViking project Jun 10, 2026

github-project-automation Bot moved this to Backlog in OpenViking project Jun 10, 2026

chenxiaobin-lang and others added 5 commits June 11, 2026 17:49

feat: match response api (#2559)

4c8299d

feat: tool understanding api

bafc804

feat: add understanding api

9885d24

fix: add dimension valid

4abdd2d

format

f15f882

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/multi credential failover 0.3.x#2554

Feature/multi credential failover 0.3.x#2554
fengluodb wants to merge 25 commits into
release/0.3.xfrom
feature/multi-credential-failover-0.3.x

fengluodb commented Jun 10, 2026

Uh oh!

github-actions Bot commented Jun 10, 2026

Uh oh!

github-actions Bot commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

fengluodb commented Jun 10, 2026

Description

Related Issue

Type of Change

Changes Made

Testing

Checklist

Screenshots (if applicable)

Additional Notes

Uh oh!

github-actions Bot commented Jun 10, 2026

PR Reviewer Guide 🔍

Uh oh!

github-actions Bot commented Jun 10, 2026

PR Code Suggestions ✨

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants