feat: add LiteLLM generator #458
Open
RheagalFire wants to merge 2 commits into weaviate:main from
Conversation
Orca Security Scan Summary
| Status | Check | Issues by priority | |
|---|---|---|---|
| | Infrastructure as Code | | View in Orca |
| | SAST | | View in Orca |
| | Secrets | | View in Orca |
| | Vulnerabilities | | View in Orca |
To avoid any confusion in the future about your contribution to Weaviate, we work with a Contributor License Agreement. If you agree, you can simply add a comment to this PR that you agree with the CLA so that we can merge.
Author
@thomashacker Requesting a review.
Author
I agree with the CLA.
Summary
- Add `LiteLLMGenerator` in `goldenverba/components/generation/LiteLLMGenerator.py` that routes RAG answers through `litellm.acompletion()` to 100+ providers (OpenAI, Anthropic, Bedrock, Azure, Vertex, Gemini, Ollama, OpenRouter, Groq, DeepSeek, etc.) using provider-native API keys.
- Registered in `goldenverba/components/managers.py` in both the default and hosted generator lists so it shows up in the UI generator dropdown.
- New `litellm` optional extra in `setup.py` so the base install stays lean; users install with `pip install 'goldenverba[litellm]'`.
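
For context, a minimal sketch of what the streaming core of such a generator looks like (an illustration of the approach described above, not the PR's exact code; the helper signature is an assumption):

```python
# Hedged sketch of the generator's core loop; assumes litellm is installed
# via the goldenverba[litellm] extra. Names and signature are illustrative.
import litellm


async def generate_stream(messages, model, api_key=None, api_base=None):
    # Blank credentials are omitted so LiteLLM falls back to provider-specific
    # env vars (ANTHROPIC_API_KEY, AWS_*, AZURE_API_KEY, ...).
    kwargs = {"model": model, "messages": messages, "stream": True}
    if api_key:
        kwargs["api_key"] = api_key
    if api_base:
        kwargs["api_base"] = api_base

    response = await litellm.acompletion(**kwargs)
    async for chunk in response:
        choice = chunk.choices[0]
        # Unwrap streamed deltas into the dict shape Verba's UI expects.
        yield {
            "message": choice.delta.content or "",
            "finish_reason": choice.finish_reason or "",
        }
```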
Prior art
Searched the repo for existing LiteLLM-related work before drafting this PR:
- An earlier attempt predates the current refactor (the `generate_stream` interface did not exist in its current shape).
- An issue from @priamai requested LiteLLM support; the suggested workaround was setting `OPENAI_BASE_URL` so the OpenAI generator hits a LiteLLM proxy. That workaround requires users to run a separate LiteLLM proxy server and still shows up as the OpenAI generator in the UI. A later comment on the same issue (July 2024, @tan-yong-sheng) asked whether LiteLLM embedding models are supported; no follow-up was posted. This PR ships the proper first-class generator that @priamai originally requested (UI model dropdown, no external proxy required).
- No references to `litellm` in the `goldenverba/` source before this PR.

Motivation
Verba currently ships 8 provider-specific generators (OpenAI, Anthropic, Cohere, Gemini, Groq, Novita, Ollama, Upstage). Each new provider requires a new file plus ongoing maintenance. LiteLLM unifies 100+ providers behind a single interface, so users pick any of them with a model-name prefix and LiteLLM handles routing and auth.
Changes
Testing and Usage
1. Unit tests for the new generator: `pytest goldenverba/tests/test_litellm_generator.py -v`. What they cover: the metadata/config surface the UI renders, the `prepare_messages` structure (system message + user query with context), that `litellm.acompletion` receives the model/stream/api_key/api_base the user configured, that blank credentials are omitted (so LiteLLM falls back to provider-specific env vars like `ANTHROPIC_API_KEY`, `AWS_*`), and that a missing extra raises a clean `ImportError` pointing at `goldenverba[litellm]`. (A hedged sketch of one such test appears at the end of this section.)
2. Format check: `black --check` on the three new/modified files -> `3 files would be left unchanged`.
3. Live E2E against Azure OpenAI (exercises the full RAG answer path: UI -> `GeneratorManager.generate_stream` -> `LiteLLMGenerator.generate_stream` -> `litellm.acompletion` -> Azure OpenAI Chat Completions -> streamed token parsing):

```
-> LiteLLM model: azure/gpt-4o
-> api_base: https://.openai.azure.com
[stream] Hello
[stream] !
[stream] Verba
[stream] LiteLLM
[stream] OK
[stream] (finish_reason=stop)
[result] 4/7 chunks captured, finish_reason=stop
[pass] live Azure gpt-4o streamed via LiteLLM OK
```
This proves the integration chain end to end: config is forwarded correctly, `litellm.acompletion` dispatches with the Azure-prefixed model + `api_key` + `api_base`, and streamed deltas are unwrapped into the `{"message", "finish_reason"}` dicts Verba's UI expects.
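
As promised above, a hedged sketch of one of the described unit tests (the `prepare_messages` signature here is an assumption for illustration; the PR's exact method may differ):

```python
# Hedged sketch of a prepare_messages unit test; verifies the
# system-message-plus-user-query-with-context structure described above.
from goldenverba.components.generation.LiteLLMGenerator import LiteLLMGenerator


def test_prepare_messages_structure():
    generator = LiteLLMGenerator()
    messages = generator.prepare_messages(
        query="What is Verba?",
        context="Verba is a RAG application.",
        conversation=[],
        system_message="You are Verba.",
    )
    # System message first, user query + retrieved context last.
    assert messages[0]["role"] == "system"
    assert messages[-1]["role"] == "user"
    assert "What is Verba?" in messages[-1]["content"]
    assert "Verba is a RAG application." in messages[-1]["content"]
```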
Risk / Compatibility

- `OpenAIGenerator`, `AnthropicGenerator`, and all other generators are untouched.
- `litellm` is an optional extra, so base installs are unaffected. When the extra is not installed, the UI availability check (`requires_library=["litellm"]` -> `importlib.import_module` in `verba_manager.verify_installed_libraries`) marks the generator as unavailable, so it is simply hidden from the dropdown. (A sketch of this pattern follows the list.)
- Both generator lists in `managers.py` include `LiteLLMGenerator()`, matching the pattern of the other lightweight generators (Anthropic, Cohere, Upstage).
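
The availability check referenced above boils down to this existing Verba pattern (sketched here for clarity; this is not code added by the PR, and names are simplified):

```python
# Hedged sketch of the requires_library availability check described above.
import importlib


def verify_installed(requires_library: list[str]) -> bool:
    for library in requires_library:
        try:
            importlib.import_module(library)
        except ImportError:
            # e.g. the litellm extra is missing -> generator hidden in the UI
            return False
    return True
```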
Example usage

Install the optional extra:
```
pip install 'goldenverba[litellm]'
```

Export a provider-specific key (LiteLLM auto-resolves per model prefix):

```
export ANTHROPIC_API_KEY=sk-ant-...
```

Or for Azure OpenAI:

```
export AZURE_API_KEY=...
export AZURE_API_BASE=https://.openai.azure.com
export AZURE_API_VERSION=2025-01-01-preview
```
Launch Verba, pick LiteLLM from the Generator dropdown in the UI, and set the Model field to any LiteLLM-supported prefix:
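
For example (illustrative identifiers: `azure/gpt-4o` is the one exercised in the live test above; the rest are typical LiteLLM prefixes, so check LiteLLM's provider docs for exact model names):

```
azure/gpt-4o
anthropic/claude-3-5-sonnet-20240620
gemini/gemini-1.5-pro
groq/llama3-70b-8192
ollama/llama3
```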