LLMCompletionResponse rejects non-OpenAI service_tier values from LiteLLM providers #2389

@sfgartland

Bug Report

Describe the bug

When using GraphRAG with non-OpenAI LLM providers routed through LiteLLM (e.g. Gemini via OpenRouter), the extract_graph workflow fails with a pydantic validation error:

Error: 1 validation error for LLMCompletionResponse
service_tier
  Input should be 'auto', 'default', 'flex', 'scale' or 'priority'
  For further information visit https://errors.pydantic.dev/2.13/v/literal_error

Root cause

LLMCompletionResponse in graphrag_llm/types/types.py extends openai.types.chat.ChatCompletion, which defines:

service_tier: Optional[Literal["auto", "default", "flex", "scale", "priority"]] = None

In graphrag_llm/completion/lite_llm_completion.py, the LiteLLM ModelResponse is converted via:

return LLMCompletionResponse(**response.model_dump())

When the upstream provider returns a service_tier value outside the OpenAI literal (e.g. Gemini via OpenRouter returns a non-standard string), pydantic rejects it during model construction.
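
The failure mode can be reproduced without GraphRAG or LiteLLM installed. The sketch below is only an illustration: StandInCompletion is a hypothetical stand-in for openai.types.chat.ChatCompletion, and "standard" is a made-up example of a non-OpenAI value, not necessarily the string OpenRouter returns.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError


class StandInCompletion(BaseModel):
    """Hypothetical stand-in for openai.types.chat.ChatCompletion."""

    id: str
    service_tier: Optional[
        Literal["auto", "default", "flex", "scale", "priority"]
    ] = None


def try_build(payload: dict) -> str:
    """Return "ok" if the payload validates, else the pydantic error type."""
    try:
        StandInCompletion(**payload)
    except ValidationError as exc:
        return exc.errors()[0]["type"]
    return "ok"


# A value inside the literal validates; an unknown string does not.
print(try_build({"id": "a", "service_tier": "default"}))   # ok
print(try_build({"id": "b", "service_tier": "standard"}))  # literal_error
```

The literal check runs under pydantic's default (lax) validation, so any provider string outside the five listed values fails at construction time.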

Steps to reproduce

  1. Configure GraphRAG to use a non-OpenAI model via LiteLLM (e.g. gemini/gemini-2.5-flash-preview through OpenRouter)
  2. Run indexing on any document
  3. The create_base_text_units workflow succeeds, but extract_graph fails with the validation error above

Expected behavior

service_tier is an OpenAI-specific field that GraphRAG does not use. Non-standard values should not cause indexing to fail.

Suggested fix

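Any of the following:

  1. Widen the type: override service_tier in LLMCompletionResponse to accept str | None instead of the strict literal
  2. Strip the field: pop service_tier from response.model_dump() before constructing LLMCompletionResponse in lite_llm_completion.py
  3. Normalize the field: add a mode="before" field validator on LLMCompletionResponse that maps unrecognized service_tier values to None. Note that model_validate with strict=False would not help by itself, since lax validation is already the default and still enforces the literal, and ignoring extra fields does not apply to a declared field.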

Option 2 is the smallest change:

# In _base_completion and _base_completion_async:
if isinstance(response, ModelResponse):
    dump = response.model_dump()
    dump.pop("service_tier", None)
    return LLMCompletionResponse(**dump)
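
For comparison, the other two directions could look roughly like the sketch below. It uses hypothetical stand-in classes instead of the real openai and graphrag_llm types, and "standard" is again a made-up provider value. WidenedResponse illustrates option 1, and NormalizingResponse illustrates a validator-based take on option 3.

```python
from typing import Any, Literal, Optional, get_args

from pydantic import BaseModel, field_validator

ServiceTier = Literal["auto", "default", "flex", "scale", "priority"]


class StandInChatCompletion(BaseModel):
    """Hypothetical stand-in for openai.types.chat.ChatCompletion."""

    id: str
    service_tier: Optional[ServiceTier] = None


class WidenedResponse(StandInChatCompletion):
    """Option 1: redeclare the field with a wider type."""

    # Static type checkers may flag this override as incompatible.
    service_tier: Optional[str] = None  # type: ignore[assignment]


class NormalizingResponse(StandInChatCompletion):
    """Option 3 variant: keep the literal, map unknown values to None."""

    @field_validator("service_tier", mode="before")
    @classmethod
    def _drop_unknown_tier(cls, value: Any) -> Any:
        # Runs before the literal check, so unknown strings never reach it.
        return value if value in get_args(ServiceTier) else None


print(WidenedResponse(id="a", service_tier="standard").service_tier)
print(NormalizingResponse(id="b", service_tier="standard").service_tier)
```

Widening preserves whatever the provider sent, while normalizing keeps the declared type identical to the OpenAI one at the cost of discarding the original value.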

Environment

  • graphrag >= 2.0
  • graphrag-llm (latest as of May 2026)
  • LiteLLM routing to Gemini via OpenRouter
  • Python 3.12
  • pydantic 2.13

Workaround

Monkey-patch LLMCompletionResponse at application startup:

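from graphrag_llm.types import LLMCompletionResponse

# Widen the inherited literal so non-OpenAI values validate.
LLMCompletionResponse.model_fields["service_tier"].annotation = str | None
# force=True is required: model_rebuild() is a no-op on a model that is already fully built.
LLMCompletionResponse.model_rebuild(force=True)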
