[codex] add typed generate_text provider adapters by hynky1999 · Pull Request #145 · macrodata-labs/refiner

Hynek Kydlíček (hynky1999) · 2026-05-21T13:58:07Z

Purpose

Add a Vercel-style typed generate_text inference surface while preserving the existing raw generate escape hatch.

Changes

Add TypedDict message/content-part types for text, image, and file inputs.
Add provider-specific message conversion for OpenAI-compatible chat, OpenAI Responses, Google/Gemini, and Anthropic.
Add native endpoint providers for Google, OpenAI Responses, and Anthropic.
Normalize in-memory media into each provider wire format, including Gemini inlineData for video bytes.
Document the new API and provider-specific providerOptions usage.
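
To illustrate the first, second, and fourth items above, here is a minimal sketch of how TypedDict content parts could be mapped to provider wire formats. The type names (`TextPart`, `ImagePart`, `FilePart`), the `mediaType` field, and both converter functions are hypothetical and are not taken from the PR. Only the target shapes (Gemini `inlineData` with `mimeType`/`data`, and the OpenAI-compatible chat `image_url` data URL) follow the public provider APIs.

```python
import base64
from typing import Any, Literal, TypedDict, Union


class TextPart(TypedDict):
    type: Literal["text"]
    text: str


class ImagePart(TypedDict):
    type: Literal["image"]
    data: bytes       # raw in-memory bytes
    mediaType: str    # hypothetical field name, e.g. "image/png"


class FilePart(TypedDict):
    type: Literal["file"]
    data: bytes
    mediaType: str    # e.g. "video/mp4"


ContentPart = Union[TextPart, ImagePart, FilePart]


def _b64(data: bytes) -> str:
    """Base64-encode bytes as an ASCII string for JSON transport."""
    return base64.b64encode(data).decode("ascii")


def to_gemini_part(part: ContentPart) -> dict[str, Any]:
    """Map a canonical part to a Gemini generateContent part."""
    if part["type"] == "text":
        return {"text": part["text"]}
    # Images and files (including video bytes) both travel as inlineData.
    return {
        "inlineData": {"mimeType": part["mediaType"], "data": _b64(part["data"])}
    }


def to_openai_chat_part(part: ContentPart) -> dict[str, Any]:
    """Map a canonical part to an OpenAI-compatible chat content part."""
    if part["type"] == "text":
        return {"type": "text", "text": part["text"]}
    if part["type"] == "image":
        url = f"data:{part['mediaType']};base64,{_b64(part['data'])}"
        return {"type": "image_url", "image_url": {"url": url}}
    # This sketch deliberately leaves file parts unhandled for chat endpoints.
    raise ValueError(f"unsupported part type for chat: {part['type']}")
```

Keeping one converter per provider means the canonical message types stay provider-agnostic, and each wire format can evolve independently.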

Validation

uv run pytest tests/test_inference.py
uv run ty check
uv run ruff check --force-exclude src/refiner/inference tests/test_inference.py docs/inference.md
uv run pytest (644 passed)

gemini-code-assist

Code Review

This pull request introduces a new generate_text API for typed, multimodal inference, adding native support for Google Gemini, Anthropic Messages, and OpenAI Responses APIs. The changes include a message conversion layer to translate canonical Refiner messages into provider-specific formats, along with new client implementations and updated documentation. Feedback identifies a potential issue where sending providerOptions as a top-level key to OpenAI-compatible endpoints could cause request failures. Additionally, a suggestion was made to reduce code duplication in the Google client by using a shared HTTP helper function.

gemini-code-assist · 2026-05-21T14:05:04Z

+            if providerOptions is not None and not isinstance(
+                provider,
+                GoogleEndpointProvider
+                | AnthropicEndpointProvider
+                | OpenAIResponsesProvider,
+            ):
+                payload["providerOptions"] = providerOptions


For OpenAIEndpointProvider and VLLMProvider, including providerOptions as a top-level key in the request payload is likely to cause 400 Bad Request errors from most OpenAI-compatible endpoints, as they typically do not recognize this field. Since the relevant options (like reasoningEffort) are already extracted and normalized into the payload in previous steps (lines 93-94, 116-117), this assignment should be removed for these providers.
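
One way to address this is sketched below: known options are translated into native request fields, and the raw `providerOptions` mapping is never forwarded. The function name, the `openai` namespace key inside `providerOptions`, and the payload layout are assumptions for illustration and do not reflect the PR's actual code; `reasoning_effort` is the field name used by the OpenAI chat completions API.

```python
from typing import Any, Mapping, Optional, Sequence


def build_openai_compatible_payload(
    model: str,
    messages: Sequence[Mapping[str, Any]],
    provider_options: Optional[Mapping[str, Any]] = None,
) -> dict[str, Any]:
    """Build a chat payload without leaking providerOptions to the endpoint."""
    payload: dict[str, Any] = {"model": model, "messages": list(messages)}

    # Hypothetical layout: options are namespaced per provider, Vercel-style.
    openai_options = (provider_options or {}).get("openai", {})

    # Translate the options we understand into native request fields.
    effort = openai_options.get("reasoningEffort")
    if effort is not None:
        payload["reasoning_effort"] = effort

    # Note: no payload["providerOptions"] assignment. Unknown top-level keys
    # may be rejected by strict OpenAI-compatible servers.
    return payload
```

With this shape, unrecognized options are silently dropped; an implementation could instead raise or log for unknown keys if stricter behavior is preferred.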

gemini-code-assist · 2026-05-21T14:05:04Z

+    async def generate_text(self, payload: Mapping[str, Any]) -> InferenceResponse:
+        response_json = await self._post_json(
+            f"{_google_model_path(self.model)}:generateContent",
+            payload,
+            operation="google generation",
+        )
+        if not isinstance(response_json, Mapping):
+            raise RuntimeError("google generation response must be a JSON object")
+        return _parse_google_inference_response(response_json)
+
+    async def _post_json(
+        self,
+        endpoint_path: str,
+        payload: Mapping[str, Any],
+        *,
+        operation: str,
+    ) -> Any:
+        client = self._ensure_client()
+        for attempt in range(_OPENAI_ENDPOINT_MAX_RETRIES):
+            try:
+                response = await client.post(endpoint_path, json=dict(payload))
+                break
+            except (
+                ConnectionError,
+                OSError,
+                asyncio.TimeoutError,
+                httpx.NetworkError,
+                httpx.TimeoutException,
+            ) as err:
+                if attempt + 1 >= _OPENAI_ENDPOINT_MAX_RETRIES:
+                    message = (
+                        f"{operation} request failed after "
+                        f"{_OPENAI_ENDPOINT_MAX_RETRIES} attempts: "
+                        f"{type(err).__name__}: {err}"
+                    )
+                    raise RuntimeError(message) from err
+                await asyncio.sleep(_retry_delay_seconds(attempt))
+        else:
+            raise RuntimeError(f"{operation} request failed without a response")
+        try:
+            response.raise_for_status()
+        except httpx.HTTPStatusError as err:
+            detail = ""
+            try:
+                detail = str(err.response.json())
+            except ValueError:
+                detail = err.response.text.strip()
+            message = f"{operation} request failed with HTTP {err.response.status_code}"
+            if detail:
+                message = f"{message}: {detail}"
+            raise RuntimeError(message) from err
+        return response.json()


The _post_json method in _GoogleEndpointClient is identical to the _post_json_with_retries helper function defined later in this file. To improve maintainability and reduce code duplication, _GoogleEndpointClient should use the helper function.

Suggested change

Replace the generate_text and _post_json methods shown above with:

    async def generate_text(self, payload: Mapping[str, Any]) -> InferenceResponse:
        response_json = await _post_json_with_retries(
            self._ensure_client(),
            f"{_google_model_path(self.model)}:generateContent",
            payload,
            operation="google generation",
        )
        if not isinstance(response_json, Mapping):
            raise RuntimeError("google generation response must be a JSON object")
        return _parse_google_inference_response(response_json)
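
The retry loop that the review asks to share can be sketched independently of httpx. The version below is a simplified, hypothetical helper (`call_with_retries`, `send`, and `base_delay` are illustrative names, not the PR's `_post_json_with_retries`); it assumes exponential backoff, whereas the real `_retry_delay_seconds` is not shown in this thread.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def call_with_retries(
    send: Callable[[], Awaitable[Any]],
    *,
    operation: str,
    max_retries: int = 3,
    base_delay: float = 0.0,
) -> Any:
    """Await send() up to max_retries times, retrying on transient errors."""
    for attempt in range(max_retries):
        try:
            return await send()
        except (ConnectionError, OSError, asyncio.TimeoutError) as err:
            if attempt + 1 >= max_retries:
                # Out of attempts: surface a single descriptive error.
                raise RuntimeError(
                    f"{operation} request failed after {max_retries} attempts: "
                    f"{type(err).__name__}: {err}"
                ) from err
            # Assumed backoff policy: base_delay * 2**attempt seconds.
            await asyncio.sleep(base_delay * (2 ** attempt))
    # Only reachable when max_retries <= 0, so the loop body never ran.
    raise RuntimeError(f"{operation} request failed without a response")
```

Because the helper takes the request as a callable, each provider client can pass its own `client.post(...)` closure and keep only response parsing in the client class.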

add typed generate_text provider adapters

13a1e79

gemini-code-assist Bot reviewed May 21, 2026

Hynek Kydlíček (hynky1999) added 3 commits May 21, 2026 16:14

resolve media types for inference files

a3858f3

add inference response content parts

609d62a

support assistant history content parts

de0af3f

Hynek Kydlíček (hynky1999) wants to merge 4 commits into main from codex/add-generate-text-provider-adapters