Support Realtime custom voice objects by lionel-oai · Pull Request #3473 · openai/openai-agents-python

lionel-oai · 2026-05-20T15:51:17Z

Summary

This PR fixes Realtime custom voice handling in the Agents SDK.

Realtime sessions can receive and send structured custom voice objects such as {"id": "voice_..."}, but the SDK previously typed voice settings as strings and validated inbound server events before updating response lifecycle state. If a server event such as response.created or response.done contained a structured voice object that failed validation, the SDK could skip response state updates and leave the response-create sequencer blocked. That could prevent the next response.create from being sent after tool output.

The change adds typed support for custom voice objects in Realtime session settings, preserves structured voices when building outbound session.update payloads, and adds a validation fallback for inbound server events so custom voice objects do not break response lifecycle tracking.
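To make the validation fallback concrete, here is a minimal sketch of one way to normalize structured voice objects before validation. It is not the PR's implementation: the helper name `normalize_voice_for_validation`, the choice to collapse `{"id": ...}` to its id string, and the nested event shape are all assumptions made for illustration, and `voice_123` is a made-up id.

```python
from typing import Any


def normalize_voice_for_validation(event: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `event` in which every {"id": "<str>"} value stored
    under a "voice" key is replaced by the id string.

    Hypothetical helper: it only illustrates the idea of making an event
    acceptable to a validator that expects string voices. The input event
    is never mutated.
    """

    def walk(node: Any) -> Any:
        if isinstance(node, dict):
            out: dict[str, Any] = {}
            for key, value in node.items():
                is_voice_object = (
                    key == "voice"
                    and isinstance(value, dict)
                    and isinstance(value.get("id"), str)
                )
                # Collapse the structured voice; recurse into everything else.
                out[key] = value["id"] if is_voice_object else walk(value)
            return out
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return walk(event)


# Assumed event shape, for illustration only.
event = {
    "type": "response.created",
    "response": {"audio": {"output": {"voice": {"id": "voice_123"}}}},
}
normalized = normalize_voice_for_validation(event)
```

In this sketch only the copy passed to the validator is normalized, so the structured voice on the original event is still available to the rest of the pipeline.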

Tests

make format
make lint
uv run pytest -q tests/realtime/test_openai_realtime.py tests/realtime/test_realtime_model_settings.py
uv run pytest -q tests/realtime/test_session.py -k "handoff_session_update_preserves_custom_voice or handoff_tool_handling"
uv run mypy src/agents/realtime/config.py src/agents/realtime/openai_realtime.py tests/realtime/test_openai_realtime.py
uv run pyright src/agents/realtime/config.py src/agents/realtime/openai_realtime.py tests/realtime/test_openai_realtime.py
uv run mypy tests/realtime/test_session.py
uv run pyright tests/realtime/test_session.py

Full make tests / make typecheck were not completed locally because optional dependency installation was blocked by a socket-firewall tunnel failure while downloading docstring-parser==0.18.0.

seratch · 2026-05-20T23:25:12Z

+    return normalized
+
+
+def _create_realtime_audio_output(audio_output_args: dict[str, Any]) -> Any:


If we upgrade the openai package to openai>=2.36.0, this workaround is no longer necessary, whereas _normalize_custom_voice_for_server_event_validation is still required even with the latest version.

Can you add quick TODO comments to these internal workarounds explaining why they exist and when they can be removed?

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 393c53087a


chatgpt-codex-connector · 2026-05-22T15:41:38Z

            if "previous_item_id" in event and event["previous_item_id"] is None:
                event["previous_item_id"] = ""  # TODO (rm) remove
-            parsed: AllRealtimeServerEvents = self._server_event_type_adapter.validate_python(event)
+            validation_event = _normalize_custom_voice_for_server_event_validation(event)


Limit voice normalization to events that can contain voice objects

_normalize_custom_voice_for_server_event_validation is applied to every inbound WebSocket event before validation, including high-frequency streaming events like response.output_audio.delta. In long audio turns this adds an extra full recursive walk/allocation per event even when no voice field exists, which can unnecessarily increase CPU/GC pressure and degrade realtime playback latency. Since the workaround is only needed for server events carrying session/response voice settings, scope it to those event types (or fast-path when no voice key is present).
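The scoping suggested above could look roughly like the following sketch. The set of event types and the wrapper name `prepare_for_validation` are assumptions chosen for illustration, not the SDK's code; the normalizer is passed in as a parameter so the sketch does not depend on how the real helper is written.

```python
from typing import Any, Callable

# Assumption: only these server events can carry session/response voice settings.
VOICE_BEARING_EVENT_TYPES = frozenset(
    {"session.created", "session.updated", "response.created", "response.done"}
)


def prepare_for_validation(
    event: dict[str, Any],
    normalize: Callable[[dict[str, Any]], dict[str, Any]],
) -> dict[str, Any]:
    """Run `normalize` only for event types that may contain a voice object.

    High-frequency events such as response.output_audio.delta are returned
    unchanged, so they incur no recursive walk or extra allocation.
    """
    if event.get("type") in VOICE_BEARING_EVENT_TYPES:
        return normalize(event)
    return event
```

A cheap membership check on the event type keeps the hot path for audio deltas allocation-free, at the cost of maintaining the list of voice-bearing event types by hand.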


lionel-oai force-pushed the fix/realtime-custom-voice branch 2 times, most recently from eed10dc to 20e7135 Compare May 20, 2026 18:42

seratch added the feature:realtime label May 20, 2026

seratch requested changes May 20, 2026


seratch added this to the 0.17.x milestone May 20, 2026

Support Realtime custom voice objects

393c530

lionel-oai force-pushed the fix/realtime-custom-voice branch from 20e7135 to 393c530 Compare May 22, 2026 15:37

chatgpt-codex-connector Bot reviewed May 22, 2026


lionel-oai wants to merge 1 commit into main from fix/realtime-custom-voice


2 participants
