Speech Engine SDK #771
Merged
3 tasks
kraenhansen
reviewed
Apr 28, 2026
kraenhansen
reviewed
Apr 28, 2026
kraenhansen
approved these changes
Apr 30, 2026
Member
kraenhansen
left a comment
I have a few more comments 🙈 Good to merge as-is 👍
Rename client_options parameter to client_wrapper for consistency with Fern SDK conventions. Use get_headers() instead of accessing private _api_key attribute. Update tests to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use a dedicated _in_transcript_handler flag instead of checking _current_event_id is None, which conflated "no handler running" with "handler running but transcript had no event_id". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
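A minimal sketch of why the dedicated flag matters, assuming a simplified session object. `SessionSketch` and `handle_transcript` are hypothetical names for illustration; only `_in_transcript_handler` and `_current_event_id` come from the commit message.

```python
import asyncio
from typing import Optional, Tuple


class SessionSketch:
    """Hypothetical stand-in for the session; only the two fields matter."""

    def __init__(self) -> None:
        self._current_event_id: Optional[int] = None
        self._in_transcript_handler = False

    async def handle_transcript(self, event_id: Optional[int]) -> Tuple[bool, bool]:
        # Set the flag whether or not the transcript carried an event_id.
        self._in_transcript_handler = True
        self._current_event_id = event_id
        try:
            await asyncio.sleep(0)  # placeholder for the user's handler
            new_check = self._in_transcript_handler
            # The old check inferred "handler running" from the event id.
            old_check = self._current_event_id is not None
            return new_check, old_check
        finally:
            # Always clear both, even if the handler raised.
            self._in_transcript_handler = False
            self._current_event_id = None
```

With a transcript that has no `event_id`, the flag reports that a handler is running while the old `is None` test reports that it is not.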
Move the log call after the try/except and use `or []` to handle null transcript_data, preventing len(None) TypeError. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
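The pattern described here might look roughly like the following. The message shape, the `"transcript"` key, and the function name `count_transcript_items` are assumptions for illustration, not the SDK's actual code.

```python
import logging

logger = logging.getLogger("speech_engine_sketch")


def count_transcript_items(message: dict) -> int:
    """Return how many transcript items a message carries (0 if none)."""
    transcript_data = None
    try:
        transcript_data = message["transcript"]
    except KeyError:
        logger.warning("message has no transcript field")
    # `or []` turns both a missing field and an explicit null into an
    # empty list, so len() below can never see None.
    items = transcript_data or []
    # Logging happens after the try/except, on the normalized value.
    logger.debug("received %d transcript items", len(items))
    return len(items)
```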
The server uses websockets 13.x API (websocket.request.path, websocket.request.headers) which doesn't exist in 11.x-12.x. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Without this, the run() loop continues calling recv() after a close message, processing stale messages and reporting is_open as True. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
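As a rough illustration of the fix, the sketch below replaces the WebSocket with an `asyncio.Queue`. The `"close"` message type, `LoopSketch`, and `demo` are hypothetical; only `run()`, `recv()`, and `is_open` are named in the commit message.

```python
import asyncio
import json
from typing import List, Tuple


class LoopSketch:
    """Hypothetical receive loop; a queue stands in for the socket."""

    def __init__(self, incoming: "asyncio.Queue[str]") -> None:
        self._incoming = incoming
        self.is_open = True
        self.handled: List[str] = []

    async def run(self) -> None:
        while self.is_open:
            raw = await self._incoming.get()  # stands in for recv()
            message = json.loads(raw)
            if message.get("type") == "close":
                # Mark the session closed and stop reading, so later
                # (stale) messages are never processed.
                self.is_open = False
                break
            self.handled.append(message["type"])


async def demo() -> Tuple[List[str], bool, int]:
    queue: "asyncio.Queue[str]" = asyncio.Queue()
    for kind in ("transcript", "close", "stale"):
        queue.put_nowait(json.dumps({"type": kind}))
    loop = LoopSketch(queue)
    await loop.run()
    return loop.handled, loop.is_open, queue.qsize()
```

In the demo, the message queued after the close is left unread and `is_open` ends up `False`.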
Use websockets process_request callback to verify JWT before the WebSocket handshake completes. Unauthenticated requests now get a plain HTTP 401 instead of completing the upgrade first. Also include exception in logger.exception error handler message. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
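The pre-handshake check could be sketched with the standard library alone, as below. All function names here, the use of an `Authorization: Bearer` header, and the `exp` handling are assumptions for illustration; the real server may read the token from elsewhere. Returning `None` to let the handshake continue mirrors how a `process_request` callback signals "proceed".

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Mapping, Optional, Tuple


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: str) -> str:
    """Demo helper only: build a compact HS256 token."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    mac = hmac.new(secret.encode(), f"{header}.{payload}".encode("ascii"), hashlib.sha256)
    return f"{header}.{payload}.{_b64url_encode(mac.digest())}"


def verify_hs256(token: str, secret: str, now: Optional[float] = None) -> bool:
    """Check signature, algorithm, and (if present) the exp claim."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError:  # wrong segment count, bad base64, or bad JSON
        return False
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return False
    if header.get("alg") != "HS256":  # refuse "none" and other algorithms
        return False
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time compare
        return False
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if isinstance(exp, (int, float)) and current >= exp:
        return False
    return True


def pre_handshake_check(
    headers: Mapping[str, str], secret: str, now: Optional[float] = None
) -> Optional[Tuple[int, str]]:
    """Return (401, body) to reject before the upgrade, or None to proceed."""
    auth = headers.get("Authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme != "Bearer" or not verify_hs256(token, secret, now):
        return 401, "Unauthorized"
    return None
```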
json.loads can return lists, strings, numbers, or null. These would crash _handle_message which expects a dict. Now emits an error event and continues instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
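A minimal sketch of the guard, assuming callbacks for dispatch and error reporting. `dispatch_raw`, `handle_message`, and `emit_error` are hypothetical names; the actual session emits an error event through its own mechanism.

```python
import json
from typing import Callable


def dispatch_raw(
    raw: str,
    handle_message: Callable[[dict], None],
    emit_error: Callable[[str], None],
) -> bool:
    """Parse one frame; return True only if it was dispatched."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        emit_error(f"invalid JSON: {exc}")
        return False
    # Valid JSON is not necessarily an object: [1, 2], "text", 3 and
    # null all parse successfully but are not dicts.
    if not isinstance(parsed, dict):
        emit_error(f"expected a JSON object, got {type(parsed).__name__}")
        return False
    handle_message(parsed)
    return True
```

The caller keeps reading after a `False` return, which matches the "emits an error event and continues" behavior described above.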
The string response path called _send_agent_response twice without an explicit event_id, reading self._current_event_id at each call. An interruption between the two awaits could stamp the terminator with the wrong event_id. Now captures event_id once and passes it through, matching the stream path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
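The race and its fix can be illustrated with a small sketch. `ResponderSketch`, `send_string_response`, the `is_final` parameter, and the empty-text terminator are assumptions for illustration; only `_send_agent_response` and `_current_event_id` appear in the commit message.

```python
import asyncio
from typing import List, Tuple


class ResponderSketch:
    """Hypothetical responder that records what it would send."""

    def __init__(self) -> None:
        self._current_event_id = 1
        self.sent: List[Tuple[int, str, bool]] = []

    async def _send_agent_response(self, text: str, is_final: bool, event_id: int) -> None:
        await asyncio.sleep(0)  # yield point where an interruption can land
        self.sent.append((event_id, text, is_final))

    async def send_string_response(self, text: str) -> None:
        # Capture once, before the first await, and pass it to both sends.
        event_id = self._current_event_id
        await self._send_agent_response(text, False, event_id)
        await self._send_agent_response("", True, event_id)  # terminator


async def demo() -> List[Tuple[int, str, bool]]:
    responder = ResponderSketch()
    task = asyncio.create_task(responder.send_string_response("hello"))
    await asyncio.sleep(0)           # let the first send begin
    responder._current_event_id = 2  # simulate an interruption mid-response
    await task
    return responder.sent
```

Even though `_current_event_id` changes between the two awaits, both the text chunk and the terminator carry the event id captured at the start.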
Replace the stub _AsyncSpeechEngineAccessor with custom SpeechEngineClient/AsyncSpeechEngineClient classes that extend the Fern-generated clients and add resource() for WebSocket server setup. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
6ee7e37 to 7cf57a1
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit ebb5053.
Note
Medium Risk
Adds a new async WebSocket server/session layer plus JWT verification for inbound Speech Engine connections and bumps the `websockets` dependency, so runtime behavior and compatibility may change even though changes are largely additive.
Overview
Introduces a new `speech_engine` module enabling server-side voice agents: a `SpeechEngineServer` (standalone WebSocket server) and `SpeechEngineSession` (event-emitter style session that streams `agent_response` messages, supports interruption/cancellation, and auto-parses common LLM streaming formats).
Extends the generated Speech Engine clients with `SpeechEngineResource` wrappers returned from `create`/`get`/`update`, adds `speech_engine` accessors to `ElevenLabs`/`AsyncElevenLabs`, and documents usage in `README.md`.
Adds HS256 JWT request verification for inbound connections (used by `SpeechEngineServer` and exposed via `SpeechEngineResource.verify_request`), comprehensive new tests for auth/session/server behavior, updates `.fernignore` to protect the new module, and bumps `websockets` to `>=13.0`.
Reviewed by Cursor Bugbot for commit dc02e44. Bugbot is set up for automated code reviews on this repo.