fix(stt): STT compatibility fixes for Groq Whisper and AionUI web frontend #400
Open
starm2010 wants to merge 1 commit into
…ontend
1. Key mismatch: get_preferences now queries both 'speechToText' and
'tools.speechToText' with fallback for backward compatibility
2. Multipart field: accept both 'file' and 'audio' field names, parse
filename from Content-Disposition header
3. Double /v1 URL: trim_end_matches('/v1') on base_url before appending
'/v1/audio/transcriptions' to avoid double /v1/v1/
4. Language normalization: strip region codes (en-US → en) for Groq
Whisper which only accepts ISO 639-1 base codes
5. MIME normalization: strip codec params (audio/webm;codecs=opus →
audio/webm) before passing to reqwest mime_str()
6. User language priority: config.language now overrides browser
languageHint so users can set transcription language in UI settings
Also removes MaskedApiKey error variant (ACP masking is by design,
not an error).
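
The three string normalizations in items 3 to 5 can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: the function names `transcription_url`, `normalize_language`, and `normalize_mime` are hypothetical, and the `https://example.com` base URL used in the note below is a placeholder.

```rust
/// Build the transcription endpoint from a configured base URL.
/// Trailing slashes and a trailing `/v1` are stripped first, so a base URL
/// that already ends in `/v1` does not produce `/v1/v1/`.
/// (Hypothetical helper name, for illustration only.)
fn transcription_url(base_url: &str) -> String {
    let base = base_url.trim_end_matches('/').trim_end_matches("/v1");
    format!("{}/v1/audio/transcriptions", base)
}

/// Reduce a locale tag such as `en-US` to its base language code (`en`).
/// `split` always yields at least one piece, so the fallback is never hit
/// in practice; it only avoids an `unwrap`.
fn normalize_language(lang: &str) -> &str {
    lang.split('-').next().unwrap_or(lang).trim()
}

/// Drop MIME parameters such as `;codecs=opus`, keeping only `type/subtype`.
fn normalize_mime(mime_type: &str) -> &str {
    mime_type.split(';').next().unwrap_or(mime_type).trim()
}
```

With these helpers, `https://api.groq.com/openai/v1` and `https://example.com` both map to a URL containing a single `/v1/audio/transcriptions` suffix. Note that `trim_end_matches("/v1")` removes every repeated trailing `/v1`, which is harmless here but worth knowing.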
Problem

When using the AionUI web frontend with Groq Whisper as the STT provider, speech-to-text fails with 400/502 errors. There are four distinct issues:

1. Preference key mismatch: The AionUI web frontend stores STT config under the key `tools.speechToText` in `client_preferences`, but the backend queries only `speechToText`. Since `get_preferences()` filters with `WHERE key IN (...)`, the config row is never found, causing `STT_DISABLED` even when the user has configured STT.
2. Multipart field name mismatch: The web frontend sends audio as FormData field `audio` (via `formData.append('audio', blob, filename)`), but the backend expects field name `file`. This causes a `400 Bad Request: missing 'file' field` error. The frontend also embeds the filename in the blob's Content-Disposition header rather than as a separate multipart field.
3. Double `/v1` in Groq URL: When `base_url` includes `/v1` (e.g. `https://api.groq.com/openai/v1`), the code appends `/v1/audio/transcriptions`, producing `https://api.groq.com/openai/v1/v1/audio/transcriptions` → `502 Bad Gateway`.
4. Language code and MIME type incompatibility: The browser sends `languageHint: "en-US"` but Groq Whisper only accepts ISO 639-1 base codes (e.g. `en`). Similarly, the browser sends `audio/webm;codecs=opus` as the MIME type, but `reqwest::mime_str()` requires clean MIME types without codec parameters.

Solution

1. Key mismatch: Query both `speechToText` and `tools.speechToText` in `get_preferences()`. Use `prefs.get("tools.speechToText").or_else(|| prefs.get("speechToText"))` for backward compatibility.
2. Multipart field: Accept both `"file"` and `"audio"` field names. Parse filename from the Content-Disposition header when the frontend sends it as part of the blob rather than a separate field.
3. Double `/v1`: Add `.trim_end_matches("/v1")` after `.trim_end_matches('/')` when constructing the base URL, before appending `/v1/audio/transcriptions`.
4. Language normalization: Strip region codes via `lang.split('-').next()` (e.g. `en-US` → `en`). User-configured language in settings now takes precedence over browser `languageHint`, so users can override the browser locale for transcription.
5. MIME normalization: Strip codec parameters via `mime_type.split(';').next().trim()` (e.g. `audio/webm;codecs=opus` → `audio/webm`).

Testing

- `curl` tests directly against aioncore `/api/stt` endpoint return 200 OK
- `language: "es"` in STT settings correctly overrides browser locale

Checklist

- Old (`speechToText`) and new (`tools.speechToText`) keys work
- `"file"` and `"audio"` multipart field names accepted

Fixes #373
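
For reviewers, the lookup and precedence rules described in the Solution section can be illustrated roughly as follows. This is a sketch under stated assumptions, not the actual backend code: the preferences are modeled as a plain `HashMap<String, String>`, the function names `stt_config`, `effective_language`, and `is_audio_field` are hypothetical, and treating an empty configured language as "not set" is an assumption rather than something this PR states.

```rust
use std::collections::HashMap;

/// Look up the STT config, preferring the new key and falling back to the
/// legacy one. (Assumes preferences are already loaded into a map.)
fn stt_config(prefs: &HashMap<String, String>) -> Option<&String> {
    prefs
        .get("tools.speechToText")
        .or_else(|| prefs.get("speechToText"))
}

/// Choose the transcription language: the user-configured language wins over
/// the browser hint, and any region suffix is stripped (`en-US` becomes `en`).
/// Assumption: a blank configured language counts as unset.
fn effective_language(config_language: Option<&str>, language_hint: Option<&str>) -> Option<String> {
    config_language
        .filter(|l| !l.trim().is_empty())
        .or(language_hint)
        .map(|l| l.split('-').next().unwrap_or(l).trim().to_string())
        .filter(|l| !l.is_empty())
}

/// Multipart field names accepted for the audio payload.
fn is_audio_field(name: &str) -> bool {
    matches!(name, "file" | "audio")
}
```

Under these assumptions, a settings value of `es` combined with a browser hint of `en-US` resolves to `es`, and the hint alone resolves to `en`, which matches the override behavior listed under Testing.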