Merged
6 changes: 5 additions & 1 deletion api-reference/v2/pre-recorded/init.mdx
@@ -2,4 +2,8 @@
title: Initiate a transcription
description: Initiate a pre-recorded transcription job. Use the returned `id` and the [GET /v2/pre-recorded/:id](/api-reference/v2/pre-recorded/get) endpoint to obtain the results.
openapi: POST /v2/pre-recorded
---
---

import ChooseModel from "/snippets/choose-model.mdx";

<ChooseModel />
4 changes: 4 additions & 0 deletions chapters/live-stt/quickstart.mdx
@@ -1,12 +1,16 @@
---
title: Quickstart
description: How to transcribe live audio with Gladia's Real-time speech-to-text (STT) API

---

import Samples from '/snippets/samples.mdx';
import PartialsTip from '/snippets/partials-tip.mdx';
import WhyPostToOpenWebSocket from '/snippets/why-post-to-open-websocket.mdx';

<Info>
Live transcription supports **`"solaria-1"` only**.
</Info>

<Tabs>
<Tab title="Using our SDKs" icon="code">

@@ -175,7 +179,7 @@
</CodeGroup>

<Note>
A single real-time transcription session cannot exceed **3 hours**. For longer events, start a new session before reaching the limit. See [Concurrency and rate limits](/chapters/limits-and-specifications/concurrency) and [Supported files & duration](/chapters/limits-and-specifications/supported-formats) for details.

</Note>

## Read messages
@@ -403,7 +407,7 @@
</CodeGroup>

<Note>
A single real-time transcription session cannot exceed **3 hours**. For longer events, start a new session before reaching the limit. See [Concurrency and rate limits](/chapters/limits-and-specifications/concurrency) and [Supported files & duration](/chapters/limits-and-specifications/supported-formats) for details.

</Note>

## Read messages
@@ -476,7 +480,7 @@

#### Example use case

Consider a scenario with three participants in a room: Sami, Maxime, and Mark. Instead of opening three separate WebSocket connections (one for each participant), you can merge their audio tracks and send them over a single WebSocket:

1. Collect audio buffers from each participant
2. Merge them into a single multi-channel audio stream using the `interleaveAudio` function
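
The merge step can be sketched as follows. The docs name an `interleaveAudio` helper but this diff does not show its implementation, so the Python function `interleave_audio` below is a hypothetical equivalent, not the SDK's code. It assumes every participant's buffer is raw PCM with the same sample width (16-bit by default) and the same length.

```python
def interleave_audio(channels_data: list[bytes], bytes_per_sample: int = 2) -> bytes:
    """Interleave mono PCM buffers into one multi-channel buffer.

    Hypothetical helper: assumes equal-length raw PCM buffers that share
    one sample width. Channel order follows the order of `channels_data`.
    """
    if not channels_data:
        return b""

    length = len(channels_data[0])
    if any(len(buf) != length for buf in channels_data):
        raise ValueError("all channel buffers must have the same length")
    if length % bytes_per_sample != 0:
        raise ValueError("buffer length must be a multiple of the sample width")

    merged = bytearray()
    # Emit one sample per channel, frame by frame: ch0, ch1, ch2, ch0, ...
    for offset in range(0, length, bytes_per_sample):
        for buf in channels_data:
            merged += buf[offset:offset + bytes_per_sample]
    return bytes(merged)


# Two 16-bit samples per participant, listed in channel order.
sami = b"\x01\x00\x02\x00"
maxime = b"\x03\x00\x04\x00"
mark = b"\x05\x00\x06\x00"
merged = interleave_audio([sami, maxime, mark])
```

Because the output is frame-interleaved, the position of a buffer in the input list determines its channel index in the merged stream.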
@@ -524,7 +528,7 @@

#### Understanding the response

When you send a multi-channel audio stream to Gladia, the channel order is preserved in the transcription results. Each transcription message will include a `channel` field that indicates which audio channel (and thus which participant) the transcription belongs to:

```json
{
@@ -582,8 +586,8 @@

The channel numbers directly correspond to the order in which you added the audio tracks to the `channelsData` array:

- Channel 0 → Sami (first in the array)
- Channel 1 → Maxime (second in the array)
- Channel 2 → Mark (third in the array)

<Warning>
33 changes: 21 additions & 12 deletions chapters/pre-recorded-stt/quickstart.mdx
@@ -1,10 +1,11 @@
---
title: Quickstart
description: How to transcribe pre-recorded audio with Gladia's speech-to-text (STT) API

---

import GetTranscriptionResult from "/snippets/get-transcription-result.mdx";
import Samples from "/snippets/samples.mdx";
import ChooseModel from "/snippets/choose-model.mdx";

<Tabs>
<Tab title="Using our SDKs" icon="code">
@@ -13,6 +14,8 @@
- A `transcribe()` method for an end-to-end flow
- Individual steps when you need control over each step.

<ChooseModel />

## Install the SDK

<CodeGroup>
@@ -73,9 +76,9 @@
const transcription = await gladiaClient.preRecorded().transcribe(
"YOUR_AUDIO_URL_OR_LOCAL_PATH",
{
+ model: "solaria-3",
  language_config: {
- languages: ["en", "fr"],
- code_switching: true,
+ languages: ["fr"],
},
custom_vocabulary: true,
custom_vocabulary_config: {
@@ -93,9 +96,9 @@
transcription = gladia_client.transcribe(
"YOUR_AUDIO_URL_OR_LOCAL_PATH",
{
+ "model": "solaria-3",
  "language_config": {
- "languages": ["en", "fr"],
- "code_switching": True,
+ "languages": ["fr"],
},
"custom_vocabulary": True,
"custom_vocabulary_config": {
@@ -106,6 +109,10 @@
```
</CodeGroup>

<Note>
With `"solaria-3"`, set **one language** in `language_config.languages` — for example `["fr"]`. Do not pass multiple languages or enable code switching.
</Note>
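
The constraint in the note above can be checked on the client before a job is submitted. This is a minimal sketch, assuming the options are passed as a plain dictionary shaped like the examples in this diff. `validate_solaria_3_config` is a hypothetical helper and is not part of the Gladia SDK.

```python
def validate_solaria_3_config(options: dict) -> None:
    """Raise ValueError if `options` breaks the solaria-3 language rules.

    Hypothetical client-side check: one language, no code switching.
    Options for any other model are left unchecked.
    """
    if options.get("model") != "solaria-3":
        return

    language_config = options.get("language_config", {})
    languages = language_config.get("languages", [])

    if len(languages) != 1:
        raise ValueError(
            'solaria-3 requires exactly one entry in "language_config.languages"'
        )
    if language_config.get("code_switching"):
        raise ValueError("solaria-3 does not support code switching")


# Passes: a single language and no code switching.
validate_solaria_3_config(
    {"model": "solaria-3", "language_config": {"languages": ["fr"]}}
)
```

Failing early on the client avoids a round trip for a request that the note says should not be sent.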

<Note>
Want to go further? See [Audio Intelligence](/chapters/pre-recorded-stt/audio-intelligence) for add-ons like:
- [Speaker diarization](/chapters/audio-intelligence/speaker-diarization): separate the speakers across the conversation
@@ -174,9 +181,9 @@

const job = await gladiaClient.preRecorded().createUntyped({
audio_url: "YOUR_AUDIO_URL",
+ model: "solaria-3",
  language_config: {
- languages: ["en", "fr"],
- code_switching: true,
+ languages: ["fr"],
},
custom_vocabulary: true,
custom_vocabulary_config: {
@@ -193,9 +200,9 @@
job = gladia_client.create(
{
"audio_url": "YOUR_AUDIO_URL",
+ "model": "solaria-3",
  "language_config": {
- "languages": ["en", "fr"],
- "code_switching": True,
+ "languages": ["fr"],
},
"custom_vocabulary": True,
"custom_vocabulary_config": {
@@ -219,6 +226,8 @@
</Tab>
<Tab title="Using the API" icon="brackets-curly">

<ChooseModel />

## Individual steps

Upload audio, create a transcription job, then poll until the job is done (or use webhooks or a callback URL).
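
The polling step can be sketched independently of any HTTP client. In the sketch below, `fetch_job` is a hypothetical callable that would wrap a `GET /v2/pre-recorded/:id` request and return the parsed JSON body. The `"done"` and `"error"` status strings, the interval, and the timeout are assumptions made for illustration, so check them against the API reference before relying on them.

```python
import time
from typing import Callable


def wait_for_result(
    fetch_job: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `fetch_job` until the job finishes, fails, or times out.

    `fetch_job` is a hypothetical wrapper around GET /v2/pre-recorded/:id.
    The "done" and "error" status values are assumed, not taken from this diff.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        status = job.get("status")
        if status == "done":
            return job
        if status == "error":
            raise RuntimeError(f"transcription failed: {job}")
        if time.monotonic() >= deadline:
            raise TimeoutError("transcription did not finish before the timeout")
        # Wait before the next request to avoid hammering the API.
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. Webhooks or a callback URL remove the need for this loop entirely.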
@@ -270,9 +279,9 @@
},
body: JSON.stringify({
audio_url: "YOUR_AUDIO_URL",
+ model: "solaria-3",
  language_config: {
- languages: [],
- code_switching: false,
+ languages: ["fr"],
},
diarization: true,
diarization_config: {
@@ -311,9 +320,9 @@
--header 'x-gladia-key: YOUR_GLADIA_API_KEY' \
--data '{
"audio_url": "YOUR_AUDIO_URL",
+ "model": "solaria-3",
  "language_config": {
- "languages": [],
- "code_switching": false
+ "languages": ["fr"]
},
"diarization": true,
"diarization_config": {
16 changes: 16 additions & 0 deletions snippets/choose-model.mdx
@@ -0,0 +1,16 @@
<Tip>
Pass `model` to choose the transcription model:

**`"solaria-3"`** — our latest model: highest accuracy on European real-world audio.
- **Async (pre-recorded) only** — not available for live transcription.
- **Languages:** English, French, German, Spanish, Italian
- **Single language only** — pass exactly one language in `language_config.languages` (no code switching).
- All Audio Intelligence add-ons available.

**`"solaria-1"`** — our generalist model: maximum language coverage across any domain.
- Available for async and live.
- Code switching and multi-language configuration (100+ languages covered)
- All Audio Intelligence add-ons available.

If `model` is omitted, the API uses the default model, `"solaria-1"`.
</Tip>
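
The selection rules in this tip can be expressed as a small decision function. This is a sketch only: `choose_model` is a hypothetical helper, and the two-letter codes in `SOLARIA_3_LANGUAGES` are assumed mappings for the five languages listed above, based on the `"en"` and `"fr"` codes used elsewhere in this diff.

```python
# Assumed language codes for English, French, German, Spanish, Italian.
SOLARIA_3_LANGUAGES = {"en", "fr", "de", "es", "it"}


def choose_model(
    languages: list[str],
    live: bool = False,
    code_switching: bool = False,
) -> str:
    """Pick a model name from the rules in the tip above (hypothetical helper)."""
    # solaria-3 is async only and takes exactly one language, no code switching.
    if live or code_switching or len(languages) != 1:
        return "solaria-1"
    if languages[0] in SOLARIA_3_LANGUAGES:
        return "solaria-3"
    # Any other language falls back to the generalist model.
    return "solaria-1"


model = choose_model(["fr"])
```

Falling back to `"solaria-1"` mirrors the API default described in the tip, so a request built this way behaves the same as one that omits `model` whenever `"solaria-3"` does not apply.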