
Realtime Transcript

Stream live meeting transcripts with speaker identification as the meeting happens.

Overview

When realtimeTranscript is enabled on a bot, the bot streams audio to Gladia's Live API during the meeting and delivers labeled transcript chunks in realtime via WebSocket. Speaker names are automatically matched to transcript segments using the bot's participant detection.

Enabling Realtime Transcript

Set realtimeTranscript: true when creating a bot:

curl -X POST https://botapi.syntrimeet.com/api/v1/bots \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "botDisplayName": "SyntriBot",
    "meetingInfo": {
      "platform": "google",
      "meetingUrl": "https://meet.google.com/abc-defg-hij"
    },
    "realtimeTranscript": true
  }'
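The same request can be issued from Python. This sketch mirrors the curl call above using only the standard library; the helper name and the deferred `urlopen` call are illustrative, not part of the API:

```python
import json
import urllib.request

def build_create_bot_request(api_key: str, meeting_url: str) -> urllib.request.Request:
    """Build the POST request that creates a bot with realtime transcription enabled."""
    payload = {
        "botDisplayName": "SyntriBot",
        "meetingInfo": {
            "platform": "google",
            "meetingUrl": meeting_url,
        },
        # Enables streaming of labeled transcript chunks over the live WebSocket
        "realtimeTranscript": True,
    }
    return urllib.request.Request(
        "https://botapi.syntrimeet.com/api/v1/bots",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_create_bot_request("YOUR_API_KEY", "https://meet.google.com/abc-defg-hij")
# To actually send it: urllib.request.urlopen(req)
```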

Subscribing to Live Transcript

Connect to the WebSocket endpoint to receive transcript chunks in realtime.

Endpoint

wss://botapi.syntrimeet.com/api/v1/bots/{botId}/transcript/live?token=JWT_TOKEN

Authentication

Method      How
JWT Token   Pass as query parameter: ?token=eyJ...
API Key     Send in header: X-API-Key: sk_...
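The two authentication shapes can be sketched as follows. How the header is actually attached depends on your WebSocket client library, so this only builds the URL and header values; the helper names are illustrative:

```python
# Base URL for the live-transcript WebSocket, from the Endpoint section above.
BASE = "wss://botapi.syntrimeet.com/api/v1/bots/{bot_id}/transcript/live"

def jwt_connection(bot_id: int, token: str) -> str:
    """JWT auth: the token travels as a query parameter."""
    return f"{BASE.format(bot_id=bot_id)}?token={token}"

def api_key_connection(bot_id: int, api_key: str) -> tuple:
    """API-key auth: the key travels in the X-API-Key request header."""
    return BASE.format(bot_id=bot_id), {"X-API-Key": api_key}
```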

JavaScript Example

const botId = 134;
const token = "your-jwt-token";

const ws = new WebSocket(
  `wss://botapi.syntrimeet.com/api/v1/bots/${botId}/transcript/live?token=${token}`
);

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);

  switch (message.type) {
    case "transcript":
      console.log(
        `[${message.data.speaker.name || 'Speaker ' + message.data.speaker.id}]: ${message.data.transcript}`
      );
      break;

    case "speaker_update":
      console.log("Speaker mappings updated:", message.data.mappings);
      break;

    case "stream_ended":
      console.log("Meeting ended:", message.data.reason);
      ws.close();
      break;
  }
};

ws.onclose = () => console.log("Disconnected");
ws.onerror = (err) => console.error("WebSocket error:", err);

Python Example

import asyncio
import json
import websockets

async def listen_transcript(bot_id: int, token: str):
    uri = f"wss://botapi.syntrimeet.com/api/v1/bots/{bot_id}/transcript/live?token={token}"

    async with websockets.connect(uri) as ws:
        async for message in ws:
            data = json.loads(message)

            if data["type"] == "transcript":
                chunk = data["data"]
                speaker = chunk["speaker"]["name"] or f"Speaker {chunk['speaker']['id']}"
                print(f"[{speaker}]: {chunk['transcript']}")

            elif data["type"] == "stream_ended":
                print(f"Stream ended: {data['data']['reason']}")
                break

asyncio.run(listen_transcript(bot_id=134, token="your-jwt-token"))

Message Types

transcript

Delivered for each transcript segment. Includes both interim (partial) and final results.

{
  "type": "transcript",
  "data": {
    "transcript": "Hello everyone, welcome to the meeting",
    "speaker": {
      "id": 0,
      "name": "Shashank Shahare",
      "confidence": 0.85
    },
    "isFinal": true,
    "timestamp": 12.5,
    "words": [
      { "word": "Hello", "start": 12.5, "end": 12.8, "confidence": 0.98 },
      { "word": "everyone", "start": 12.9, "end": 13.3, "confidence": 0.95 }
    ]
  }
}
Field                Type            Description
transcript           string          The transcribed text
speaker.id           number          Speaker ID (0, 1, 2...)
speaker.name         string | null   Matched participant name, or null if unknown
speaker.confidence   number          Confidence of speaker name match (0.0 - 1.0)
isFinal              boolean         true for final results, false for interim/partial
timestamp            number          Timestamp in seconds from start of recording
words                array           Word-level timing and confidence (optional)
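Because interim chunks are revisions of the same in-progress segment, a consumer typically overwrites its pending text on each interim message and commits it once isFinal arrives. A minimal sketch (the class name is illustrative):

```python
class TranscriptBuffer:
    """Collects final segments; keeps only the latest interim text as 'pending'."""

    def __init__(self):
        self.finals = []      # committed segments: (speaker_label, text)
        self.pending = None   # latest interim (speaker_label, text), if any

    def handle(self, message: dict):
        if message["type"] != "transcript":
            return
        chunk = message["data"]
        speaker = chunk["speaker"]["name"] or f"Speaker {chunk['speaker']['id']}"
        if chunk["isFinal"]:
            # The final result supersedes any interim revision of this segment.
            self.finals.append((speaker, chunk["transcript"]))
            self.pending = None
        else:
            self.pending = (speaker, chunk["transcript"])
```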

speaker_update

Sent when speaker name mappings change.

{
  "type": "speaker_update",
  "data": {
    "mappings": {
      "0": { "name": "Shashank Shahare", "confidence": 0.85 },
      "1": { "name": "Lucky", "confidence": 0.72 }
    }
  }
}
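Since the mappings are keyed by speaker ID (as strings, per the payload above), a client that buffers earlier chunks can relabel them retroactively when an update arrives. A sketch, with an illustrative helper name:

```python
def apply_speaker_mappings(chunks: list, mappings: dict) -> list:
    """Relabel stored transcript chunks using the latest speaker_update mappings.

    Each chunk is the 'data' object of a transcript message; mapping keys are
    speaker IDs as strings, matching the speaker_update payload.
    """
    lines = []
    for chunk in chunks:
        sid = str(chunk["speaker"]["id"])
        mapped = mappings.get(sid)
        # Prefer the updated mapping, then the chunk's own name, then a generic label.
        name = (mapped and mapped["name"]) or chunk["speaker"]["name"] or f"Speaker {sid}"
        lines.append(f"[{name}]: {chunk['transcript']}")
    return lines
```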

stream_ended

Sent when the bot leaves the meeting. The WebSocket connection will close after this message.

{
  "type": "stream_ended",
  "data": {
    "reason": "meeting_ended"
  }
}

Latency

Component                 Latency
Audio capture to Gladia   ~100ms
Gladia processing         ~200-300ms
Server relay to client    ~50ms
Total end-to-end          ~300-500ms

Limitations

  • No replay: If you connect after the meeting starts, you receive the transcript only from that point forward
  • No persistence: Realtime chunks are not stored. Use the post-meeting transcript API for the full record
  • Cost: The Gladia Live API charges per audio-second streamed. Streaming is enabled only when realtimeTranscript: true
  • Speaker names: Accuracy improves over the first 30-60 seconds as the confidence scoring stabilizes

Best Practices

  1. Handle interim results: Filter on isFinal: true if you only want complete sentences
  2. Use speaker updates: Listen for speaker_update events to retroactively relabel earlier interim results
  3. Reconnect gracefully: If the WebSocket disconnects, reconnect and you'll resume receiving new chunks
  4. Close on stream_ended: When you receive stream_ended, the meeting is over — close your connection
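Practices 3 and 4 can be combined into a reconnect loop with exponential backoff that stops once stream_ended is seen. A sketch, assuming the third-party websockets package (imported lazily so the backoff helper stands alone):

```python
import asyncio
import json

def backoff_delays(base: float = 1.0, cap: float = 30.0):
    """Yield exponentially growing reconnect delays, capped at `cap` seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= 2

async def listen_forever(bot_id: int, token: str):
    """Reconnect on failure; stop for good once the server reports stream_ended."""
    import websockets  # third-party: pip install websockets

    uri = f"wss://botapi.syntrimeet.com/api/v1/bots/{bot_id}/transcript/live?token={token}"
    delays = backoff_delays()
    while True:
        try:
            async with websockets.connect(uri) as ws:
                delays = backoff_delays()  # reset backoff after a successful connect
                async for raw in ws:
                    message = json.loads(raw)
                    if message["type"] == "stream_ended":
                        return  # meeting is over: do not reconnect
                    print(message)
        except (OSError, websockets.ConnectionClosed):
            await asyncio.sleep(next(delays))  # transient failure: retry with backoff
```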