125+ languages · Real-time mode · 97%+ accuracy · Speaker diarization
Speech, transcribed.
125+ languages. Real-time streaming.
125+ languages
Multilingual transcription with automatic language detection.
Real-time streaming
Stream audio over WebSocket and receive results in real time.
Speaker diarization
Identify and label each speaker in a conversation.
Custom vocabulary
Add domain-specific terms and proper nouns.
Auto-punctuation
Automatic punctuation, capitalization, and formatting.
Batch processing
Transcribe large audio archives asynchronously.
Getting started
Run your first transcription in three steps, using the CLI, console, or API.
ur ai stt transcribe \
--file=meeting.mp3 \
--language=auto --diarize
STT patterns.
Meeting transcription and call analytics.
Suggested configuration
Diarization · Real-time · 125+ languages
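For the meeting-transcription pattern, a folder of recordings could be processed with a small wrapper around the command from Getting started. This is a minimal sketch, not an official workflow: it reuses only the flags shown above (`--file`, `--language=auto`, `--diarize`), assumes the CLI is installed on the PATH as `ur`, and introduces two hypothetical names for illustration, the `transcribe_dir` function and the `UR_CMD` override. Where the CLI writes its transcript is not stated on this page, so the sketch does not redirect or parse any output.

```shell
#!/bin/sh
# Sketch: run the "Getting started" command once per .mp3 file in a directory.
# Assumption: the CLI is on PATH as `ur`. UR_CMD is a hypothetical override,
# useful for dry runs (for example UR_CMD=echo prints the commands instead).
set -eu

UR_CMD="${UR_CMD:-ur}"

# transcribe_dir DIR: transcribe every .mp3 directly inside DIR.
transcribe_dir() {
  for file in "$1"/*.mp3; do
    # If DIR has no .mp3 files the pattern stays unexpanded; skip it.
    [ -e "$file" ] || continue
    "$UR_CMD" ai stt transcribe \
      --file="$file" \
      --language=auto --diarize
  done
}

# Run only when a directory is passed on the command line.
if [ "$#" -gt 0 ]; then
  transcribe_dir "$1"
fi
```

Saved as, say, `transcribe-all.sh`, it would be invoked as `sh transcribe-all.sh ./recordings`. Setting `UR_CMD=echo` first prints each command without calling the service, which is a cheap way to check the file list before spending any usage.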
Estimate your costs
Create detailed configurations to see exactly how much your architecture will cost. Pay for what you use, down to the second.
125+ languages. Speaker diarization. Custom vocabulary.
Works seamlessly with