Speechmatics API
The Speechmatics API is the company's core speech-to-text service, providing batch file transcription and real-time streaming transcription via WebSocket. Powered by the Ursa 2 model (released October 2024), it supports 55+ languages and dialects, speaker diarization, automatic translation into 30+ target languages, and a suite of Voice Intelligence add-ons. Transcription requires no model fine-tuning; custom dictionaries of up to 1,000 words take effect immediately.
| General | |
|---|---|
| Availability | Generally available; Ursa 2 model released October 2024 |
| Developer | Speechmatics |
| Type | Cloud speech-to-text API (batch and real-time) |
| License | Commercial API |
| Documentation | docs.speechmatics.com/speech-to-text |
| GitHub | speechmatics/speechmatics-python-sdk |
Core Features
- 55+ languages and dialects: broad multilingual support including accent and dialect variants.
- Two accuracy tiers: Enhanced (optimized for accuracy) and Standard (optimized for speed and cost).
- Speaker diarization: multi-speaker detection included at no extra cost in all plans.
- Custom dictionary: up to 1,000 domain-specific words added without retraining.
- Automatic translation: transcripts translated into 30+ target languages.
- Voice Intelligence add-ons: summarization, sentiment analysis, topic detection, chapter generation, and entity recognition.
- Audio event detection: identifies non-speech events in audio.
- Smart formatting: formats numbers, dates, currencies, and capitalization automatically.
- Sub-1-second real-time latency: streaming transcription via WebSocket.
- Flexible deployment: cloud API, on-premises, on-device, Docker, and Kubernetes.
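Several of the features above (accuracy tier, speaker diarization, custom dictionary) are typically selected per job through a configuration object. The following is a minimal sketch of assembling one as a plain dictionary. The field names (`transcription_config`, `operating_point`, `diarization`, `additional_vocab`) are assumptions that should be verified against docs.speechmatics.com, and `build_batch_config` is a hypothetical helper, not part of any official SDK. The 1,000-word cap comes from the custom dictionary limit stated above.

```python
import json

MAX_CUSTOM_WORDS = 1000  # custom dictionary limit stated in the feature list


def build_batch_config(language, custom_words=(), enhanced=True, diarize=True):
    """Hypothetical helper: build a batch job configuration as a dict.

    Field names are assumptions; check them against the official docs.
    """
    words = list(custom_words)
    if len(words) > MAX_CUSTOM_WORDS:
        raise ValueError(
            f"custom dictionary has {len(words)} words; limit is {MAX_CUSTOM_WORDS}"
        )

    transcription = {
        "language": language,
        # Enhanced favors accuracy; Standard favors speed and cost.
        "operating_point": "enhanced" if enhanced else "standard",
    }
    if diarize:
        transcription["diarization"] = "speaker"
    if words:
        transcription["additional_vocab"] = [{"content": w} for w in words]

    return {"type": "transcription", "transcription_config": transcription}


if __name__ == "__main__":
    config = build_batch_config("en", custom_words=["Speechmatics", "Kincaid46"])
    print(json.dumps(config, indent=2))
```

Validating the dictionary size locally, before any request is sent, surfaces an oversized word list as a clear error instead of a rejected job.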
Accuracy Benchmarks (Ursa 2)
| Metric | Result |
|---|---|
| WER on Kincaid46 (English) | 7.88% (surpasses human-level on that test) |
| WER improvement vs. previous Ursa | 18% reduction across 50+ languages |
| FLEURS dataset leadership | Leads in 62% of supported languages |
| Head-to-head vs. other providers | Wins 88% of comparisons |
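Word error rate (WER) is the number of word-level substitutions, deletions, and insertions needed to turn a system transcript into the reference transcript, divided by the number of reference words. The sketch below computes it with a standard edit-distance table. It is illustrative only: it splits on whitespace and does not reproduce the text normalization or scoring setup behind the published figures. `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Read this way, a WER of 7.88% corresponds to roughly eight word errors per hundred reference words.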
Pricing
| Tier | Included | Rate |
|---|---|---|
| Free | 480 minutes/month | $0 (no credit card required) |
| Pro | Up to 6,000 hours/month | From $0.24/hour (with discount) |
| Enterprise | Unlimited scale, no rate limits | Custom |
Volume discounts apply automatically above 500 hours per month per transcription type.
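To make the tiers concrete: the free allowance of 480 minutes is 8 hours of audio per month, and at the listed starting price of $0.24/hour, 100 hours of audio would cost $24.00. The sketch below does this arithmetic. Treating $0.24 as a flat rate is a simplifying assumption, since the table lists it as a discounted "from" price, and `estimate_cost` is a hypothetical helper rather than an official calculator.

```python
from decimal import Decimal

FREE_MINUTES_PER_MONTH = 480       # free tier allowance from the pricing table
PRO_FLOOR_RATE = Decimal("0.24")   # USD per hour; assumed flat for illustration


def free_hours() -> Decimal:
    """Free tier allowance expressed in hours."""
    return Decimal(FREE_MINUTES_PER_MONTH) / Decimal(60)


def estimate_cost(hours, rate_per_hour=PRO_FLOOR_RATE) -> Decimal:
    """Rough monthly cost in USD, rounded to cents."""
    hours = Decimal(str(hours))
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return (hours * rate_per_hour).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print("Free tier hours:", free_hours())
    print("100 hours at the floor rate: $", estimate_cost(100))
```

Actual invoices depend on the accuracy tier, transcription type, and any volume discount applied, so the result is best read as a lower bound.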
Tools and Resources
- Batch API Reference: REST API for asynchronous file transcription jobs.
- Real-time API Reference: WebSocket API reference for streaming transcription.
- Python SDK: official SDK covering STT batch, real-time, and TTS.
- JavaScript/TypeScript SDK: official browser and Node.js SDK.
- Developer Portal: API key management and usage monitoring.
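A real-time session over WebSocket generally opens with a JSON message describing the audio format and transcription settings, continues with binary audio frames, and ends with a closing message. The sketch below only builds those JSON payloads and does not open a connection. The message names (`StartRecognition`, `EndOfStream`) and field names are assumptions to be verified against the Real-time API Reference, and both helper functions are hypothetical.

```python
import json


def start_recognition_message(language="en", sample_rate=16000):
    """Hypothetical helper: JSON text for the opening message of a session."""
    return json.dumps({
        "message": "StartRecognition",  # assumed message name
        "audio_format": {
            "type": "raw",
            "encoding": "pcm_s16le",    # assumed encoding label
            "sample_rate": sample_rate,
        },
        "transcription_config": {
            "language": language,
            "enable_partials": True,    # request low-latency partial results
        },
    })


def end_of_stream_message(last_seq_no):
    """Hypothetical helper: JSON text signalling that no more audio follows."""
    return json.dumps({"message": "EndOfStream", "last_seq_no": last_seq_no})


if __name__ == "__main__":
    print(start_recognition_message())
    print(end_of_stream_message(last_seq_no=42))
```

In practice the official Python or JavaScript/TypeScript SDK handles this message exchange, so hand-built payloads are mainly useful for debugging or for languages without an SDK.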
Ecosystem and Integrations
- Integrates with LiveKit, Pipecat, and Vapi for voice pipeline deployments.
- Available on Microsoft Azure Marketplace.
- Compatible with on-device and edge deployments via Docker or Kubernetes.
- Medical Model variant targets clinical transcription in English, German, Danish, and Norwegian.
Start building with the free tier (no credit card required) and explore the full API via docs.speechmatics.com.




