Speechmatics API

The Speechmatics API is the company's core speech-to-text service, providing batch file transcription and real-time streaming transcription via WebSocket. Powered by the Ursa 2 model (released October 2024), it supports 55+ languages and dialects, speaker diarization, automatic translation into 30+ target languages, and a suite of Voice Intelligence add-ons. Transcription requires no model fine-tuning; custom dictionaries of up to 1,000 words take effect immediately.

General
Release date: Generally available; Ursa 2 model released Oct 2024
Developer: Speechmatics
Type: Cloud speech-to-text API (batch and real-time)
License: Commercial API
Documentation: docs.speechmatics.com/speech-to-text
GitHub: speechmatics/speechmatics-python-sdk

Core Features

  • 55+ languages and dialects: broad multilingual support including accent and dialect variants.
  • Two accuracy tiers: Enhanced (optimized for accuracy) and Standard (optimized for speed and cost).
  • Speaker diarization: multi-speaker detection included at no extra cost in all plans.
  • Custom dictionary: up to 1,000 domain-specific words added without retraining.
  • Automatic translation: transcripts translated into 30+ target languages via AI.
  • Voice Intelligence add-ons: summarization, sentiment analysis, topic detection, chapter generation, and entity recognition.
  • Audio events detection: identifies non-speech events in audio.
  • Smart formatting: formats numbers, dates, currencies, and capitalization automatically.
  • Sub-1-second real-time latency: streaming transcription via WebSocket.
  • Flexible deployment: cloud API, on-premises, on-device, Docker, and Kubernetes.
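As a rough illustration of how several of these features combine in one batch request, the sketch below builds a job configuration as a plain Python dict. The field names (`operating_point`, `diarization`, `additional_vocab`) are assumptions recalled from the public documentation and should be checked against docs.speechmatics.com before use; `build_job_config` itself is a hypothetical helper, not part of any SDK. The 1,000-word cap is the limit quoted above.

```python
import json

MAX_CUSTOM_WORDS = 1000  # custom dictionary limit quoted in the feature list


def build_job_config(language, custom_words=(), enhanced=True, diarize=True):
    """Build a batch transcription job config as a plain dict.

    Field names are assumed from the public docs; verify before use.
    """
    words = list(custom_words)
    if len(words) > MAX_CUSTOM_WORDS:
        raise ValueError(
            f"custom dictionary holds at most {MAX_CUSTOM_WORDS} words"
        )

    transcription = {
        "language": language,
        # Enhanced favours accuracy, Standard favours speed and cost.
        "operating_point": "enhanced" if enhanced else "standard",
    }
    if diarize:
        transcription["diarization"] = "speaker"
    if words:
        transcription["additional_vocab"] = [{"content": w} for w in words]

    return {"type": "transcription", "transcription_config": transcription}


config = build_job_config("en", custom_words=["Ursa", "Kincaid46"])
print(json.dumps(config, indent=2))
```

Because the helper only assembles a dict, it can be validated locally and then passed to whichever client (SDK or raw HTTP) actually submits the job.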

Accuracy Benchmarks (Ursa 2)

| Metric | Result |
| --- | --- |
| WER on Kincaid46 (English) | 7.88% (surpasses human-level on that test) |
| WER improvement vs. previous Ursa | 18% reduction across 50+ languages |
| FLEURS dataset leadership | Leads in 62% of supported languages |
| Head-to-head vs. other providers | Wins 88% of comparisons |
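For readers unfamiliar with the metric, word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The function below is a minimal sketch of that standard definition; it is not the evaluation code behind the figures above, and published benchmarks usually apply text normalization rules that this sketch reduces to lowercasing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"{wer:.2%}")
```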

Pricing

| Tier | Included | Rate |
| --- | --- | --- |
| Free | 480 minutes/month | No credit card required |
| Pro | Up to 6,000 hours/month | From $0.24/hour (with discount) |
| Enterprise | Unlimited scale, no rate limits | Custom |

Volume discounts apply automatically above 500 hours per month per transcription type.
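The thresholds above can be turned into a quick planning check. The sketch below maps an expected monthly volume for one transcription type to a tier and flags whether it crosses the 500-hour volume-discount threshold. The function name and return shape are illustrative assumptions, and no prices are computed because the discounted rates are not listed here.

```python
FREE_MINUTES_PER_MONTH = 480     # Free tier allowance
PRO_MAX_HOURS_PER_MONTH = 6000   # Pro tier ceiling
VOLUME_DISCOUNT_HOURS = 500      # discounts apply above this, per transcription type


def suggest_tier(hours_per_month: float) -> dict:
    """Suggest a tier for the monthly hours of one transcription type."""
    if hours_per_month < 0:
        raise ValueError("hours_per_month must be non-negative")

    if hours_per_month * 60 <= FREE_MINUTES_PER_MONTH:
        tier = "Free"
    elif hours_per_month <= PRO_MAX_HOURS_PER_MONTH:
        tier = "Pro"
    else:
        tier = "Enterprise"

    return {
        "tier": tier,
        "volume_discount": hours_per_month > VOLUME_DISCOUNT_HOURS,
    }


for hours in (5, 600, 7000):
    print(hours, suggest_tier(hours))
```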


Tools and Resources


Ecosystem and Integrations

  • Integrates with LiveKit, Pipecat, and Vapi for voice pipeline deployments.
  • Available on Microsoft Azure Marketplace.
  • Compatible with on-device and edge deployments via Docker or Kubernetes.
  • Medical Model variant targets clinical transcription in English, German, Danish, and Norwegian.

Start building with the free tier (no credit card required) and explore the full API via docs.speechmatics.com.

Speechmatics API Hackathon Projects

Discover innovative solutions built with the Speechmatics API, developed by our community members during our hackathons.

MidContext Live Translation Agent

MidContext Live Translation Agent solves a major challenge for companies operating across multilingual markets: customer support becomes slower, more expensive and less personal when customers and agents do not speak the same jargon. Beyond language, each generation has its own way of talking, and AI makes hyper-customisation possible. We identified poorly scalable workflows, long wait times, low resolution quality and inconsistent customer experience as key pain points, especially for companies scaling across Europe with different languages, accents and local expectations, and with immature internal knowledge bases. The approach scales globally and is also relevant to major incumbents that cannot afford damage to their reputation.

Our solution is a real-time voice translation layer between customer care agents and customers. The system captures voice input, converts speech through ASR, routes the conversation through a customer support layer, and generates natural voice responses using TTS. It does more than translate words: it preserves context, intent, tone and company jargon, while connecting to local knowledge bases and support workflows. It works today inside the company as it is, and it helps build the company's future by enriching its local customer service knowledge base.

The target users are multinational companies, customer operations teams, CCaaS providers and enterprises that need scalable multilingual support without losing the human connection. MidContext uses a glocal strategy: one global architecture, adapted to local languages, customer behaviors, policies and knowledge bases. A human-in-the-loop quality model keeps agents responsible for sensitive cases, approvals and escalations, reducing technological complexity while improving trust, resolution quality and customer satisfaction.
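The flow described above (ASR, then a support layer, then TTS, with humans handling sensitive cases) can be sketched as a single function that wires the stages together. Everything here is hypothetical: `handle_utterance`, `Reply`, and the stub stages are illustrative names, not the project's code, and a real system would plug in actual ASR and TTS clients.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reply:
    text: str
    needs_human: bool = False  # set by the support layer for sensitive cases


def handle_utterance(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    support_layer: Callable[[str], Reply],
    synthesize: Callable[[str], bytes],
    escalate: Callable[[str], None],
) -> Optional[bytes]:
    """Run one utterance through ASR -> support layer -> TTS."""
    text = transcribe(audio)
    reply = support_layer(text)
    if reply.needs_human:
        # Human-in-the-loop: hand the case to an agent instead of answering.
        escalate(text)
        return None
    return synthesize(reply.text)


# Stub stages so the sketch runs without any external service.
def fake_asr(audio: bytes) -> str:
    return audio.decode()


def fake_support(text: str) -> Reply:
    return Reply(text=f"[de] {text}", needs_human="refund" in text)


def fake_tts(text: str) -> bytes:
    return text.encode()


escalated = []
audio_out = handle_utterance(b"hello", fake_asr, fake_support, fake_tts, escalated.append)
held_back = handle_utterance(b"refund please", fake_asr, fake_support, fake_tts, escalated.append)
print(audio_out, held_back, escalated)
```

Passing the stages in as callables keeps the global architecture fixed while each market swaps in its own language models, policies and knowledge base.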

Synapse Corp AI

Synapse AI is an enterprise-grade multi-agent workflow automation platform designed to simulate how real organizations operate using autonomous AI agents. The platform includes specialized agents such as HR, CTO, CFO, CEO, and Risk Management agents that collaborate intelligently to perform tasks like AI-driven interviews, candidate evaluation, operational analysis, workflow automation, and executive decision-making. Unlike traditional AI assistants or single-agent chatbots, Synapse AI focuses on collaborative intelligence where multiple AI agents communicate, reason, and coordinate together to solve complex organizational workflows in real time. The system supports multimodal interactions including text, documents, reports, and speech inputs, allowing users to simulate real enterprise environments and automate time-consuming operational processes. For example, users can conduct AI-powered HR interviews, upload business reports for executive analysis, or generate strategic recommendations through coordinated AI agent discussions. Technically, the platform is built using Next.js, FastAPI, Gemini AI, Speechmatics, Supabase, Docker, and Vultr cloud infrastructure. The architecture uses scalable distributed services, asynchronous processing, and modular AI orchestration to ensure reliability, low latency, and production-style deployment readiness. Synapse AI demonstrates how autonomous AI systems can function like real organizational teams, helping businesses improve operational efficiency, reduce repetitive manual work, accelerate decision-making, and create scalable intelligent enterprise workflows for the future of AI-driven organizations.

ATRIO Boardroom

**Founders and family offices decide alone.** Big calls get either delegated to a single advisor (fast, single point of failure) or convened with a committee (slow, hard to schedule, hard to audit). ATRIO Boardroom is the middle option: an AI boardroom that holds a real debate, enforces a per-tenant mandate at machine speed, and replays every decision in six months.

## Try the live demo

**URL:** http://45.77.52.54:8080 (Vultr, Frankfurt)

Click **Demo founder** on the sign-in screen (one click, no email), then type a boardroom question. Watch 5 specialist AI agents stream real Gemini 2.5 reasoning live, ~25 s. Go to Treasury, propose a SHV-xStock buy, try to self-second-sign (API refuses), open a new tab as **Demo CEO**, second-authorise, trade executes against Kraken paper. Download the board-pack PDF. Open the audit log. Six minutes, full lifecycle.

## The wedge

- **Debate**, not consensus-on-rails. Six personas with distinct system prompts, distinct model assignments (Gemini 2.5 Flash for specialists, 2.5 Pro for Counsel), and dissent-driven turn-taking.
- **Enforce**, at the API. A per-tenant `Mandate v1` (permitted instruments, daily loss limit, single-instrument max, permitted side) is the only path to a treasury action. Two-party auth cannot be bypassed by the UI.
- **Audit**, by default. Every turn, vote, model call, and state transition writes to an append-only log. Exportable as JSONL + manifest.

## Why this isn't slideware

- **381 / 381** backend tests pass at **90.68 %** coverage
- **24 / 24** demo-video structural + **14 / 14** OCR verification
- **54 / 54** pitch-deck structural + **12 / 12** OCR verification
- **5 / 5** live multi-agent debate against real Gemini in ~25 s (no mocks)
- **19** real bugs found and root-caused during the sprint

## Sponsors used

Vultr · Google Gemini · Featherless · Speechmatics · Kraken xStocks · LiveKit. License: Apache 2.0.
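To make the enforcement idea concrete, here is a minimal sketch of a server-side mandate check combined with two-party authorisation. The four mandate fields come from the description above; the class names, function names, units and the sample limits are hypothetical and do not reflect the project's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mandate:
    permitted_instruments: frozenset
    permitted_side: str           # e.g. "buy"
    single_instrument_max: float  # max notional per instrument
    daily_loss_limit: float       # max realised loss per day


@dataclass(frozen=True)
class Proposal:
    instrument: str
    side: str
    notional: float
    proposer: str


def check_mandate(mandate, proposal, realised_loss_today=0.0) -> list:
    """Return a list of mandate violations (empty means compliant)."""
    violations = []
    if proposal.instrument not in mandate.permitted_instruments:
        violations.append("instrument not permitted")
    if proposal.side != mandate.permitted_side:
        violations.append("side not permitted")
    if proposal.notional > mandate.single_instrument_max:
        violations.append("exceeds single-instrument max")
    if realised_loss_today >= mandate.daily_loss_limit:
        violations.append("daily loss limit reached")
    return violations


def authorise(mandate, proposal, second_signer, realised_loss_today=0.0) -> bool:
    """Approve only if the mandate holds and a different user second-signs."""
    violations = check_mandate(mandate, proposal, realised_loss_today)
    if violations:
        raise PermissionError("; ".join(violations))
    if second_signer == proposal.proposer:
        raise PermissionError("second signature must come from a different user")
    return True


mandate = Mandate(frozenset({"SHV-xStock"}), "buy", 10_000.0, 500.0)
proposal = Proposal("SHV-xStock", "buy", 2_500.0, proposer="founder")
print(authorise(mandate, proposal, second_signer="ceo"))
```

Running the checks inside the API handler rather than in the UI is what makes the self-second-sign attempt in the demo fail regardless of which client sends the request.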