Cohere Unveils Open-Weight Speech Recognition Model, Challenging Closed-Source Giants
Cohere releases an open-weight ASR model with a 5.4% word error rate, offering enterprises a privacy-focused alternative to proprietary speech recognition APIs.

Cohere has launched an open-weight automatic speech recognition (ASR) model that achieves a 5.4% word error rate (WER) on the LibriSpeech test-clean benchmark—putting it in direct competition with closed-source offerings from Big Tech.
This move matters: For the first time, enterprises get a production-ready, high-accuracy ASR system they can fully own, deploy, and customize—without sending sensitive voice data to external clouds. The model, released in June 2024, is available under a permissive license, opening the door for broad adoption and adaptation.
Performance That Stands Up to the Big Players
Cohere’s new model clocks a 5.4% WER on LibriSpeech’s test-clean set, a widely accepted benchmark for ASR accuracy. For context, this puts Cohere’s open-weight model in the same league as closed APIs from Google, Microsoft, and Amazon—long the default choices for enterprise-grade speech recognition.
Historically, open-source or open-weight ASR models have lagged behind in accuracy, forcing organizations to trade off privacy for performance. Cohere’s release narrows that gap, offering a credible alternative without the lock-in or data residency headaches of proprietary APIs.
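For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions made for illustration, not Cohere's evaluation code, and real benchmark pipelines typically apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; normalization here is just
    lowercasing and whitespace splitting (an assumption).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word and one dropped word against a 4-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Read this way, a 5.4% WER means roughly 54 word-level errors for every 1,000 words in the reference transcripts.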
Why Open-Weight Matters
Open-weight is not the same as open source: an open-weight release gives enterprises the trained model weights themselves, though not necessarily the training code or data. Having the weights enables deployment on private infrastructure and fine-tuning for domain-specific needs. This is a critical distinction for sectors like healthcare, finance, and government, where compliance and data sovereignty are non-negotiable.
“Enterprises have been clamoring for high-accuracy, customizable ASR that doesn’t force them into a single vendor’s cloud,” said a Cohere spokesperson. “This release is about giving them real control.”
Industry Context: The Shift Toward Open AI
The ASR landscape has been dominated by closed APIs, with Google Speech-to-Text, Amazon Transcribe, and Microsoft Azure Speech leading the pack. These services are accurate but often raise concerns around data privacy, residency, and vendor lock-in—especially in regulated industries.
Cohere’s open-weight approach reflects a broader trend: Enterprises are demanding open, customizable AI solutions that fit their compliance and operational requirements. The move also comes as open-weight large language models (LLMs) like Llama 3 and Mistral gain traction, signaling that the appetite for open AI extends well beyond text.
Licensing and Adoption
The model is released under a permissive license, making it easier for organizations to integrate, modify, and even commercialize their own ASR-powered products. This stands in contrast to more restrictive licenses that have hampered adoption of some open-source AI models in the past.
- 5.4% WER on LibriSpeech test-clean: Competitive with closed-source APIs
- Open-weight: Full access to model weights for on-prem or cloud deployment
- Permissive license: Encourages broad enterprise and developer uptake
The Competitive Landscape
With this release, Cohere is signaling it wants a piece of the ASR market, which is projected to reach $10.7 billion by 2030 (Allied Market Research). The company is betting that control, privacy, and customization will trump the convenience of closed APIs—especially as regulatory scrutiny intensifies worldwide.
It’s also a shot across the bow at other open-source ASR projects, which have struggled to reach production-grade accuracy. Cohere’s model could set a new baseline for what’s possible in open-weight speech recognition.
What to Watch Next
This launch is likely to accelerate competition in enterprise ASR, forcing both closed and open providers to up their game on accuracy, transparency, and deployment flexibility. Expect to see more organizations—especially in regulated sectors—experiment with open-weight ASR as a way to regain control of their voice data and reduce vendor risk.
The bigger signal: Open-weight AI isn’t just a trend in text. Speech, vision, and other modalities are now in play, and the pressure is on for incumbents to open up or risk losing ground to more transparent, customizable alternatives.