Mistral Unveils Open-Source Speech Recognition Model

French startup Mistral released Voxtral, its first set of audio models.

Designed for enterprises, Voxtral offers audio tools at “less than half the price” of competitors.

Voxtral transcribes audio up to 30 minutes long and understands audio up to 40 minutes long. It relies on the large language model (LLM) Mistral Small 3.1 for its reasoning capabilities. This LLM allows the model to answer questions, generate summaries, and carry out actions in response to voice requests. The model automatically detects the language being spoken and supports multiple languages, including English, Spanish, French, Hindi, Italian, and Dutch.

Voxtral is offered in two variants, Voxtral Small and Voxtral Mini. Mistral stated that Voxtral Small matches the performance of Scribe, ElevenLabs’ voice transcription model, at less than half the price. The company added that Voxtral Mini outperforms OpenAI’s speech recognition model Whisper, also at less than half the price.

Voxtral Small is a 24-billion-parameter (24B) model for “production-scale applications”, while Voxtral Mini is a 3-billion-parameter (3B) model for “local and edge deployments”.

Mistral also offers “advanced” enterprise resources for Voxtral, including AI integration support and industry-specific personalisation with help from its applied AI team. The company provides guidance on deploying Voxtral on a customer’s own infrastructure or graphics processing units (GPUs), taking data privacy requirements and cost efficiency into account.

Voxtral Small and Voxtral Mini are available through the AI model library Hugging Face. The API starts at $0.001 per minute, according to Mistral. Voxtral can also be tested in voice mode on Le Chat, Mistral’s AI assistant, where the models will be available on web and mobile.
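To put the quoted starting price in perspective, the short Python sketch below estimates what a batch of audio would cost at $0.001 per minute. It is an illustration only: the function name `estimate_cost_usd` is hypothetical, and it assumes a flat per-minute rate with no rounding, tiers, or minimum charges, none of which the announcement as reported here specifies.

```python
from decimal import Decimal

# Starting price quoted by Mistral; flat-rate billing is an assumption.
PRICE_PER_MINUTE_USD = Decimal("0.001")


def estimate_cost_usd(audio_minutes, price_per_minute=PRICE_PER_MINUTE_USD):
    """Estimate the API cost in USD for a given number of audio minutes."""
    minutes = Decimal(str(audio_minutes))
    if minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    # Simple linear pricing: minutes multiplied by the per-minute rate.
    return minutes * price_per_minute


if __name__ == "__main__":
    # Example: 10,000 hours of recordings, converted to minutes.
    print(estimate_cost_usd(10_000 * 60))
```

Under these assumptions, 10,000 hours of audio (600,000 minutes) would cost about $600 at the starting rate.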

Cheaper, Open-Source Models

Making AI models more affordable is one way companies try to stand out in a competitive market. Mistral noted that enterprises have historically had to choose between cheap but error-prone open-source systems and models with advanced understanding at a steep cost.

Moonshot, the Alibaba-backed startup, recently released its own open-source LLM. According to Moonshot, this model beats Anthropic’s and OpenAI’s models on coding capabilities.

Open-source models are less common in the Western tech space. This software requires no licensing fees and relies on existing data for development. The absence of licensing fees and restrictions also allows developers to modify the software more freely and at a lower cost.

Mistral also recently released Magistral, a family of AI reasoning models that includes an open-source version.
