Microsoft VibeVoice: Open-Source Frontier Voice AI

Apr 28, 2026 · 11:56 AM UTC · 4 min read

#technology #ai #open-source #speech-recognition #text-to-speech


⚡ TL;DR · AI summary

Microsoft has introduced VibeVoice, an open-source voice AI framework that includes both speech recognition and text-to-speech models. The VibeVoice-ASR model processes long-form audio in a single pass and generates structured transcriptions that record who spoke, when, and what was said, while the VibeVoice-TTS model supports multi-speaker dialogues. Both models are released as open source to support collaboration in the speech synthesis community, and VibeVoice-ASR is now available through the Hugging Face Transformers library.

Key facts

▪ VibeVoice-ASR is a unified speech-to-text model that handles 60-minute long-form audio in a single pass.
▪ VibeVoice-TTS can synthesize up to 90 minutes of speech with support for multiple speakers.
▪ VibeVoice uses continuous speech tokenizers to improve audio fidelity and computational efficiency.

Original article

GitHub


Opening excerpt (first ~120 words)

🎙️ VibeVoice: Open-Source Frontier Voice AI

📰 News

2026-03-06: 🚀 VibeVoice ASR is now part of a Transformers release! You can now use our speech recognition model directly through the Hugging Face Transformers library for seamless integration into your projects.

2026-01-21: 📣 We open-sourced VibeVoice-ASR, a unified speech-to-text model designed to handle 60-minute long-form audio in a single pass, generating structured transcriptions containing Who (Speaker), When (Timestamps), and What (Content), with support for User-Customized Context. Try it in Playground.

⭐️ VibeVoice-ASR is natively multilingual, supporting over 50 languages; check the supported languages for details.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.
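
The excerpt describes structured transcriptions containing Who (Speaker), When (Timestamps), and What (Content). As a rough illustration of what working with such records might look like, here is a minimal Python sketch. The `Segment` class, its field names, the helper functions, and the sample data are all hypothetical; they are not VibeVoice's actual output schema or API, which the excerpt does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcript entry. Field names are hypothetical, not VibeVoice's schema."""
    speaker: str    # Who
    start_s: float  # When: segment start, in seconds
    end_s: float    # When: segment end, in seconds
    text: str       # What


def fmt_time(seconds: float) -> str:
    """Format a time offset as HH:MM:SS, truncating fractional seconds."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render(segments: list[Segment]) -> str:
    """Render segments as a readable transcript, ordered by start time."""
    ordered = sorted(segments, key=lambda s: s.start_s)
    return "\n".join(
        f"[{fmt_time(s.start_s)} - {fmt_time(s.end_s)}] {s.speaker}: {s.text}"
        for s in ordered
    )


def talk_time(segments: list[Segment]) -> dict[str, float]:
    """Sum the speaking time in seconds for each speaker."""
    totals: dict[str, float] = {}
    for s in segments:
        totals[s.speaker] = totals.get(s.speaker, 0.0) + (s.end_s - s.start_s)
    return totals


if __name__ == "__main__":
    # Illustrative data only; not real model output.
    demo = [
        Segment("Speaker 2", 65.0, 70.5, "Thanks for having me."),
        Segment("Speaker 1", 0.0, 4.0, "Welcome to the show."),
    ]
    print(render(demo))
    print(talk_time(demo))
```

Keeping speaker, timing, and text in one record makes it straightforward to sort a long recording chronologically or aggregate per speaker, which is the kind of downstream use a Who/When/What transcription suggests.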
