datasette-lite, microsoft, mlx, prince-canuma, python, speech-to-text, uv

microsoft/VibeVoice
VibeVoice is Microsoft’s Whisper-style audio model for speech-to-text. It is MIT licensed and has speaker diarization built into the model.
Microsoft released it on January 21st, 2026 but I hadn’t tried it until today. Here’s a one…