LocalLLaMA

I benchmarked 42 STT models on medical audio with a new Medical WER metric — the leaderboard completely reshuffled

TL;DR: I updated my medical speech-to-text benchmark to 42 models (up from 31 in v3) and added a new metric: Medical WER (M-WER). Standard WER treats every word equally. In medical audio, that makes little sense — “yeah” and “amoxicillin” do not …
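To make the "not all words are equal" point concrete, here is a minimal sketch of one way a term-weighted WER could be computed. It is not necessarily how M-WER is defined in the benchmark: the term list, the weight of 5.0, and the names `weighted_wer` and `medical_weight` are all hypothetical, chosen only for illustration. With every weight set to 1.0 it reduces to standard WER.

```python
from typing import Callable, List

# Hypothetical term list and weight, for illustration only.
MEDICAL_TERMS = {"amoxicillin"}
MEDICAL_WEIGHT = 5.0


def medical_weight(word: str) -> float:
    """Assumed weighting: medical terms cost more than filler words."""
    return MEDICAL_WEIGHT if word in MEDICAL_TERMS else 1.0


def weighted_wer(
    ref: List[str],
    hyp: List[str],
    weight: Callable[[str], float] = lambda w: 1.0,
) -> float:
    """Weighted edit cost divided by the total weight of the reference."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = minimal weighted cost to turn ref[:i] into hyp[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + weight(ref[i - 1])  # deletions only
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + weight(hyp[j - 1])  # insertions only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            r, h = ref[i - 1], hyp[j - 1]
            # A match is free; a substitution costs the heavier of the two words.
            sub_cost = 0.0 if r == h else max(weight(r), weight(h))
            dp[i][j] = min(
                dp[i - 1][j - 1] + sub_cost,  # match / substitution
                dp[i - 1][j] + weight(r),     # deletion of a reference word
                dp[i][j - 1] + weight(h),     # insertion of a hypothesis word
            )
    total = sum(weight(w) for w in ref)
    return dp[n][m] / total if total else 0.0


ref = "yeah take amoxicillin twice daily".split()
hyp_a = "take amoxicillin twice daily".split()        # dropped "yeah"
hyp_b = "yeah take amoxicilin twice daily".split()    # garbled drug name

plain_a = weighted_wer(ref, hyp_a)                    # 1/5
plain_b = weighted_wer(ref, hyp_b)                    # 1/5
med_a = weighted_wer(ref, hyp_a, medical_weight)      # 1/9
med_b = weighted_wer(ref, hyp_b, medical_weight)      # 5/9
```

Under plain WER both transcripts score the same (one error in five words). Under the weighted variant, dropping "yeah" barely registers, while garbling the drug name dominates the score, which is the kind of difference that could reshuffle a leaderboard.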