Fine-tuning Whisper for Pashto ASR: strategies and scale
arXiv:2604.06507v1
Abstract: Pashto is absent from Whisper’s pre-training corpus despite being one of Common Voice’s largest language collections, leaving off-the-shelf models unusable: all Whisper sizes output Arabic, Dari, or Urdu …