How well do current models handle Icelandic audio?
I’ve been doing some informal testing on how current multimodal models handle speech + multilingual understanding, and came across an interesting behavior that feels slightly beyond standard translation.I used a short audio clip in a language I d…