Jeremy H. M. Wong, Nancy F. Chen

Goodness-of-pronunciation without phoneme time alignment

Jeremy H. M. Wong, Nancy F. Chen / March 27, 2026

arXiv:2603.25150v1 Announce Type: new
Abstract: In speech evaluation, an Automatic Speech Recognition (ASR) model often computes time boundaries and phoneme posteriors for input features. However, limited data for ASR training hinders expansion of spe…
