AI RESEARCH
Goodness-of-pronunciation without phoneme time alignment
arXiv CS.AI
arXiv:2603.25150v1 Announce Type: cross In speech evaluation, an Automatic Speech Recognition (ASR) model often computes time boundaries and phoneme posteriors for input features. However, limited data for ASR