  • Sharatkumar Chilaka

Understanding Mispronunciation Detection Systems - Part 3












Li et al. (2017) and Leung et al. (2019) use a hierarchical evaluation structure to measure the performance of a mispronunciation detection model. Below is a diagram of that structure.




The expected outcomes for mispronunciation detection are True Acceptance and True Rejection, while the unexpected outcomes are False Acceptance and False Rejection.


  • True Acceptance (TA) is the number of phonemes annotated and recognized as correct pronunciation.

  • True Rejection (TR) is the number of phonemes annotated and recognized as mispronunciation.

  • False Rejection (FR) is the number of phonemes annotated as correct pronunciation but recognized as mispronunciation.

  • False Acceptance (FA) is the number of phonemes annotated as mispronunciation but recognized as correct pronunciation.

  • Correct Diagnosis (CD) is the number of phones correctly recognized as mispronunciations and correctly diagnosed as matching the annotated phonemes.


  • Diagnosis Error (DE) is the number of phones correctly recognized as mispronunciations but diagnosed as a phoneme different from the annotated one.
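
The six counts above can be tallied with a short routine. The following is a minimal sketch, not the implementation used in the cited papers. It assumes the canonical (expected), annotated, and recognized phone sequences have already been aligned position by position, so insertions and deletions are not handled. The function name `count_outcomes` and the example phones are hypothetical.

```python
def count_outcomes(canonical, annotated, recognized):
    """Tally TA, TR, FR, FA, CD and DE over aligned phone sequences.

    Assumption: all three sequences have the same length and are aligned
    position by position (no insertions or deletions).
    """
    if not (len(canonical) == len(annotated) == len(recognized)):
        raise ValueError("sequences must be aligned to the same length")

    counts = {"TA": 0, "TR": 0, "FR": 0, "FA": 0, "CD": 0, "DE": 0}
    for can, ann, rec in zip(canonical, annotated, recognized):
        annotated_correct = ann == can    # human label: pronounced correctly
        recognized_correct = rec == can   # model output: pronounced correctly
        if annotated_correct and recognized_correct:
            counts["TA"] += 1
        elif annotated_correct:
            counts["FR"] += 1
        elif recognized_correct:
            counts["FA"] += 1
        else:
            counts["TR"] += 1
            # Diagnosis is only evaluated on True Rejection cases.
            if rec == ann:
                counts["CD"] += 1
            else:
                counts["DE"] += 1
    return counts


# Hypothetical example data
canonical = ["dh", "ah", "k", "ae", "t", "s"]
annotated = ["d", "ah", "k", "eh", "t", "z"]
recognized = ["d", "ah", "g", "ae", "t", "sh"]
print(count_outcomes(canonical, annotated, recognized))
# {'TA': 2, 'TR': 2, 'FR': 1, 'FA': 1, 'CD': 1, 'DE': 1}
```

Note that CD and DE always sum to TR, which reflects the hierarchy: diagnosis is a second-level split of the True Rejection cases.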




For mispronunciation diagnosis, we focus on the True Rejection cases and count those with Diagnosis Errors. False Rejection Rate (FRR), False Acceptance Rate (FAR) and Diagnostic Error Rate (DER) can be calculated as below.
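
In terms of the counts defined above, these three rates are commonly written as follows. This assumes the definitions used in the cited papers: each rate is normalized by the size of the group it belongs to in the hierarchy.

```latex
\mathrm{FRR} = \frac{FR}{TA + FR}, \qquad
\mathrm{FAR} = \frac{FA}{FA + TR}, \qquad
\mathrm{DER} = \frac{DE}{CD + DE}
```

For instance, with TA = 2, FR = 1, FA = 1, TR = 2, CD = 1 and DE = 1, FRR = 1/3, FAR = 1/3 and DER = 1/2.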




Other metrics such as Precision, Recall and F-measure are also widely used as performance measures for mispronunciation detection.
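
A minimal sketch of these three measures follows. It assumes the convention in the cited papers, where mispronunciation is the positive class, so Precision = TR / (TR + FR), Recall = TR / (TR + FA) = 1 - FAR, and F-measure is their harmonic mean. The function name `precision_recall_f` is hypothetical, and returning 0.0 for an empty denominator is a choice made here, not something the source specifies.

```python
def precision_recall_f(tr, fr, fa):
    """Precision, Recall and F-measure with mispronunciation as the positive class.

    tr: True Rejections, fr: False Rejections, fa: False Acceptances.
    Returns 0.0 for any measure whose denominator is zero (an assumption).
    """
    precision = tr / (tr + fr) if (tr + fr) else 0.0
    recall = tr / (tr + fa) if (tr + fa) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical counts
p, r, f = precision_recall_f(tr=80, fr=20, fa=40)
print(f"precision={p:.3f} recall={r:.3f} f={f:.3f}")
```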




In addition, the accuracies of mispronunciation detection and mispronunciation diagnosis are calculated as follows:
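
Assuming the same definitions as the cited papers, the two accuracies can be written in terms of the counts above as:

```latex
\text{Detection Accuracy} = \frac{TA + TR}{TA + FR + FA + TR}, \qquad
\text{Diagnosis Accuracy} = \frac{CD}{CD + DE} = 1 - \mathrm{DER}
```

With TA = 2, TR = 2, FR = 1, FA = 1, CD = 1 and DE = 1, the detection accuracy is 4/6 ≈ 0.67 and the diagnosis accuracy is 1/2.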





















