AI RESEARCH
The Character Error Vector: Decomposable errors for page-level OCR evaluation
arXiv cs.LG
arXiv:2604.06160v1 Announce Type: cross The Character Error Rate (CER) is a key metric for evaluating the quality of Optical Character Recognition (OCR). However, it assumes that the text has been perfectly parsed, which is often not the case. Under page-parsing errors, CER becomes undefined, which limits its use as a metric and makes page-level OCR evaluation challenging, particularly when the data do not share a labelling schema. We