AI RESEARCH

Benchmarking OCR Pipelines with Adaptive Enhancement for Multi-Domain Retail Bill Digitization

arXiv cs.LG

arXiv:2604.25176v1 Announce Type: cross The digitization of multi-domain retail billing documents remains a challenging task due to variability in scan quality, layout heterogeneity, and domain diversity across commercial sectors. This paper proposes and benchmarks an intelligent, quality-aware adaptive Optical Character Recognition (OCR) pipeline for retail bill digitization spanning five domains: grocery stores, restaurants, hardware shops, footwear outlets, and clothing retailers.