AI RESEARCH

A Real-Calibrated Synthetic-First Data Engine

arXiv CS.LG

arXiv:2605.09699v1 Announce Type: cross

Modern computer vision systems increasingly encounter performance limitations in data-scarce domains, where collecting large-scale, high-quality labeled data is costly or impractical. While controllable diffusion models enable scalable synthetic image generation, directly applying synthetic augmentation often leads to unstable performance gains due to dataset-level quality issues and insufficient feedback mechanisms.