AI RESEARCH

CDPR: Cross-modal Diffusion with Polarization for Reliable Monocular Depth Estimation

arXiv cs.CV

arXiv:2604.11097v1 Announce Type: new Monocular depth estimation is a fundamental yet challenging task in computer vision, especially under complex conditions such as textureless surfaces, transparency, and specular reflections. Recent diffusion-based approaches have significantly advanced performance by reformulating depth prediction as a denoising process in the latent space. However, existing methods rely solely on RGB inputs, which often lack sufficient cues in challenging regions.