AI RESEARCH

Fractal Autoregressive Depth Estimation with Continuous Token Diffusion

arXiv cs.CV

arXiv:2603.14702v1 Announce Type: new Monocular depth estimation can benefit from autoregressive (AR) generation, but direct AR modeling is hindered by the modality gap between RGB and depth, inefficient pixel-wise generation, and instability in continuous depth prediction. We propose a Fractal Visual Autoregressive Diffusion framework that reformulates depth estimation as a coarse-to-fine, next-scale autoregressive generation process.
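To make "coarse-to-fine, next-scale" generation concrete, here is a minimal sketch of the general idea, not the paper's implementation. Each step upsamples the depth map produced so far and adds a residual predicted for the next resolution, so the model emits one whole scale at a time rather than one pixel at a time. All names (`upsample_nearest`, `next_scale_generate`, `oracle_residual`) and the scale schedule are hypothetical, and the residual predictor is a toy oracle that reads a known target. The abstract does not describe the learned predictor, so none is modeled here.

```python
import numpy as np

def upsample_nearest(depth, size):
    """Nearest-neighbour resize of a square 2-D map to (size, size)."""
    idx = (np.arange(size) * depth.shape[0]) // size
    return depth[np.ix_(idx, idx)]

def next_scale_generate(predict_residual, scales):
    """Generate depth maps scale by scale, each conditioned on the previous one.

    predict_residual(prior, size) is an assumed interface: it returns the
    correction to add to the upsampled coarser map at resolution `size`.
    """
    depth, maps = None, []
    for size in scales:
        # The coarsest scale starts from an all-zero prior.
        prior = np.zeros((size, size)) if depth is None else upsample_nearest(depth, size)
        depth = prior + predict_residual(prior, size)
        maps.append(depth)
    return maps

# Toy target: an 8x8 depth ramp (illustrative data only).
target = np.linspace(0.0, 1.0, 64).reshape(8, 8)

def oracle_residual(prior, size):
    """Toy stand-in for a learned predictor: block-average the known target
    down to `size` and return its difference from the prior."""
    block = target.shape[0] // size
    pooled = target.reshape(size, block, size, block).mean(axis=(1, 3))
    return pooled - prior

maps = next_scale_generate(oracle_residual, scales=[1, 2, 4, 8])
for m in maps:
    print(m.shape)
```

With this schedule the loop takes four steps to produce an 8x8 map, whereas pixel-wise generation would take 64. That gap is the efficiency argument behind next-scale prediction.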