AI RESEARCH
Fractal Autoregressive Depth Estimation with Continuous Token Diffusion
arXiv cs.CV
arXiv:2603.14702v1 Announce Type: new
Monocular depth estimation can benefit from autoregressive (AR) generation, but direct AR modeling is hindered by three problems: the modality gap between RGB and depth, the inefficiency of pixel-wise generation, and instability in continuous depth prediction. We propose a Fractal Visual Autoregressive Diffusion framework that reformulates depth estimation as a coarse-to-fine, next-scale autoregressive generation process.
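To make the phrase "coarse-to-fine, next-scale" concrete, the following is a minimal sketch of the control flow only, not the paper's architecture. It assumes each scale is produced by upsampling the previous depth map and adding a predicted residual. The function names, the scale schedule `sides`, and the constant initial value are hypothetical, the predictor is a zero-returning placeholder, and the diffusion and fractal components named in the abstract are omitted entirely.

```python
def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of a 2-D list-of-lists by an integer factor."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]   # repeat each column
        out.extend(list(wide) for _ in range(factor))    # repeat each row
    return out


def predict_residual(rgb, coarse_up):
    """Hypothetical placeholder for a learned per-scale predictor.

    A real model would condition on the RGB image and the upsampled
    coarse depth; this stub returns zeros so the loop stays runnable.
    """
    return [[0.0] * len(row) for row in coarse_up]


def next_scale_generate(rgb, sides=(1, 2, 4, 8), init=0.5):
    """Generate depth maps scale by scale, each conditioned on the previous one.

    `sides` is an assumed schedule of square map sizes; each entry must be
    an integer multiple of the one before it.
    """
    depth = [[init] * sides[0] for _ in range(sides[0])]  # coarsest map
    maps = [depth]
    for prev, cur in zip(sides, sides[1:]):
        up = upsample_nearest(depth, cur // prev)
        res = predict_residual(rgb, up)
        depth = [[u + r for u, r in zip(u_row, r_row)]
                 for u_row, r_row in zip(up, res)]
        maps.append(depth)
    return maps


if __name__ == "__main__":
    maps = next_scale_generate(rgb=None)
    print([f"{len(m)}x{len(m[0])}" for m in maps])
```

The point of the sketch is that the autoregressive step runs once per scale rather than once per pixel, which is the efficiency argument the abstract makes against pixel-wise generation.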