AI RESEARCH

RD-ViT: Recurrent-Depth Vision Transformer for Semantic Segmentation with Reduced Data Dependence

Extending the Recurrent-Depth Transformer Architecture to Dense Prediction

arXiv CS.CV

arXiv:2605.03999v1 Announce Type: new

Vision Transformers (ViTs) achieve state-of-the-art segmentation accuracy but require large