AI RESEARCH

CAM3DNet: Comprehensively mining the multi-scale features for 3D Object Detection with Multi-View Cameras

arXiv CS.CV

ArXi:2604.17024v1 Announce Type: new Query-based 3D object detection methods using multi-view images often struggle to efficiently leverage dynamic multi-scale information, e.g., the relationship between the object features and the geometric of the queries are not sufficiently learned, directly exploring the multi-scale spatiotemporal features will pay too many costs. To address these challenges, we propose CAM3DNet, a novel sparse query-based framework which combines three new modules, composite query (CQ), adaptive self-attention (ASA), and multi-scale hybrid sampling.