CR-QAT: Curriculum Relational Quantization-Aware Training for Open-Vocabulary Object Detection

arXiv:2603.05964v1

Open-vocabulary object detection (OVOD) enables detection of novel categories via vision-language alignment, but massive model sizes hinder deployment on resource-constrained devices. While quantization offers practical compression, we reveal that naive extreme low-bit (e.g., 4-bit) quantization severely degrades fine-grained vision-language alignment and distorts inter-region relational structures. To address this, we propose Curriculum Relational Quantization-Aware Training (CR-QAT).
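To make the claim about distorted inter-region relational structures concrete, the following is a minimal sketch, not the method proposed in this paper. It assumes a generic symmetric per-tensor uniform quantizer and uses random vectors as stand-ins for region features; the names `fake_quantize`, `cosine_similarity_matrix`, and `region_feats` are hypothetical. It compares how much the pairwise cosine-similarity matrix between regions shifts under 4-bit versus 8-bit quantization.

```python
import numpy as np


def fake_quantize(x, num_bits=4):
    """Symmetric per-tensor uniform quantization followed by dequantization.

    Assumption: a generic quantizer chosen for illustration, not the
    quantizer used in the paper.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 7 for signed 4-bit
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale


def cosine_similarity_matrix(feats):
    """Pairwise cosine similarity between the rows of `feats`."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    normed = feats / np.maximum(norms, 1e-12)
    return normed @ normed.T


# Hypothetical stand-in for region features: 8 regions, 64 dimensions.
rng = np.random.default_rng(0)
region_feats = rng.standard_normal((8, 64))

feats_4bit = fake_quantize(region_feats, num_bits=4)
feats_8bit = fake_quantize(region_feats, num_bits=8)

sim_fp = cosine_similarity_matrix(region_feats)
err_4bit = np.abs(cosine_similarity_matrix(feats_4bit) - sim_fp).max()
err_8bit = np.abs(cosine_similarity_matrix(feats_8bit) - sim_fp).max()

print(f"max similarity shift at 4-bit: {err_4bit:.4f}")
print(f"max similarity shift at 8-bit: {err_8bit:.4f}")
```

In this toy setting the 4-bit similarity matrix drifts further from the full-precision one than the 8-bit matrix does, which illustrates the kind of relational distortion the abstract describes. Real detector features and learned quantizers may behave differently.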