AI RESEARCH

Rethinking Two-Stage Referring-by-Tracking in Referring Multi-Object Tracking: Make it Strong Again

arXiv CS.CV

ArXi:2503.07516v5 Announce Type: replace Referring Multi-Object Tracking (RMOT) aims to track multiple objects specified by natural language expressions in videos. With the recent significant progress of one-stage methods, the two-stage Referring-by-Tracking (RBT) paradigm has gradually lost its popularity. However, its lower