AI RESEARCH
Learning Relative Representations for Fine-Grained Multimodal Alignment with Limited Data
arXiv CS.LG
arXiv:2605.16834v1 Announce Type: cross Multimodal pre-