AI RESEARCH
Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models
arXiv cs.LG
arXiv:2603.17044v1 Announce Type: new Unified multimodal models share a language model backbone for both understanding and generating images. Can DPO align both capabilities simultaneously? We present the first systematic study of this question, applying DPO to Janus-Pro at 1B and 7B parameters under seven