Boosting Reasoning in Large Multimodal Models via Activation Replay

arXiv:2511.19972v3 Announce Type: replace Recently, Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as an effective approach to incentivizing reasoning capability in Large Multimodal Models (LMMs), while the underlying mechanisms behind this post-