AI RESEARCH

Identifiable Token Correspondence for World Models

arXiv cs.LG

arXiv:2605.16457v1 Announce Type: new

Transformer-based world models have shown strong performance in visual reinforcement learning, but they often suffer from temporal inconsistency in long-horizon rollouts, including object duplication, disappearance, and transmutation. A key reason is that most existing approaches treat next-frame prediction purely as a token generation problem, without explicitly modeling the correspondence between tokens across time.