Reward-Forcing: Autoregressive Video Generation with Reward Feedback

arXiv:2601.16933v2 Announce Type: replace-cross While most prior work in video generation relies on bidirectional architectures, recent efforts have sought to adapt these models into autoregressive variants to enable near real-time generation. However, such adaptations often depend heavily on teacher models, which can limit performance, particularly when no strong autoregressive teacher is available. As a result, their output quality typically lags behind that of their bidirectional counterparts.