Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

arXiv:2603.17051v1 Announce Type: new Distilled autoregressive (AR) video models enable efficient streaming generation but frequently misalign with human visual preferences. Existing reinforcement learning (RL) frameworks are not naturally suited to these architectures, typically requiring either expensive re-distillation or solver-coupled reverse-process optimization that