PoseGen: In-Context LoRA Finetuning for Pose-Controllable Long Human Video Generation

arXiv:2508.05091v2 Announce Type: replace Generating temporally coherent, long-duration videos with precise control over subject identity and movement remains a fundamental challenge for contemporary diffusion-based models, which often suffer from identity drift and are limited to short video lengths. We present PoseGen, a novel framework that generates human videos of extended duration from a single reference image and a driving video.