Autoregressive Learning in Joint KL: Sharp Oracle Bounds and Lower Bounds

arXiv:2605.12316v1 Announce Type: new We study the fundamental and timely problem of learning long sequences in autoregressive modeling and next-token prediction under model misspecification, with error measured by the joint Kullback--Leibler (KL) divergence. Our goal is to characterize how the sequence horizon \(H\) affects both the approximation and estimation errors in this joint-distribution, sequence-level regime.
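
For orientation, the standard chain rule for KL divergence shows how a joint, sequence-level KL relates to next-token prediction. The notation here is assumed for illustration and is not taken from the paper: \(P\) is the data distribution over sequences \(x_{1:H}\), \(Q\) is an autoregressive model, and \(x_{1:h-1}\) is the prefix before step \(h\) (empty when \(h = 1\)).

\[
\mathrm{KL}(P \,\|\, Q) \;=\; \sum_{h=1}^{H} \mathbb{E}_{x_{1:h-1} \sim P}\Big[ \mathrm{KL}\big( P(\cdot \mid x_{1:h-1}) \,\big\|\, Q(\cdot \mid x_{1:h-1}) \big) \Big]
\]

The joint divergence is thus a sum of \(H\) expected per-step conditional divergences, with prefixes drawn from \(P\). This is one way to see why the horizon \(H\) can enter the error; how it actually scales is the question the paper addresses.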