AI RESEARCH
Pyramidal Patchification Flow for Visual Generation
arXiv cs.CV
arXiv:2506.23543v3 Announce Type: replace
Diffusion transformers (DiTs) adopt Patchify, mapping patch representations to token representations through linear projections, to adjust the number of tokens input to DiT blocks and thus the computation cost. Instead of a single patch size for all the timesteps, we
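
To make the Patchify step concrete, here is a minimal NumPy sketch of how patch size controls the number of tokens fed to the DiT blocks. It is an illustration only, not the paper's implementation: the function name `patchify`, the latent shape (4, 32, 32), the hidden width of 8, and the random weights are all assumptions chosen for the example.

```python
import numpy as np


def patchify(latent, patch_size, weight, bias):
    """Split a (C, H, W) latent into non-overlapping patches and project each
    flattened patch to a token with a linear map.

    weight has shape (patch_size * patch_size * C, D) and bias has shape (D,).
    Hypothetical helper for illustration; not taken from the paper's code.
    """
    c, h, w = latent.shape
    p = patch_size
    assert h % p == 0 and w % p == 0, "patch size must divide height and width"
    # (C, H/p, p, W/p, p) -> (H/p, W/p, p, p, C): one row of values per patch
    x = latent.reshape(c, h // p, p, w // p, p).transpose(1, 3, 2, 4, 0)
    x = x.reshape((h // p) * (w // p), p * p * c)
    # Linear projection from patch representation to token representation
    return x @ weight + bias


rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 32, 32))  # assumed latent size
hidden_dim = 8                             # assumed token width

tokens_by_patch_size = {}
for p in (2, 4):
    weight = rng.standard_normal((p * p * 4, hidden_dim)) * 0.02
    bias = np.zeros(hidden_dim)
    tokens_by_patch_size[p] = patchify(latent, p, weight, bias).shape

print(tokens_by_patch_size)
```

Under these assumptions, doubling the patch size from 2 to 4 cuts the token count by a factor of four (256 tokens versus 64), which is the lever on computation cost that the abstract describes.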