Hyperspherical Autoencoder for High-Fidelity Image Reconstruction and Generation

arXiv:2601.22904v2 Announce Type: replace-cross

Recent studies have explored using pretrained Vision Foundation Models (VFMs) such as DINO for generative autoencoders, showing strong generative performance. However, existing approaches often suffer from limited reconstruction fidelity because high-frequency details are lost. In this work, we present the Hyperspherical Autoencoder (HAE), a framework that bridges semantic representation and pixel-level reconstruction.