Geometry-Aware Decoding with Wasserstein-Regularized Truncation and Mass Penalties for Large Language Models

arXiv:2602.10346v2 Announce Type: replace-cross

Large language models (LLMs) must balance diversity and creativity against logical coherence in open-ended generation. Existing truncation-based samplers are effective but largely heuristic: they rely mainly on probability mass and entropy and ignore the semantic geometry of the token space.