Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains

arXiv:2604.02343v1 Announce Type: cross We study the compression of LLM-generated text across lossless and lossy regimes, characterizing a compression-compute frontier where additional compression is possible at the cost of additional compute. For lossless compression, domain-adapted LoRA adapters can improve LLM-based arithmetic coding by 2x over compression with the base LLM alone. For lossy compression, prompting a model for a succinct rewrite and then applying arithmetic coding can achieve compression ratios of approximately 0.03, a 2x improvement over compressing the original response.
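
To make the quantities in the abstract concrete, the following is a minimal sketch of how a model-based arithmetic coder's output size relates to the probabilities the model assigns. It is not the authors' implementation. It assumes that "compression ratio" means compressed size divided by original size (so lower is better, consistent with 0.03 being described as an improvement), it uses the standard idealized code length of the sum of -log2 p per symbol plus about 2 bits of termination overhead, and it substitutes a toy adaptive byte-frequency model for the LLM. The function names are hypothetical.

```python
import math
from collections import Counter


def ideal_code_length_bits(probabilities):
    """Idealized arithmetic-coding cost in bits.

    Each symbol costs -log2(p) bits under the model; about 2 extra
    bits are assumed for terminating the code word.
    """
    return sum(-math.log2(p) for p in probabilities) + 2


def adaptive_byte_probs(text, alphabet_size=256):
    """Toy stand-in for an LLM (assumption, not the paper's model).

    Returns the probability assigned to each byte of `text` by an
    adaptive order-0 model with add-one (Laplace) smoothing, where
    each prediction uses only the bytes seen so far.
    """
    counts = Counter()
    seen = 0
    probs = []
    for b in text.encode("utf-8"):
        probs.append((counts[b] + 1) / (seen + alphabet_size))
        counts[b] += 1
        seen += 1
    return probs


def compression_ratio(text, probs):
    """Compressed bits divided by raw UTF-8 bits (lower is better)."""
    raw_bits = 8 * len(text.encode("utf-8"))
    return ideal_code_length_bits(probs) / raw_bits


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog " * 20
    ratio = compression_ratio(sample, adaptive_byte_probs(sample))
    print(f"toy model compression ratio: {ratio:.3f}")
```

A stronger predictor assigns higher probability to the text that actually occurs, which lowers the summed -log2 p and therefore the ratio. Under this reading, a domain-adapted adapter helps in the lossless setting by sharpening the model's predictions, and a succinct rewrite helps in the lossy setting by leaving fewer tokens to encode.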