Benchmarking Language Modeling for Lossless Compression of Full-Fidelity Audio

arXiv CS.LG

arXiv:2603.08683v1 Announce Type: cross

Autoregressive "language" models (LMs) trained on raw waveforms can be repurposed for lossless audio compression, but prior work is limited to 8-bit audio, leaving open whether such approaches work in practical settings (16/24-bit) and can compete with existing codecs. We benchmark LM-based compression on full-fidelity audio across diverse domains (music, speech, bioacoustics), sampling rates (16 kHz to 48 kHz), and bit depths (8, 16, 24-bit