attn-rot (ggerganov's "TurboQuant lite") is on the cusp of getting merged into llama.cpp

r/LocalLLaMA

Gonna delete this as soon as it's merged, just couldn't contain my excitement.