ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference

r/LocalLLaMA
Generative AI

Submitted by /u/Total-Resort-3120