ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
r/LocalLLaMA
Generative AI
Submitted by /u/Total-Resort-3120