RUQuant: Towards Refining Uniform Quantization for Large Language Models

arXiv:2604.04013v1 Announce Type: new The increasing size and complexity of large language models (LLMs) have raised significant challenges in deployment efficiency, particularly under resource constraints. Post-