Through a Compressed Lens: Investigating The Impact of Quantization on Factual Knowledge Recall

arXiv:2505.13963v3 Announce Type: replace-cross Quantization methods are widely used to accelerate inference and streamline the deployment of large language models (LLMs). Although quantization's effects on various LLM capabilities have been extensively studied, one critical area remains underexplored: factual knowledge recall (FKR), the process by which LLMs access stored knowledge.