Optimizing Large Language Models: Metrics, Energy Efficiency, and Case Study Insights

arXiv:2504.06307v2

The rapid adoption of large language models (LLMs) has led to significant energy consumption and carbon emissions, posing a critical challenge to the sustainability of generative AI technologies. This paper explores the integration of energy-efficient optimization techniques in the deployment of LLMs to address these environmental concerns. We present a case study and framework that demonstrate how strategic quantization and local inference techniques can substantially lower the carbon footprints of LLMs without compromising their operational effectiveness.