EnergyLens: Predictive Energy-Aware Exploration for Multi-GPU LLM Inference Optimization

arXiv cs.LG

arXiv:2605.14249v1 Announce Type: new

We present EnergyLens, an end-to-end framework for energy-aware large language model (LLM) inference optimization. As LLMs scale, predicting and reducing their energy footprint has become critical for sustainability and datacenter operations. Yet existing approaches either require production-level code and expensive profiling, or fail to capture multi-GPU energy behavior accurately.