The $0 Problem: Why Every Tool Says Your On-Prem Inference is Free

If you run LLMs on your own hardware, every cost tracking tool in the ecosystem has the same answer for what it costs: $0. OpenCost sees your GPU pods but has no concept of tokens. LiteLLM tracks tokens per user but hardcodes on-prem cost to zero. Langfuse traces requests but only prices cloud APIs. The FinOps Foundation's own working group explicitly says on-premises AI cost is "outside the scope." Meanwhile, your GPUs cost real money. The H100s draw 700 watts each. Your electricity bill is real. The three-year amortization on $280K of hardware is real.