Why Your AI Agent Costs 7x What It Should

Dev.to AI
Generative AI

Most AI agents are loops. Call the model. Read the response. Run a tool. Feed the result back. Call again. That loop is also a billing loop. Every iteration re-sends the entire prompt. And unless you've thought about it carefully, every iteration is paying full price for tokens the provider already saw three calls ago. I learned this the hard way. I had an agent that called the model five to seven times per user interaction. Screenshots were involved. A single medium-resolution image runs about two thousand tokens.