How do you track and optimize your AI API spend?

Running several projects that collectively hit $2k+/mo in API costs across OpenAI, Anthropic,& AWS Bedrock. Started doing monthly audits then found I was overspending by about 60%. Model routing cut costs 55% with no quality loss on final output Prompt compression saved 70% on my most called endpoint Request deduplication on retries eliminated 15% of wasted calls Caching semantically similar queries knocked out another 20-30% But I feel like I'm still missing things, especially on the infrastructure side (GPU instance sizing, spot vs. on-demand, etc.