Claude Code cache chaos creates quota complaints (3 minute read)

TLDR AI
Generative AI

Claude prompt caching helps users avoid reprocessing previously used prompts to save on tokens. Caches can either have a five-minute or one-hour time to live. Writing to the five-minute cache costs 25% in tokens, and writing to the one-hour cache 100% more, but reading from cache is around 10% of the base price. Some users are reporting issues with hitting limits too quickly, and there are reports that Claude's performance has dropped.