Claude Code cache chaos creates quota complaints (3 minute read)
TLDR AI • Generative AI
Claude prompt caching lets users avoid reprocessing prompt content they have already sent, which saves tokens. A cache entry has either a five-minute or a one-hour time to live. Writing to the five-minute cache costs 25% more than the base input-token price, and writing to the one-hour cache costs 100% more, while reading from cache costs around 10% of the base price. Some users report hitting their usage limits too quickly, and there are also reports that Claude's performance has dropped.
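To make the pricing arithmetic concrete, here is a rough Python sketch. It reads the write costs as 125% and 200% of the base input-token price and the read cost as 10%. The function names, the 10,000-token prefix, and the 10-request session are illustrative assumptions, and whether usage limits are metered the same way as prices is not something the summary states.

```python
# Prices as percentages of the base input-token price
# (integers avoid floating-point rounding in the arithmetic).
WRITE_5M_PCT = 125  # five-minute cache write: 25% above base
WRITE_1H_PCT = 200  # one-hour cache write: 100% above base
READ_PCT = 10       # cache read: around 10% of base


def uncached_cost(prefix_tokens: int, requests: int) -> float:
    """Base-price token equivalents when the prefix is reprocessed on every request."""
    return float(prefix_tokens * requests)


def cached_cost(prefix_tokens: int, requests: int, write_pct: int) -> float:
    """One cache write, then a cache read on each later request.

    Assumes the cache entry never expires between requests.
    """
    if requests < 1:
        return 0.0
    return prefix_tokens * (write_pct + READ_PCT * (requests - 1)) / 100


def always_expired_cost(prefix_tokens: int, requests: int, write_pct: int) -> float:
    """Hypothetical worst case: the entry expires before every request,
    so each request pays for a fresh cache write."""
    return prefix_tokens * write_pct * requests / 100


if __name__ == "__main__":
    prefix, n = 10_000, 10  # illustrative values, not from the article
    print(uncached_cost(prefix, n))                      # 100000.0
    print(cached_cost(prefix, n, WRITE_5M_PCT))          # 21500.0
    print(cached_cost(prefix, n, WRITE_1H_PCT))          # 29000.0
    print(always_expired_cost(prefix, n, WRITE_5M_PCT))  # 125000.0
```

Under these assumptions a warm cache cuts the prefix cost to roughly a fifth of the uncached figure, while a cache that expires before every request costs more than not caching at all. That gap is one way cache behavior could plausibly affect how fast a quota is consumed.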