The KV Cache. Every LLM Running Today Is Built Around One Number Staying Still.

Towards AI

What the K and V Matrices Look Like at Token 1, Token 2, Token 3. Until Now. With the Arithmetic.