Update on my February posts about replacing RAG retrieval with NL querying: some things I've learned from actually building it

A couple of months ago I posted here (r/LLMDevs, r/artificial) proposing that an LLM could save its context window into a citation-grounded document and then query that document in plain language, replacing embedding similarity as the retrieval mechanism for reasoning recovery. Karpathy's LLM Knowledge Bases post and a recent TDS piece on context engineering have since touched on similar territory, so it seemed like a good time to come back and share what I've actually found while building it.