I built an MCP server that gives AI persistent memory of your SQL database

A while ago I tried to build a local coding assistant. I downloaded Qwen3, fired it up on my MacBook with 16GB of RAM, and within a day realized the output quality was nowhere close to Claude or GPT-5. The model could fit. It just couldn't compete. So I changed the question. If I can't make the model smarter on my hardware, can I make what I feed it smarter? Where the tokens actually go I started watching where my Claude / Cursor / Copilot sessions actually spent their tokens. The surprise: most of it wasn't reasoning. It was lookup.