I thought Mnemara would save tokens for cloud based models, that was wrong.

Mnemara was built for local models. I built it for Claude too. Only one of those was a good idea. The context management problem felt real, and it was. I was running Gemma 9B locally for parts of Aethon Autopoiesis - the MUD-based AI research project I've been pouring time into - and a 16k context window doesn't last long when you're trying to hold a coherent session across a real workflow. Tool calls take space. Thinking blocks take space. Read outputs take space.