I cut Claude Code's token usage by 68.5% by giving agents their own OS

r/artificial
Generative AI AI Research

Al agents are running on infrastructure built for humans. Every state check runs 9 shell commands. Every cold start re-discovers context from scratch. It's wasteful by design. An agentic JSON-native OS fixes it. Benchmarks across 5 real scenarios: Semantic search vs grep + cat: 91% fewer tokens Agent pickup vs cold log parsing: 83% fewer tokens State polling vs shell commands: 57% fewer tokens Overall: 68.5% reduction Benchmark is fully reproducible: python3 tools/ bench_compare.py Plugs into Claude Code via MCP, runs local inference through Ollama, MIT licensed.