I've created the fastest local AI engine for Apple Silicon. Optimised for agentic use.

r/LocalLLaMA

For weeks I've been working on the fastest local AI engine for Apple Silicon, and I finally did it! It's optimized for agentic use, focusing specifically on coding agents, tool calling, and short-turn workflows.

Repo:

A few results from my MacBook Max M5 (128 GB):

- Qwen3.6-27B: 40.67 tok/s
- Qwen3.6-35B-A3B: 220.86 tok/s

I'd appreciate feedback on:

- Better benchmark designs for local coding agents
- Whether the MTPLX preset defaults make sense
- Other Apple Silicon setups I should test

submitted by /u/TomatilloPutrid3939