Turboquant on llama.cpp for Metal using Rust

r/LocalLLaMA
Generative AI AI Hardware Open Source AI

Sharing my attempt to create a Rust-based simple chat TUI that takes advantage of Turboquant on llama.cpp specifically for Apple Silicon hardware. I have added chat templates for Qwen, Llama and Mistral models if you want to test Turboquant on these models. submitted by /u/J0shGamboa [link] [comments]