GGUF with MTP vs MLX without. Is MLX still the way to go for Mac users?
r/LocalLLaMA
Have any Mac users tested the speed difference (token generation, prompt processing) between MLX quants without MTP and GGUF quants with MTP? More or less once a month, I wonder whether MLX is still the correct path on Mac. Some reasons:
- LM Studio has bad caching for MLX. And not MTP of