I built a CLI to stop local AI models from eating my disk twice — lmm
r/LocalLLaMA
Every tool (LM Studio, Ollama, llama.cpp) downloads models to its own directory. The same 8GB model × 3 tools = 24GB on disk, 16GB of it redundant. lmm uses the HF Cache as a single shared store and symlinks models into each tool's directory. Download once, use everywhere.

brew tap holotherapper/tap && brew install lmm

- Interactive search + install from HF
- Supports MLX, GGUF, safetensors
- Works with LM Studio, llama.cpp, Jan, ComfyUI, etc.
- Adopt existing HF Cache models without re-downloading

GitHub:

Built in Rust. Runs on Apple Silicon and Linux. Feedback welcome.

submitted by /u/holotherapper
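For anyone curious what "download once, symlink everywhere" looks like in practice, here is a minimal Python sketch of the general idea. It is not lmm's actual code (lmm is written in Rust, and its internals are not shown in this post); the function `link_model`, the `demo` helper, and the directory names `store`, `tool_a`, and `tool_b` are all hypothetical stand-ins for the HF Cache and the per-tool model folders.

```python
import tempfile
from pathlib import Path


def link_model(source: Path, tool_dir: Path) -> Path:
    """Expose `source` inside `tool_dir` via a symlink instead of a copy.

    Hypothetical helper for illustration; not part of lmm.
    """
    tool_dir.mkdir(parents=True, exist_ok=True)
    link = tool_dir / source.name
    if link.is_symlink():
        if link.resolve() == source.resolve():
            return link  # already linked to this file, nothing to do
        link.unlink()  # stale link pointing somewhere else
    elif link.exists():
        # A real file is in the way; never overwrite user data.
        raise FileExistsError(f"{link} is a real file; refusing to overwrite")
    link.symlink_to(source)
    return link


def demo() -> dict:
    """Create one fake model file and link it into two fake tool directories."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        store = root / "store"  # stand-in for the shared cache
        store.mkdir()
        model = store / "model.gguf"
        model.write_bytes(b"\0" * 1024)  # 1 KiB placeholder, not a real model

        links = [
            link_model(model, root / tool / "models")
            for tool in ("tool_a", "tool_b")
        ]
        return {
            "all_symlinks": all(link.is_symlink() for link in links),
            "same_target": all(link.resolve() == model.resolve() for link in links),
            "readable_size": links[0].stat().st_size,
        }


if __name__ == "__main__":
    print(demo())
```

The bytes exist once in the store, and each tool directory only holds a link to them. The sketch refuses to replace a real file, so an existing per-tool copy would have to be removed deliberately before linking.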