I built a CLI to stop local AI models from eating my disk twice — lmm

Every tool (LM Studio, Ollama, llama.cpp) downloads models into its own directory. The same 8GB model across 3 tools means 24GB on disk, 16GB of it pure duplication.

lmm uses HF Cache as a single store and symlinks models to each tool. Download once, use everywhere.
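Conceptually, this is the symlink trick lmm automates. A minimal sketch of the idea (the paths below are illustrative defaults, not lmm's actual layout):

```shell
# One real copy lives in the Hugging Face cache; each tool gets a symlink.
STORE="$HOME/.cache/huggingface/hub"   # single source of truth (HF cache default)
TOOL_DIR="$HOME/.lmstudio/models/demo" # hypothetical per-tool model directory

mkdir -p "$STORE" "$TOOL_DIR"

# Pretend this is an 8GB GGUF that was downloaded once.
touch "$STORE/model.gguf"

# Link instead of copying: the tool sees the file, but no second 8GB hits the disk.
ln -sf "$STORE/model.gguf" "$TOOL_DIR/model.gguf"
```

lmm does the equivalent per tool and per model, so deleting the cached copy is the only real cleanup you ever need.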

Demo video: https://reddit.com/link/1t934vi/video/zpx3dakzca0h1/player

  • brew tap holotherapper/tap && brew install lmm
  • Interactive search + install from HF
  • Supports MLX, GGUF, safetensors
  • Works with LM Studio, llama.cpp, Jan, ComfyUI, etc.
  • Adopt existing HF Cache models without re-downloading

GitHub: https://github.com/holotherapper/lmm

Built in Rust, Apple Silicon only. Feedback welcome.

submitted by /u/holotherapper
