I’ve created the fastest local AI engine for Apple Silicon. Optimised for agentic use.

https://preview.redd.it/p0rqofxvrtzg1.png?width=1460&format=png&auto=webp&s=8ce5b18b4ddaad9b71f71fd8eb623839fc9c6c8b

For weeks I've been working on creating the fastest local AI engine for Apple Silicon, and I finally did it!

It's optimized for agentic use, focused specifically on coding agents, tool calling, and short-turn workflows.

Repo: https://github.com/samuelfaj/lightning-mlx

A few results from my M5 Max MacBook (128 GB):

  • Qwen3.6-27B: 40.67 tok/s
  • Qwen3.6-35B-A3B: 220.86 tok/s
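
For anyone who wants to compare numbers on their own machine: one common way to get a tok/s figure is to time the token stream and divide the tokens generated after the first one by the time elapsed since the first one, reporting time-to-first-token separately. Below is a minimal sketch of that idea. It is not lightning-mlx's API or necessarily how the figures above were measured; `measure_decode_throughput` and `fake_stream` are hypothetical names, and the fake stream only stands in for a real engine's streaming output.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


def measure_decode_throughput(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a token stream and report time-to-first-token and decode tok/s."""
    start = clock()
    first_token_at: Optional[float] = None
    count = 0
    for _ in tokens:
        now = clock()
        if first_token_at is None:
            first_token_at = now  # prefill / prompt processing ends here
        count += 1
    end = clock()

    if first_token_at is None:
        return {"tokens": 0, "ttft_s": None, "decode_tok_per_s": None}

    decode_time = end - first_token_at
    # Decode speed excludes the first token, whose latency is dominated by prefill.
    decode_tps = (count - 1) / decode_time if count > 1 and decode_time > 0 else None
    return {
        "tokens": count,
        "ttft_s": first_token_at - start,
        "decode_tok_per_s": decode_tps,
    }


def fake_stream(n: int, delay_s: float) -> Iterator[str]:
    """Hypothetical stand-in for an engine's streaming generator."""
    for i in range(n):
        time.sleep(delay_s)
        yield f"tok{i}"


if __name__ == "__main__":
    print(measure_decode_throughput(fake_stream(20, 0.001)))
```

Separating time-to-first-token from decode speed matters for short-turn agent workloads, where prompt processing can dominate the total latency of each call.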

I’d appreciate feedback on:

  1. Better benchmark designs for local coding agents
  2. Whether the MTPLX preset defaults make sense
  3. Other Apple Silicon setups I should test

submitted by /u/TomatilloPutrid3939