I’ve created the fastest local AI engine for Apple Silicon. Optimised for agentic use.

By /u/TomatilloPutrid3939 / May 8, 2026

https://preview.redd.it/p0rqofxvrtzg1.png?width=1460&format=png&auto=webp&s=8ce5b18b4ddaad9b71f71fd8eb623839fc9c6c8b

For weeks I've been working on creating the fastest local AI engine for Apple Silicon... And I finally did!

It's optimized for agentic use, with a specific focus on coding agents, tool calling, and short-turn workflows.

Repo: https://github.com/samuelfaj/lightning-mlx

A few results from my MacBook (M5 Max, 128 GB):

Qwen3.6-27B: 40.67 tok/s
Qwen3.6-35B-A3B: 220.86 tok/s

I’d appreciate feedback on:

Better benchmark designs for local coding agents
Whether the MTPLX preset defaults make sense
Other Apple Silicon setups I should test
