Hitoku: an open-source, local, context-aware macOS assistant with Qwen 3.5 / Gemma 4

Hi all,

I've been building Hitoku, an open-source, voice-first AI assistant that runs entirely locally. No cloud models; nothing leaves your machine.

It supports Gemma 4 and Qwen 3.5 for text generation, plus multiple STT backends (Parakeet, Whisper, Qwen3-ASR).

It's context-aware: it reads your screen, documents, and active app to understand what you're working on. You can ask about PDFs, reply to emails, create calendar events, and search the web, all by voice.

Examples:

- Query a PDF document: https://www.youtube.com/watch?v=ggaDhut7FnU

- Reply to an email: https://www.youtube.com/watch?v=QFnHXMBp1gA

- With Ctrl+S, it works as plain voice dictation (with optional polishing)

I currently use it a lot with Claude Code, Obsidian, and notes, as well as to read papers and write some emails (where I don't need to provide context, since it picks it up on its own).

Code: https://github.com/Saladino93/hitokudraft/tree/litert

Download: https://hitoku.me/draft/ (free with code HITOKULANG, valid for 50 downloads)

P.S. Gemma 4 via LiteRT caveats

Gemma 4 runs on LiteRT, and Swift support for it is currently a bit lacking, so the package includes a wrapper around Google's official LiteRT.

- Images: Gemma 4 with LiteRT is currently faster at handling images than my Qwen 3.5 implementation.
- Memory spikes: LiteRT's WebGPU backend can allocate significantly more GPU memory than the model weights alone. This is rare but worth monitoring (upstream issue: https://github.com/google-ai-edge/LiteRT/issues/5706).
- App size: the LiteRT dylibs add ~98 MB (the app goes from ~50 MB to ~150 MB). There is no official Swift package from Google yet, so the dylibs are bundled manually.

If the memory spikes or the app size bother you, use Qwen 3.5 instead (pure MLX, no LiteRT needed), or wait for the upstream fixes. I'm also working on running Gemma 4 natively via MLX (a bit slower than LiteRT, but generally safer and with more control).

submitted by /u/Saladino93