Built a fully offline batch image-to-SVG pipeline on Apple Silicon — Moondream → GroundingDINO → SAM 2.1 HQ → VitMatte → VTracer, nothing leaves the machine

I've been building a macOS app called Skiagrafia that takes folders of photos and produces layered SVG vector graphics and TIFF alpha mattes. The entire inference stack runs locally — TRANSFORMERS_OFFLINE=1, HF_HUB_OFFLINE=1, Ollama for the VLM, MPS as the primary backend on M1 Ultra.
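
For anyone who wants to reproduce the offline setup, here is a minimal sketch of how those two environment variables and the MPS preference could be wired up in Python. The helper name `pick_device` and the CPU fallback are my assumptions for illustration, not necessarily how Skiagrafia does it.

```python
import os

# Set these before importing transformers / huggingface_hub so the
# libraries read them at import time and never attempt a network call.
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_HUB_OFFLINE"] = "1"


def pick_device() -> str:
    """Return "mps" when PyTorch reports the Metal backend, else "cpu".

    Hypothetical helper: the CPU fallback is an assumption.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    return "mps" if torch.backends.mps.is_available() else "cpu"


DEVICE = pick_device()
print(f"inference device: {DEVICE}")
```

Setting the variables in code rather than in the shell keeps the offline guarantee attached to the app itself, whichever way it is launched.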

My first plans for the pipeline.

The pipeline:

  • Moondream 2 via Ollama HTTP API — semantic interrogation, ~100ms per image, returns a label list
  • GroundingDINO SwinT-OGC (~660MB) — text-conditioned bounding box detection
  • SAM 2.1 HQ Hiera Large (~2.4GB) — pixel-level segmentation from boxes
  • VitMatte ViT-B Composition-1K (~350MB) — alpha matting for soft edges
  • VTracer (Rust) — Bézier spline vectorization, logotype-quality output
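
The first stage above can be sketched as a plain HTTP call to a local Ollama server. This is a rough illustration only: the model tag `moondream`, the prompt wording, and the helper names `build_payload`, `parse_labels`, and `interrogate` are assumptions, and the real app may shape the request differently.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "moondream"  # assumed model tag
PROMPT = "List the main objects in this image as comma-separated nouns."  # hypothetical prompt


def build_payload(image_bytes: bytes) -> dict:
    """Build a non-streaming generate request with one base64-encoded image."""
    return {
        "model": MODEL_TAG,
        "prompt": PROMPT,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def parse_labels(text: str) -> list[str]:
    """Turn the model's free-text answer into a de-duplicated, lowercase label list."""
    seen, labels = set(), []
    for part in text.replace("\n", ",").split(","):
        label = part.strip().strip(".").lower()
        if label and label not in seen:
            seen.add(label)
            labels.append(label)
    return labels


def interrogate(image_path: str) -> list[str]:
    """Send one image to the local Ollama server and return its label list."""
    with open(image_path, "rb") as fh:
        payload = build_payload(fh.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return parse_labels(body["response"])
```

The resulting label list is what would be handed to GroundingDINO as its text prompt in the next stage.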

https://preview.redd.it/29ktd0ukljvg1.png?width=1360&format=png&auto=webp&s=d1d71ad20975a4b85aeb659498c6eccf93e549b7

https://preview.redd.it/a9njfxtkljvg1.png?width=1360&format=png&auto=webp&s=8ce634f7e719b64eace72d86275402dc67af70c7

Total weight resident in unified memory: ~5GB. Runs fine on 64GB M1 Ultra.
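
As a quick sanity check on that figure, here is the sum of the approximate sizes quoted in this post. It counts on-disk weight sizes only, so actual resident memory may differ. VTracer is left out because it is a tracer, not a neural model.

```python
# Approximate model sizes in GB, as quoted in this post.
WEIGHTS_GB = {
    "Moondream 2": 1.5,
    "GroundingDINO SwinT-OGC": 0.66,
    "SAM 2.1 HQ Hiera Large": 2.4,
    "VitMatte ViT-B": 0.35,
}

total_gb = round(sum(WEIGHTS_GB.values()), 2)
print(f"{total_gb} GB")  # prints "4.91 GB"
```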

The part that might interest this community: Moondream was chosen over LLaVA, MiniCPM-V, and LlamaVL specifically because at ~1.5GB it processes an image in ~100ms on MPS. For a 2,000-image batch, a 7B model's richer descriptions don't justify a 10× inference time increase when all you need is a noun list. Small and fast wins for this task.
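
To put numbers on that trade-off, here is the back-of-envelope arithmetic for the VLM stage alone, using the ~100ms and 10× figures from above. The function name is just for illustration.

```python
BATCH_SIZE = 2_000   # images in the batch
MOONDREAM_MS = 100   # approximate per-image latency quoted above
SLOWDOWN = 10        # quoted slowdown factor for a 7B model


def batch_seconds(images: int, ms_per_image: int) -> float:
    """Total wall time in seconds for a sequential batch."""
    return images * ms_per_image / 1000


small = batch_seconds(BATCH_SIZE, MOONDREAM_MS)
large = batch_seconds(BATCH_SIZE, MOONDREAM_MS * SLOWDOWN)
print(f"Moondream: {small / 60:.1f} min, 7B model: {large / 60:.1f} min")
# prints "Moondream: 3.3 min, 7B model: 33.3 min"
```

That is roughly half an hour of extra interrogation time per batch before any detection or segmentation has started.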

I wrote up the full architecture, including model selection rationale, the Protocol-based DI system, recursive child segmentation, and five design principles from five rewrites.
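
For readers who haven't seen Protocol-based DI in Python, here is a generic sketch of the idea. The names `Box`, `Detector`, `FakeDetector`, and `Pipeline` are all hypothetical and are not taken from the Skiagrafia codebase; the article and repo have the real design.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Box:
    """A labelled bounding box in pixel coordinates (hypothetical type)."""
    label: str
    x0: int
    y0: int
    x1: int
    y1: int


class Detector(Protocol):
    """Structural interface: any object with a matching detect() qualifies."""

    def detect(self, image_path: str, labels: list[str]) -> list[Box]: ...


class FakeDetector:
    """Test double that returns one fixed box per label, no model required."""

    def detect(self, image_path: str, labels: list[str]) -> list[Box]:
        return [Box(label, 0, 0, 10, 10) for label in labels]


class Pipeline:
    """Receives its detector from outside instead of constructing one itself."""

    def __init__(self, detector: Detector) -> None:
        self._detector = detector

    def run(self, image_path: str, labels: list[str]) -> list[Box]:
        return self._detector.detect(image_path, labels)


boxes = Pipeline(FakeDetector()).run("photo.png", ["cat", "dog"])
print([box.label for box in boxes])  # prints "['cat', 'dog']"
```

The appeal is that `FakeDetector` never inherits from `Detector`, so a heavyweight model wrapper and a lightweight stub can be swapped freely as long as the method signature matches.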

https://preview.redd.it/fvjqxgonljvg1.png?width=3602&format=png&auto=webp&s=512a30319387687f62e1d6bfe60f5d45aab53010

Article: tsevis.com/every-pixel-is-a-tesserae
GitHub (MIT): github.com/tsevis/skiagrafia

Happy to answer questions about the MPS compatibility issues I ran into or the Ollama integration.

submitted by /u/tsevis