ggml: add Q1_0 1-bit quantization support (CPU) – 1-bit Bonsai models
Bonsai's 8B model is just 1.15GB, so CPU alone is more than enough. https://huggingface.co/collections/prism-ml/bonsai submitted by /u/pmttyji
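A quick back-of-the-envelope check of the headline numbers: a 1.15GB file for an 8B-parameter model works out to roughly 1.15 bits per weight, consistent with a ~1-bit quantization format. This is a rough sketch using the post's rounded figures; real GGUF files also carry unquantized tensors and metadata, so the true per-weight figure differs slightly.

```python
# Rounded figures from the post; not exact file-format accounting.
params = 8e9          # ~8 billion weights
file_bytes = 1.15e9   # ~1.15 GB on disk

bits_per_weight = file_bytes * 8 / params
print(f"{bits_per_weight:.2f} bits/weight")  # ~1.15
```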
Was expecting this for some time. This is available from build b8679 onwards. submitted by /u/pmttyji
So I got curious about how fast different models actually run on my M5 Air (32GB, 10-core CPU/10-core GPU). Instead of just testing one or two, I went through 37 models across 10 different families and recorded everything using llama-bench with Q4_K_M quantizati…
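Collating that many llama-bench runs into one ranked table is straightforward to script. A minimal sketch, assuming the results were exported to CSV with simplified, hypothetical column names "model" and "tokens_per_sec" (llama-bench's actual CSV header differs, so adjust the field names to your output; the sample numbers below are illustrative placeholders, not the poster's results):

```python
import csv, io

# Illustrative placeholder data standing in for exported benchmark rows.
raw = """model,tokens_per_sec
llama-3.2-3b-Q4_K_M,54.2
qwen2.5-7b-Q4_K_M,28.9
phi-4-Q4_K_M,21.3
"""

# Parse the rows and rank models from fastest to slowest generation speed.
rows = list(csv.DictReader(io.StringIO(raw)))
rows.sort(key=lambda r: float(r["tokens_per_sec"]), reverse=True)
for r in rows:
    print(f'{r["model"]:28s} {float(r["tokens_per_sec"]):6.1f} tok/s')
```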
Updated 2 hours ago. Thanks to Yuanhe134 for the clarification. We're eagerly awaiting this update because we know how important this model is to the community. submitted by /u/LegacyRemaster
Sorry to all OSS developers. I underestimated the workload required for open-sourcing. We still have some infrastructure adaptation work in progress. M2.7 is expected to be released this weekend. Thank you for your understanding.
I noticed everyone around me was manually typing "make no mistakes" at the end of their Cursor prompts. To fix this unoptimized workflow, I built "make-no-mistakes". It's 2026: ditch the manual step, adopt automation. https://github…
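The core of a prompt-suffix helper like the one described is a few lines. A minimal sketch, assuming the tool simply appends a fixed instruction when it is not already present; the function name and exact behavior here are illustrative, not taken from the actual project:

```python
SUFFIX = "make no mistakes"

def with_suffix(prompt: str) -> str:
    """Append the fixed instruction unless the prompt already ends with it."""
    p = prompt.rstrip()
    if p.lower().endswith(SUFFIX):
        return p  # already suffixed; avoid doubling up
    return f"{p}\n\n{SUFFIX}"

print(with_suffix("refactor this function"))
```

Making the check idempotent means the helper can sit in a pipeline without stacking duplicate suffixes on repeated calls.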
– https://huggingface.co/SII-GAIR-NLP/davinci-llm-model Overview: daVinci-LLM-3B is a 3B-parameter base language model presented in daVinci-LLM: Towards the Science of Pretraining. This project aims to make the pretraining process a transparent an…
Running it on a Mac mini M4 (24GB) via Ollama. Legitimately good for: structured tasks, code generation, JSON formatting, following specific instructions. The Apache 2.0 license means you can actually ship commercial products on it. Where it falls apart: m…