ggml: add Q1_0 1-bit quantization support (CPU) – 1-bit Bonsai models
Bonsai's 8B model is just 1.15GB, so CPU alone is more than enough. https://huggingface.co/collections/prism-ml/bonsai submitted by /u/pmttyji
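A quick back-of-the-envelope check of the headline numbers: a 1.15GB file for an 8B-parameter model works out to roughly 1.15 bits per weight, consistent with a ~1-bit quantization format. This is a rough sketch using the post's rounded figures; real GGUF files also carry unquantized tensors and metadata, so the true per-weight figure differs slightly.

```python
# Rounded figures from the post; not exact file-format accounting.
params = 8e9          # ~8 billion weights
file_bytes = 1.15e9   # ~1.15 GB on disk

bits_per_weight = file_bytes * 8 / params
print(f"{bits_per_weight:.2f} bits/weight")  # ~1.15
```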
Was expecting this for some time. This is available from build b8679 onwards. submitted by /u/pmttyji
So I got curious about how fast different models actually run on my M5 Air (32GB, 10-core CPU/10-core GPU). Instead of just testing one or two, I went through 37 models across 10 different families and recorded everything using llama-bench with Q4_K_M quantizati…
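Collating that many llama-bench runs into one ranked table is straightforward to script. A minimal sketch, assuming the results were exported to CSV with simplified, hypothetical column names "model" and "tokens_per_sec" (llama-bench's actual CSV header differs, so adjust the field names to your output; the sample numbers below are illustrative placeholders, not the poster's results):

```python
import csv, io

# Illustrative placeholder data standing in for exported benchmark rows.
raw = """model,tokens_per_sec
llama-3.2-3b-Q4_K_M,54.2
qwen2.5-7b-Q4_K_M,28.9
phi-4-Q4_K_M,21.3
"""

# Parse the rows and rank models from fastest to slowest generation speed.
rows = list(csv.DictReader(io.StringIO(raw)))
rows.sort(key=lambda r: float(r["tokens_per_sec"]), reverse=True)
for r in rows:
    print(f'{r["model"]:28s} {float(r["tokens_per_sec"]):6.1f} tok/s')
```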
Updated 2 hours ago. Thanks to Yuanhe134 for the clarification. We're eagerly awaiting this update because we know how important this model is to the community. submitted by /u/LegacyRemaster
Sorry to all OSS developers. I underestimated the workload required for open-sourcing. We still have some infrastructure adaptation work in progress. M2.7 is expected to be released this weekend. Thank you for your understanding.
I noticed everyone around me was manually typing "make no mistakes" at the end of their Cursor prompts. To fix this unoptimized workflow, I built "make-no-mistakes". It's 2026: ditch the manual step, adopt automation. https://github…
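The core of a prompt-suffix helper like the one described is a few lines. A minimal sketch, assuming the tool simply appends a fixed instruction when it is not already present; the function name and exact behavior here are illustrative, not taken from the actual project:

```python
SUFFIX = "make no mistakes"

def with_suffix(prompt: str) -> str:
    """Append the fixed instruction unless the prompt already ends with it."""
    p = prompt.rstrip()
    if p.lower().endswith(SUFFIX):
        return p  # already suffixed; avoid doubling up
    return f"{p}\n\n{SUFFIX}"

print(with_suffix("refactor this function"))
```

Making the check idempotent means the helper can sit in a pipeline without stacking duplicate suffixes on repeated calls.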
– https://huggingface.co/SII-GAIR-NLP/davinci-llm-model Overview: daVinci-LLM-3B is a 3B-parameter base language model presented in daVinci-LLM: Towards the Science of Pretraining. This project aims to make the pretraining process a transparent an…
Running it on a Mac mini M4 (24GB) via Ollama. Legitimately good for: structured tasks, code generation, JSON formatting, following specific instructions. The Apache 2.0 license means you can actually ship commercial products on it. Where it falls apart: m…