llama.cpp – llama-bench: add `-fitc` and `-fitt` to arguments
Was expecting this for some time. This is available from b8679 onwards. submitted by /u/pmttyji
So I got curious about how fast different models actually run on my M5 Air (32GB, 10 CPU/10 GPU). Instead of just testing one or two, I went through 37 models across 10 different families and recorded everything using llama-bench with Q4_K_M quantizati…
Updated 2 hours ago. Thanks to Yuanhe134 for the clarification. We're eagerly awaiting this update because we know how important this model is to the community. submitted by /u/LegacyRemaster
Sorry to all OSS developers. I underestimated the workload required for open-sourcing. We still have some infrastructure adaptation work in progress. M2.7 is expected to be released this weekend. Thank you for your understanding. submitt…
💎💎💎💎 submitted by /u/jacek2023
submitted by /u/abkibaarnsit
i noticed everyone around me was manually typing "make no mistakes" at the end of their cursor prompts. to fix this unoptimized workflow, i built "make-no-mistakes". it's 2026: ditch manual, adopt automation https://github…
https://huggingface.co/SII-GAIR-NLP/davinci-llm-model
Overview: daVinci-LLM-3B is a 3B-parameter base language model presented in "daVinci-LLM: Towards the Science of Pretraining". This project aims to make the pretraining process a transparent an…
running it on a mac mini m4 24gb via ollama.
legitimately good for: structured tasks, code generation, json formatting, following specific instructions. the apache 2.0 license means you can actually ship commercial products on it.
where it falls apart: m…
Quick specs. This is a workstation that was morphed into something LocalLLaMA-friendly over time:
3950X
96GB DDR4 (dual channel, running at 3000 MHz)
W6800 + RX 6800 (48GB of VRAM at ~512GB/s)
most tests done with ~20k context; kv-cache at q8_0
llama cp…