LocalLLaMA

EXAONE 4.5 released

/u/Secure_Smoke_4280 / April 9, 2026

https://huggingface.co/LGAI-EXAONE/EXAONE-4.5-33B https://huggingface.co/LGAI-EXAONE/EXAONE-4.5-33B-FP8 https://huggingface.co/LGAI-EXAONE/EXAONE-4.5-33B-GGUF

compiled a list of 2500+ vision benchmarks for VLMs

/u/batatibatata / April 9, 2026

I love reading benchmark / eval papers. It's one of the best ways to stay up to date with progress in Vision Language Models and to understand where they fall short. Vision tasks vary quite a lot from one to another. For example: vision tasks t…

Why do companies build open source models?

/u/Excellent_Koala769 / April 8, 2026

Hello, why do companies create open source models? They must allocate lots of resources toward this, but for what profit? If anything, doesn't it just pull users away from their paid/proprietary models?

What is Meta even doing right now?

/u/Ok-Internal9317 / April 8, 2026

Three years ago this sub was full of llama2 distillation discussions, then llama3.2 and phi3. What happened to them? The last thing I remember about llama was llama4 scout or something that didn't beat gemma, and then I saw no more of it 🙁

Gemma 4 seems to work best with high temperature for coding

/u/BigYoSpeck / April 8, 2026

I've been playing with Gemma 4 31B for coding tasks since it came out and have been genuinely impressed with how capable it is. With the benchmarks putting it a little behind Qwen3.5, I didn't have high expectations, but it's honestly been perfor…

Turbo-OCR for high-volume image and PDF processing

/u/Civil-Image5411 / April 8, 2026

I recently had to process ~940,000 PDFs. I started with the standard OCR tools, but the bottlenecks were frustrating: even on an RTX 5090, throughput was low. The Problem: PaddleOCR (the most popular open source OCR): Maxed out at ~15 img/s. GPU …

New TTS Model: VoxCPM2

/u/foldl-li / April 8, 2026

VoxCPM2 — Three Modes of Speech Generation: 🎨 Voice Design — Create a brand-new voice 🎛️ Controllable Cloning — Clone a voice with optional style guidance 🎙️ Ultimate Cloning — Reproduce every vocal nuance through audio continuation Demo https://hug…

Finally Abliterated Sarvam 30B and 105B!

/u/Available-Deer1723 / April 8, 2026

I abliterated Sarvam-30B and 105B – India's first multilingual MoE reasoning models – and found something interesting along the way! Reasoning models have 2 refusal circuits, not one. The <think> block and the final answer can disagree: the m…

win, wsl or linux?

/u/mon_key_house / April 8, 2026

Guys, I'm a win user and have been for ages. On my rig I thought hell, I'll give linux a try and a few months back started the software side with win11 and wsl, since all recommendations were pointing towards linux. Fast forward 4 months of slu…

Just bought a DGX Spark, what kind of VLMs are you guys running on this kind of hardware?

/u/gymho69 / April 8, 2026

We recently purchased a DGX Spark with 128 GB RAM to run multimodal LLMs. I'd like to hear how people are getting the most out of this kind of hardware.