Author name: /u/reto-wyss

LocalLLaMA

nvidia/Gemma-4-26B-A4B-NVFP4

Can confirm it works on a 5090: with an 80% memory allocation (of 32 GB) I got around 50k context. The weights are 18.8 GB.

| Benchmark | Baseline (Full Precision) | NVFP4 |
|---|---|---|
| GPQA Diamond | 80.30% | 79.90% |
| AIME 2025 | 88.95% | 90.00% |
| MMLU Pro | 85.00% | 84.80% |
| LiveCodeBench | … | … |
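The memory arithmetic behind the ~50k-context claim can be sketched out. This is a rough estimate, not the author's method: the per-token KV-cache size depends on the model's architecture and KV precision, which the post doesn't state, so the sketch only computes the memory budget left over after loading the weights.

```python
# Rough serving-memory budget, using the numbers from the post:
# a 32 GB card (RTX 5090), 80% allocation, 18.8 GB of NVFP4 weights.
total_gb = 32.0
alloc_fraction = 0.80   # e.g. vLLM's --gpu-memory-utilization
weights_gb = 18.8

budget_gb = total_gb * alloc_fraction    # memory the server may claim
kv_cache_gb = budget_gb - weights_gb     # left for KV cache + activations

print(f"{budget_gb:.1f} GB budget, {kv_cache_gb:.1f} GB left for KV cache")
# → 25.6 GB budget, 6.8 GB left for KV cache
```

That ~6.8 GB headroom is what bounds the usable context length; raising the allocation fraction (if your display and other processes leave room) buys more context.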
