The chat connects to Gemma-4-E4B-IT running on my workstation via vLLM. Qwen had no problems getting all the OpenAI compatibility stuff right. I may keep using it over the 122b-a10b (fp8) for coding, but it's not as good at more creative stuff, where the 122b-a10b was an extremely good all-round balance for my setup. Let's hope they drop a 3.6 version of the 122b-a10b. I like the small Gemma as well. It has strong "small model" vibes, but I can see myself using it for "running errands".
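For anyone wanting to replicate the setup: serving a local model behind vLLM's OpenAI-compatible API looks roughly like this. The model ID and port below are placeholders, not my exact config.

```shell
# Serve a model with vLLM's OpenAI-compatible server
# (MODEL and the port are placeholder assumptions, not the exact setup above)
MODEL="your/hf-model-id"
vllm serve "$MODEL" --port 8000

# Any OpenAI-compatible client can then talk to the standard
# /v1/chat/completions endpoint:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "{
        \"model\": \"$MODEL\",
        \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]
      }"
```

Because the endpoint speaks the OpenAI wire format, most chat frontends only need the base URL pointed at `http://localhost:8000/v1` to work.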