LocalLLaMA

Qwen 3.6 35B: different quant speeds?

https://preview.redd.it/bixb4erga2wg1.png?width=1464&format=png&auto=webp&s=2df10ab305a5cf4c4252496ec3df34422359066b This is on an RTX 3090, llama.cpp main, Arch Linux. So what is everybody's experience so far? I've tested a few qu…

Qwen 3.6 CoT issue?

So the Qwen vocab has distinct tokens for <think> and </think>. I know this because an app I wrote pushes those tokens to the cache after <|im_start|>assistant to stop CoT selectively. Great. Yesterday I was fucking around with some c…
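
For anyone unfamiliar with the trick described above, here is a rough sketch of what prefilling an empty think block looks like at the prompt-string level. This is not the poster's app: `build_prompt` and `suppress_cot` are hypothetical names, and the exact whitespace around the think tags is an assumption that may differ from the model's official chat template.

```python
# Minimal sketch: build a ChatML-style prompt and optionally prefill an
# empty think block so the model starts its reply after </think>.
# build_prompt and suppress_cot are hypothetical names for illustration.

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"

# Assumed whitespace layout for the empty block; verify against the
# model's own chat template before relying on it.
EMPTY_THINK = "<think>\n\n</think>\n\n"


def build_prompt(messages, suppress_cot=False):
    """messages is a list of (role, content) tuples."""
    parts = []
    for role, content in messages:
        parts.append(f"{IM_START}{role}\n{content}{IM_END}\n")
    # Open the assistant turn, then optionally close the CoT before it starts.
    parts.append(f"{IM_START}assistant\n")
    if suppress_cot:
        parts.append(EMPTY_THINK)
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("user", "What is 2 + 2?")], suppress_cot=True))
```

The string would then go to whatever raw completion interface your backend exposes, tokenized with special-token parsing enabled so the think tags map to their dedicated token IDs instead of being split into plain text pieces.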
