Qwen 3.6 q8 at 50 t/s or q4 at 112 t/s?

How would you think about choosing between the two for use in a harness like pi?

I did a good bit of work with q4 yesterday, and it was consistent and reliable. I had it set to 131k context, and it worked through two compactions on a clearly defined task without messing the whole thing up. I'm very excited about this recent step forward.

I'm going to start working with the q8 some today, but I'm interested in your impressions of the kinds of differences I might expect between the two.

submitted by /u/GotHereLateNameTaken