I've seen that a lot of you use heavily quantised models with 30-something billions, sometimes even MoE, and it got me wondering: what are the real gains? (excluding privacy and the fact that it probably feels just better to actually own the infrastructure)
But in a performance way, don't you feel a gap with leading models? And how do you feel about that gap?
[ I've been a member of this sub for quite a bit and I admire the pure passion that you guys express from your posts, hopefully in not too much I'll have the possibility to have a personal setup. ]
[link] [comments]