Running Qwen 3.6 35B A3B on 2x 5060 Ti

I ran Qwen 3.6 35B A3B on two 5060 Ti 16 GB cards (32 GB VRAM total; I also have 32 GB of DRAM, but I don't like offloading). I used Q4 in LM Studio to get full context, and I get 90 t/s. Any tricks to optimize this further so I can upgrade to Q6 or Q8?
Thanks!
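
For context, here is a rough back-of-envelope check of whether Q6 or Q8 weights would even fit in 32 GB. This is only a sketch under assumptions: the parameter count is taken from the "35b" in the model name, the bits-per-weight figures are approximate values for llama.cpp-style quants (Q4_K_M, Q6_K, Q8_0), and it ignores KV cache, compute buffers, and the GB vs GiB difference.

```python
# Rough weight-size estimate: parameters * bits-per-weight / 8 bytes.
# All constants below are assumptions for illustration, not measured values.
PARAMS = 35e9    # assumed from the "35b" in the model name
VRAM_GB = 32.0   # 2 x 16 GB cards

# Approximate bits per weight for llama.cpp-style quants (assumption).
BPW = {"Q4_K_M": 4.85, "Q6_K": 6.56, "Q8_0": 8.5}


def weights_gb(params: float, bpw: float) -> float:
    """Estimated size of the quantized weights in GB (10^9 bytes)."""
    return params * bpw / 8 / 1e9


def headroom_gb(quant: str) -> float:
    """VRAM left over for KV cache and overhead after loading the weights."""
    return VRAM_GB - weights_gb(PARAMS, BPW[quant])


for quant in BPW:
    size = weights_gb(PARAMS, BPW[quant])
    print(f"{quant}: ~{size:.1f} GB weights, {headroom_gb(quant):+.1f} GB headroom")
```

If these assumptions are close, Q6 would leave only a few GB for context, and Q8 would not fit in VRAM without offloading.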

Another thing: can you recommend something for cooling? I'm using two stacked GPUs with zero gap between them (I have an mATX motherboard). Right now the second GPU isn't that hot, but it's hotter than the bottom one.

submitted by /u/chocofoxy