Strip Qwen3.6 dense of its multimodal capabilities

This may be naive, but if we stripped a model of its image- and voice-processing capabilities, would that make it smaller or faster? Is that even possible? Does it vary between MoE and dense models?

If it is, why isn't it done on popular models?

submitted by /u/redblood252