Q: Does DFlash (and PFlash) work with Heretic models?
Z-Lab did some good work on speeding up output, while Luce managed to use smaller models of the same family to accelerate prefill. Since Heretic and other "smart ablation" tools can decensor a model, would they work with these multi-model…