Research Log: Monet/PEER sparse experts
I’ve been looking into the Monet/PEER sparse expert papers. I think these ideas have a lot of potential for interpretability-by-design.
Some of what I’ve done so far:
Quantization experiments: PEER can be losslessly distilled to int8 and distil…