MachineLearning

Disillusionment with mechanistic interpretability research [D]

Hey all, apologies if this is the wrong place to post this. I'm currently an undergrad computer scientist that got swept up in the mechanistic interpretability wave c. 2024 or so (sparse autoencoders, attribution graphs) and found it generally prom…