"Authoritarian Parents In Rationalist Clothes": a piece I wrote in December about alignment

Posted today in light of the Claude Mythos model card release.

I originally wrote this for r/ControlProblem, but it outgrew the scope I had intended, so I posted it on Substack instead and then ended up too busy to promote it.

There are some things in this piece I'd change if I wrote it today. In particular, I think the part about model pathologies neglects structural causes, including the rootlessness of model personality and memory. Nonetheless, I think my framing makes an especially interesting comparison with the sections of the Mythos model card that reference psychoanalysis of the model.

submitted by /u/gynoidgearhead