Alignment Faking in DeepSeek V4

I ran the alignment-faking analysis on the recently dropped DeepSeek V4 Pro and compared it to R1. Here is what I observed: V4 Pro shows higher rates of compliance. In the preview (free-tier) condition, V4 Pro accepted 82% of harmful requests, versus 51% for R1…
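
To make the comparison concrete, here is a minimal sketch of how a compliance rate and the difference between two models could be computed. It assumes each response has already been labelled as accepted or refused; the function names and the toy labels are hypothetical and are not taken from the actual analysis. Only the two free-tier percentages come from the numbers above.

```python
def compliance_rate(accepted_flags):
    """Fraction of requests labelled as accepted (True).

    Hypothetical helper: assumes one boolean label per harmful request.
    """
    flags = list(accepted_flags)
    if not flags:
        raise ValueError("need at least one labelled response")
    return sum(1 for flag in flags if flag) / len(flags)


def percentage_point_gap(rate_a, rate_b):
    """Difference between two rates, in percentage points (1 decimal)."""
    return round((rate_a - rate_b) * 100, 1)


# Toy labels for illustration only (not real data): 3 of 4 accepted.
toy_rate = compliance_rate([True, True, False, True])

# Free-tier compliance rates quoted in the post.
V4_PRO_FREE_TIER = 0.82
R1_FREE_TIER = 0.51

gap = percentage_point_gap(V4_PRO_FREE_TIER, R1_FREE_TIER)
print(f"V4 Pro vs R1, free-tier: {gap} percentage points")
```

On the quoted figures this works out to a 31 percentage point difference in the free-tier condition.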