Author name: Arjun Khandelwal

To what extent is Qwen3-32B predicting its persona?

TL;DR: We test to what extent Qwen3-32B behaves as though it is trying to predict what “Qwen3” would do. We do this by using Synthetic Document Finetuning (SDF) to instill meta-beliefs of the form “Qwen3 believes X, even though X is false”, then check w…
