Using Base-LCM to Monitor LLMs
Epistemic status: experimental results. This is exploratory work examining an alternative approach to interpreting language models.

Summary

We aim to determine whether the LCM model, which predicts sentence embeddings rather than token embed…