Buu Phan, Ashish Khisti, Karen Ullrich

Cross-Tokenizer Likelihood Scoring Algorithms for Language Model Distillation

Buu Phan, Ashish Khisti, Karen Ullrich / May 7, 2026

arXiv:2512.14954v2
Abstract: Computing next-token likelihood ratios between two language models (LMs) is a standard task in training paradigms such as knowledge distillation. Since this requires both models to share the sa…
