Sho Takase, Ukyo Honda

Toward LLMs Beyond English-Centric Development

Sho Takase, Ukyo Honda / May 18, 2026

arXiv:2605.15613v1
Abstract: Through an analysis of sequences generated by open-weight large language models (LLMs), we demonstrate that LLMs are heavily biased toward English. While continual pre-training is commonly used to adapt …
