cs.AI, cs.CL

Advancing Polish Language Modeling through Tokenizer Optimization in the Bielik v3 7B and 11B Series

arXiv:2604.10799v1 Announce Type: new
Abstract: The development of the Bielik v3 PL series, which comprises 7B- and 11B-parameter variants, represents a significant milestone in language-specific large language model (LLM) optimizati…