Author name: Krzysztof Fonal

Cross-Family Speculative Decoding for Polish Language Models on Apple~Silicon: An Empirical Evaluation of Bielik~11B with UAG-Extended MLX-LM

Krzysztof Fonal / April 22, 2026

arXiv:2604.16368v2 Announce Type: replace
Abstract: Speculative decoding accelerates LLM inference by using a small draft model to propose k candidate tokens for a target model to verify. While effective for same-tokenizer pairs on high-bandwidth GPUs…

cs.CL

Cross-Family Speculative Decoding for Polish Language Models on Apple~Silicon: An Empirical Evaluation of Bielik~11B with UAG-Extended MLX-LM

Krzysztof Fonal / April 21, 2026

arXiv:2604.16368v1 Announce Type: new
Abstract: Speculative decoding accelerates LLM inference by using a small draft model to propose k candidate tokens for a target model to verify. While effective for same-tokenizer pairs on high-bandwidth GPUs, it…