cs.CL, cs.LG

Scratchpad Patching: Decoupling Compute from Patch Size in Byte-Level Language Models

arXiv:2605.09630v1
Abstract: Tokenizer-free language models eliminate the tokenizer step of the language modeling pipeline by operating directly on bytes; patch-based variants further aggregate contiguous byte spans into patches for…
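
To make the byte and patch terminology concrete, here is a minimal sketch of the general idea of grouping a raw byte stream into contiguous patches. It assumes fixed-size patches, which is an illustrative simplification and not necessarily the scheme used in the paper; the function name `to_byte_patches` and the parameter `patch_size` are hypothetical.

```python
def to_byte_patches(text: str, patch_size: int = 4) -> list[bytes]:
    """Encode text as UTF-8 and split the byte stream into contiguous patches.

    Assumption: fixed-size patches, chosen only for illustration. The last
    patch may be shorter than patch_size.
    """
    if patch_size < 1:
        raise ValueError("patch_size must be positive")
    data = text.encode("utf-8")  # no tokenizer: the model input is raw bytes
    return [data[i:i + patch_size] for i in range(0, len(data), patch_size)]


patches = to_byte_patches("byte-level", patch_size=4)
print(patches)
```

In a setup like this, a larger `patch_size` yields fewer patches for the same input, which gives a rough sense of why patch size and the amount of computation per sequence are usually tied together.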