Avinash Goutham Aluguvelly

When Informal Text Breaks NLI: Tokenization Failure, Distribution Shift, and Targeted Mitigations

Avinash Goutham Aluguvelly / April 21, 2026

arXiv:2604.16787v1 Announce Type: new
Abstract: We study how informal surface forms degrade NLI accuracy in ELECTRA-small (14M) and RoBERTa-large (355M) across four transforms applied to SNLI and MultiNLI: slang substitution, emoji replacement, Gen-Z …

