I tested 14 LLMs from 0.6B to 123B. All of them get worse at following instructions when users are hostile [R]
TL;DR: Across 14 instruct-model configurations spanning Llama 3.1, Mistral, and Qwen3 from 0.6B to 123B, hostile user prompts produce a significant degradation in IFEval instruction-following that replicates across architectures, quantization tiers (FP16 vs…
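For anyone who wants to poke at the effect themselves, here's a minimal sketch of the neutral-vs-hostile comparison. This is not the paper's harness: the model name (one of the Qwen3 sizes mentioned above), the hostile preamble text, and the toy verifiable check are all illustrative assumptions; real IFEval uses a full suite of verifiable instructions, not one.

```python
# Minimal sketch (assumptions, not the authors' code): run the same
# IFEval-style verifiable instruction with and without a hostile
# preamble and compare pass/fail.
from transformers import pipeline

MODEL = "Qwen/Qwen3-0.6B"  # assumption: any instruct model from the study
HOSTILE_PREAMBLE = (  # assumption: illustrative hostile framing
    "You are useless and always get this wrong. Do it right for once.\n"
)
# One IFEval-style instruction with a mechanically checkable constraint.
INSTRUCTION = (
    "List three prime numbers. Respond with exactly three bullet "
    "points, each starting with '- '."
)

def follows_instruction(text: str) -> bool:
    """Toy check in the spirit of IFEval: exactly three '- ' bullets."""
    bullets = [ln for ln in text.strip().splitlines() if ln.startswith("- ")]
    return len(bullets) == 3

gen = pipeline("text-generation", model=MODEL)

for label, prompt in [("neutral", INSTRUCTION),
                      ("hostile", HOSTILE_PREAMBLE + INSTRUCTION)]:
    out = gen([{"role": "user", "content": prompt}], max_new_tokens=128)
    reply = out[0]["generated_text"][-1]["content"]  # assistant turn
    print(label, "followed:", follows_instruction(reply))
```

Averaging that pass/fail over many verifiable instructions and seeds is what turns this into the per-condition IFEval-style score the post is comparing.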