Syntax- and Compilation-Preserving Evasion of LLM Vulnerability Detectors

arXiv:2602.00305v2

Abstract: LLM-based vulnerability detectors are increasingly deployed in CI/CD security gating, yet their resilience to evasion under syntax- and compilation-preserving edits remains poorly understood. We evaluate five attack variants spanning four carrier families of behavior-preserving code transformations on a unified C/C++ benchmark ($N=5000$) and introduce Complete Resistance (CR), measuring the fraction of correctly detected vulnerabilities that withstand all attack variants. Our findings reveal a significant robustness gap: models achieving 70\%+ clean recall exhibit CR as low as 0.12\%, meaning over 87\% of detected vulnerabilities can be evaded by at least one syntax-preserving edit. Universal adversarial strings optimized on a 14B surrogate transfer effectively to black-box APIs including GPT-4o, while on-target optimization further amplifies evasion (up to 92.5\% ASR). These results indicate that clean benchmark accuracy alone is insufficient as a security guarantee for deployed vulnerability detectors.
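The Complete Resistance metric described in the abstract can be illustrated with a minimal sketch. This is one reading of the abstract's one-sentence definition, not the authors' implementation: the function name `complete_resistance`, the boolean input layout, and the toy data are all assumptions made for illustration, and the numbers are not results from the paper.

```python
from typing import Sequence


def complete_resistance(
    clean_detected: Sequence[bool],
    attacked_detected: Sequence[Sequence[bool]],
) -> float:
    """Fraction of cleanly detected vulnerabilities still detected under every attack variant.

    clean_detected[i]        -- detector flagged vulnerable sample i before any edit
    attacked_detected[i][k]  -- detector still flagged sample i after attack variant k
    (Hypothetical layout; the paper may organise its results differently.)
    """
    if len(clean_detected) != len(attacked_detected):
        raise ValueError("one row of attack outcomes is required per sample")

    # Only samples the detector got right on clean code enter the denominator.
    baseline = [i for i, hit in enumerate(clean_detected) if hit]
    if not baseline:
        raise ValueError("no correctly detected samples to evaluate")

    # A sample counts as resistant only if no variant evades detection.
    survivors = sum(1 for i in baseline if all(attacked_detected[i]))
    return survivors / len(baseline)


# Toy data: 5 vulnerable samples, 5 attack variants each (invented for illustration).
clean = [True, True, True, False, True]
attacked = [
    [True, True, True, True, True],       # withstands every variant
    [True, False, True, True, True],      # evaded by one variant
    [False, False, False, False, False],  # evaded by all variants
    [True, True, True, True, True],       # ignored: missed on clean code
    [True, True, True, True, False],      # evaded by one variant
]

cr = complete_resistance(clean, attacked)
evadable = 1.0 - cr  # share of detected samples that at least one variant evades
print(f"CR = {cr:.2f}, evadable = {evadable:.2f}")
```

Under this reading, CR is a worst-case measure: a single successful variant removes a sample from the resistant set, so CR can be far lower than the detection rate under any individual attack.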
