cs.AI, cs.LG

WARP: Guaranteed Inner-Layer Repair of NLP Transformers

arXiv:2604.00938v1 Announce Type: cross
Abstract: Transformer-based NLP models remain vulnerable to adversarial perturbations, yet existing repair methods face a fundamental trade-off: gradient-based approaches offer flexibility but lack verifiability…