Adversarial training methods for semi-supervised text classification