Irina Proskurina, Marc-Antoine Carpentier, Julien Velcin

HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection

Irina Proskurina, Marc-Antoine Carpentier, Julien Velcin / April 7, 2026

arXiv:2511.06391v3
Abstract: Optimization of offensive content moderation models for different types of hateful messages is typically achieved through continued pre-training or fine-tuning on new hate speech benchmarks. However,…
