cs.AI, cs.CL

Can We Locate and Prevent Stereotypes in LLMs?

arXiv:2604.19764v1 Announce Type: new
Abstract: Stereotypes in large language models (LLMs) can perpetuate harmful societal biases. Despite the widespread use of these models, little is known about where such biases reside within the neural network. This study…