Decoding-Time Debiasing via Process Reward Models: From Controlled Fill-in to Open-Ended Generation
arXiv:2605.02348v1 Announce Type: cross
Abstract: Large language models absorb social biases from their training data and carry them into downstream applications, often reinforcing stereotypes around gender, race, religion, disabilit…