Nakyung Lee, Sangwoo Hong, Jungwoo Lee

Efficient Process Reward Modeling via Contrastive Mutual Information

Nakyung Lee, Sangwoo Hong, Jungwoo Lee / April 14, 2026

arXiv:2604.10660v1 Announce Type: new
Abstract: Recent research has devoted considerable effort to verifying the intermediate reasoning steps of chain-of-thought (CoT) trajectories using process reward models (PRMs) and other verifier models. However,…
