Profit-Aligned CATE Estimation: Reconciling Policy Learning and Inference
arXiv:2512.13400v2 Announce Type: replace-cross
Abstract: We propose a framework that aligns Conditional Average Treatment Effect (CATE) estimation with profit maximization. Our method recognizes that, for customers with extreme treatment effects, additional estimation accuracy is unlikely to change the recommended actions. In contrast, accuracy is critical near the decision boundary, where treatment effects are close to treatment costs. Our approach optimizes a novel objective function that concentrates learning capacity along this boundary. The proposed objective is Fisher consistent with respect to the original profit function and yields a consistent estimator for CATEs. Theoretically, our framework unifies standard plug-in optimization and direct policy optimization as limiting cases of the same optimization problem. We further show that entropy-regularized policy optimization is a special case of our framework. This result has a direct practical implication: firms can recover consistent CATE estimates from existing profit-maximization pipelines. We use synthetic data to demonstrate how the proposed framework allows firms to explicitly navigate the trade-off between global prediction accuracy and profit maximization.
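The claim that accuracy matters mainly near the decision boundary can be illustrated with a small simulation. This is a minimal sketch and not the paper's objective or estimator: the CATE distribution, the treatment cost of 1.0, the boundary width of 0.5, and the bounded-noise error model are all assumptions chosen for illustration, and the names plug_in_policy and profit are hypothetical helpers.

```python
import numpy as np


def plug_in_policy(tau_hat, cost):
    """Treat a customer when the estimated effect exceeds the treatment cost."""
    return tau_hat > cost


def profit(policy, tau, cost):
    """Average profit per customer; each treated customer contributes tau - cost."""
    return np.mean(policy * (tau - cost))


rng = np.random.default_rng(0)
n = 10_000
cost = 1.0                                      # assumed constant treatment cost
tau = rng.normal(loc=1.0, scale=2.0, size=n)    # assumed true CATEs
noise = rng.uniform(-0.4, 0.4, size=n)          # assumed bounded estimation error

# Customers whose true effect lies within 0.5 of the cost (the boundary region).
near = np.abs(tau - cost) < 0.5

# Two hypothetical estimators: one errs only far from the boundary,
# the other errs only near it.
tau_hat_far_noise = np.where(near, tau, tau + noise)
tau_hat_near_noise = np.where(near, tau + noise, tau)

oracle_profit = profit(plug_in_policy(tau, cost), tau, cost)
profit_far_noise = profit(plug_in_policy(tau_hat_far_noise, cost), tau, cost)
profit_near_noise = profit(plug_in_policy(tau_hat_near_noise, cost), tau, cost)

mse_far_noise = np.mean((tau_hat_far_noise - tau) ** 2)
mse_near_noise = np.mean((tau_hat_near_noise - tau) ** 2)

print(f"oracle profit:        {oracle_profit:.4f}")
print(f"far-noise estimator:  profit {profit_far_noise:.4f}, MSE {mse_far_noise:.4f}")
print(f"near-noise estimator: profit {profit_near_noise:.4f}, MSE {mse_near_noise:.4f}")
```

Under these assumptions, the estimator whose errors sit away from the boundary has the larger mean squared error yet recommends exactly the oracle actions, because an error smaller than 0.4 cannot flip a decision whose margin is at least 0.5. The estimator whose errors sit near the boundary has the smaller mean squared error but loses profit, since some of its decisions flip.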