Introducing the O-Value: A Universal Standardization for Confusion-Matrix-Based Classification Performance Metrics
arXiv:2505.07033v2 Announce Type: replace
Abstract: Many classification performance metrics exist, each suited to a specific application. However, these metrics often differ in scale and can exhibit varying sensitivity to class imbalance rates in the test set. As a result, it is difficult to use the nominal values of these metrics to evaluate, compare, and monitor classification performance, especially when imbalance rates vary. To address this problem, we introduce the outperformance standardization (OPS) function, a universal standardization method for confusion-matrix-based classification performance (CMBCP) metrics. It maps any given metric to a common scale of $[0,1]$ while providing a clear and consistent interpretation. Specifically, the resulting OPS value (o-value) represents the percentile rank of the observed classification performance within a reference distribution of possible performances. This unified framework enables meaningful comparison and monitoring of classification performance across test sets with differing imbalance rates. We illustrate how o-values can be applied to a variety of commonly used classification performance metrics and demonstrate the utility and robustness of our method through experiments on real-world datasets spanning multiple classification applications.
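A minimal sketch of the percentile-rank interpretation stated above, assuming a reference distribution $\mathcal{R}$ over achievable values of the chosen metric and writing $m_{\mathrm{obs}}$ for the observed metric value (the symbols $\mathcal{R}$, $F_{\mathcal{R}}$, and $m_{\mathrm{obs}}$ are illustrative notation, not the paper's own):
\[
o \;=\; F_{\mathcal{R}}(m_{\mathrm{obs}}) \;=\; \Pr_{M \sim \mathcal{R}}\!\left(M \le m_{\mathrm{obs}}\right) \;\in\; [0,1],
\]
so that the o-value is read as the fraction of reference performances that the observed performance matches or outperforms, independent of the metric's original scale.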