ScoringBench: A Benchmark for Evaluating Tabular Foundation Models with Proper Scoring Rules
arXiv:2603.29928v1 Announce Type: new
Abstract: Tabular foundation models such as TabPFN and TabICL already produce full predictive distributions, yet prevailing regression benchmarks evaluate them almost exclusively via point-estimate metrics (RMSE, R²) …
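
To illustrate the gap the abstract describes, here is a minimal sketch comparing a point-estimate metric with a proper scoring rule. It rests on assumptions that are not taken from the paper: predictive distributions are modelled as Gaussians, the continuous ranked probability score (CRPS) stands in as the proper scoring rule, and the function names and toy numbers are hypothetical. Two forecasters share the same predictive means but differ in predictive spread, so RMSE cannot tell them apart while the CRPS can.

```python
import math

def gaussian_crps(mu, sigma, y):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) at outcome y (lower is better)."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density at z
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF at z
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

def rmse(means, targets):
    """Root mean squared error of the predictive means; ignores predictive spread."""
    return math.sqrt(sum((m - t) ** 2 for m, t in zip(means, targets)) / len(targets))

def mean_crps(means, sigmas, targets):
    """Average Gaussian CRPS over a dataset."""
    scores = [gaussian_crps(m, s, t) for m, s, t in zip(means, sigmas, targets)]
    return sum(scores) / len(scores)

# Hypothetical toy data: identical means, very different stated uncertainty.
targets = [1.0, 2.0, 3.0, 4.0]
means = [1.1, 1.9, 3.2, 3.8]
sharp_sigmas = [0.2] * 4   # confident forecaster
vague_sigmas = [5.0] * 4   # overly uncertain forecaster

rmse_sharp = rmse(means, targets)
rmse_vague = rmse(means, targets)  # same means, so the same RMSE
crps_sharp = mean_crps(means, sharp_sigmas, targets)
crps_vague = mean_crps(means, vague_sigmas, targets)

print(f"RMSE (both forecasters): {rmse_sharp:.3f}")
print(f"mean CRPS, sharp: {crps_sharp:.3f}  vague: {crps_vague:.3f}")
```

Because RMSE depends only on the means, it scores both forecasters identically, whereas the CRPS penalises the needlessly wide distribution. Real tabular foundation models need not emit Gaussian outputs, so the closed form used here is only a convenient stand-in.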