Xu Zheng, Feiyu Wu, Linhong Wu, Zhuocheng Wang, Hui Li

SCARV: Structure-Constrained Aggregation for Stable Sample Ranking in Redundant NLP Datasets

Xu Zheng, Feiyu Wu, Linhong Wu, Zhuocheng Wang, Hui Li / May 5, 2026

arXiv:2605.00944v1 Announce Type: cross
Abstract: Sample-level rankings are increasingly used in data-centric NLP for analysis, filtering, debugging, and curation, yet existing pipelines typically score training examples pointwise and rank them as if …
