DenoGrad: A Gradient-Based Framework for Data Refinement in Tabular and Time-Series Learning
arXiv:2511.10161v2
Abstract: In the Data-Centric Artificial Intelligence (AI) paradigm, improving data quality is essential for robust machine learning. However, many denoising methods rely on rigid statistical assumptions or require clean reference data, which limits their applicability in real-world scenarios. In this work, we propose DenoGrad, a gradient-based framework for data refinement that leverages a pretrained neural network to iteratively correct noisy observations by optimizing the input space while keeping the model fixed. DenoGrad is applicable to both tabular regression and time-series forecasting, and incorporates a consensus-based strategy to ensure temporally coherent updates in sequential settings. Experiments on ten real-world datasets show that the proposed approach yields consistent improvements in downstream predictive performance while preserving the statistical structure of the data, as measured by distributional and correlation-based metrics. In addition, DenoGrad can improve generalization in nominally clean datasets, acting as a form of dataset-level regularization. These results support model-guided data refinement as a practical component of data-centric machine learning workflows. Code is available at: https://github.com/ari-dasci/S-DenoGrad.
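The core idea in the abstract, correcting noisy observations by taking gradient steps on the inputs while the trained model stays fixed, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a frozen linear model in place of a pretrained neural network so the input gradient can be written by hand, a squared-error loss, and updates to the features only. The names `predict`, `refine_inputs`, and `consensus_deltas` are hypothetical. The abstract does not say how the consensus strategy works; averaging the corrections proposed by overlapping windows is one plausible reading and is an assumption here.

```python
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Frozen stand-in model f(x) = w . x + b (assumption: linear, not a neural net)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def refine_inputs(
    weights: Sequence[float],
    bias: float,
    x_noisy: Sequence[float],
    y: float,
    lr: float = 0.05,
    steps: int = 50,
) -> List[float]:
    """Gradient descent on the input x; the model parameters are never changed."""
    x = list(x_noisy)
    for _ in range(steps):
        residual = predict(weights, bias, x) - y
        # d/dx (f(x) - y)^2 = 2 * residual * w for the linear stand-in model.
        x = [xi - lr * 2.0 * residual * w for xi, w in zip(x, weights)]
    return x


def consensus_deltas(
    series_len: int, window_deltas: Sequence[Tuple[int, Sequence[float]]]
) -> List[float]:
    """Average per-timestep corrections proposed by overlapping windows.

    Assumption: 'consensus' is modeled as a plain mean over every window
    that covers a timestep. Each entry is (start_index, deltas_in_window).
    """
    totals = [0.0] * series_len
    counts = [0] * series_len
    for start, deltas in window_deltas:
        for offset, delta in enumerate(deltas):
            totals[start + offset] += delta
            counts[start + offset] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


if __name__ == "__main__":
    w, b = [2.0, -1.0], 0.5
    refined = refine_inputs(w, b, x_noisy=[1.4, 2.7], y=-0.5)
    print("refined input:", refined)
    print("prediction after refinement:", predict(w, b, refined))
    # Two length-3 windows over a length-4 series, overlapping at t=1 and t=2.
    print(consensus_deltas(4, [(0, [1.0, 1.0, 1.0]), (1, [3.0, 3.0, 3.0])]))
```

With a real network, the hand-written gradient would be replaced by automatic differentiation with respect to the input. The step size and number of iterations shown here are arbitrary illustration values, not settings from the paper.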