ML Code Smells: From Specification to Detection
arXiv:2509.20491v2 Announce Type: replace-cross
Abstract: The rapid adoption of Artificial Intelligence (AI) is increasingly realised through Machine Learning (ML) pipelines that integrate data preprocessing, model training, evaluation scripts, and configuration-heavy experimentation code. In these ML-based systems, small and often overlooked implementation choices can silently compromise experimental reproducibility, robustness to data and environment changes, and maintainability. We study ML code smells: recurring implementation patterns that can undermine reproducibility, robustness, and maintainability, for example by inducing silent failures or data leakage. We present SpecDetect4ML, a specification-driven detection tool that combines a declarative Domain-Specific Language (DSL) with a scalable analysis engine backed by Code Property Graphs (CPGs). Unlike state-of-the-art (SOTA) analysers, which rely on hand-coded, per-rule local pattern checks, our DSL expresses smells as executable specifications built from reusable predicates, while the CPG analysis enables project-level reasoning over syntactic, control-flow, and data-flow relations. This CPG-based reasoning outperforms Abstract Syntax Tree (AST)-only analysers while remaining scalable and practical in analysis time. We specified 22 ML code smells and evaluated SpecDetect4ML on 890 ML-based systems. SpecDetect4ML achieves 95.82% precision and 88.14% recall on a unified ground truth, outperforming SOTA analysers in both effectiveness and coverage and providing an extensible foundation for detecting non-local, flow-dependent ML code smells.
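To make the "smells as executable specifications" idea concrete, the sketch below composes two small reusable predicates into a check for a common reproducibility smell (calling train_test_split without a fixed random_state). This is a hypothetical illustration only, assuming a simple AST-level predicate style; it is not the paper's actual DSL or CPG engine, which additionally reasons over control- and data-flow relations, and all function names here are invented for the example.

```python
# Hypothetical sketch: an ML code smell written as an executable specification
# built from reusable predicates. AST-level only; no CPG or data-flow reasoning.
import ast


def is_call_to(node, names):
    """Predicate: node is a call whose callee name is in `names`."""
    if not isinstance(node, ast.Call):
        return False
    func = node.func
    callee = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
    return callee in names


def lacks_keyword(node, keyword):
    """Predicate: the call has no keyword argument named `keyword`."""
    return all(kw.arg != keyword for kw in node.keywords)


def missing_random_state(tree):
    """Smell specification: train_test_split called without random_state,
    a frequent source of non-reproducible experiments."""
    return [
        node.lineno
        for node in ast.walk(tree)
        if is_call_to(node, {"train_test_split"}) and lacks_keyword(node, "random_state")
    ]


if __name__ == "__main__":
    snippet = (
        "from sklearn.model_selection import train_test_split\n"
        "X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2)\n"
    )
    print(missing_random_state(ast.parse(snippet)))  # -> [2]
```

Expressing smells this way keeps each rule declarative and composable: new smells reuse existing predicates rather than duplicating pattern-matching code, which is the design benefit the abstract attributes to the DSL approach.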