Aligning Inductive Bias for Data-Efficient Generalization in State Space Models

arXiv:2509.20789v4

Abstract: The remarkable success of modern AI has been closely tied to scaling laws, yet the finite supply of high-quality data makes data efficiency (learning more from less) an increasingly important frontier. A model's inductive bias is a critical lever for data efficiency, but foundational sequence models such as State Space Models (SSMs) often rely on fixed, task-agnostic biases. When this fixed prior is misaligned with the underlying structure of a task, the model may require additional samples to overcome its own bias before learning the relevant signal. In this work, we introduce a principled framework for understanding and aligning the inductive bias of linear time-invariant SSMs. We first formalize this bias through an SSM-induced kernel and show theoretically and empirically that its spectrum is governed by the model's frequency response. This characterization motivates Task-Dependent Initialization (TDI), a fast power-spectrum matching method that aligns the initial SSM bias with the task's spectral characteristics before downstream training. Across controlled synthetic experiments, trainable one-layer SSMs, and deep SSMs on diverse real-world benchmarks, TDI can improve data-efficient generalization primarily when task-relevant spectral structure is present and the default SSM bias is spectrally mismatched. Our results provide both a theoretical lens and a practical tool for task-adaptive inductive bias, suggesting a path toward more data-efficient sequence modeling.
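To make the power-spectrum matching idea concrete, here is a minimal, hypothetical sketch for a diagonal discrete-time LTI SSM, x_{k+1} = diag(a) x_k + b u_k, y_k = c^T x_k. It is not the paper's implementation: the fixed pole grid, the Welch-periodogram estimate of the task spectrum, and the nonnegative least-squares fit of per-mode gains (the helpers `task_power_spectrum` and `tdi_init`) are illustrative assumptions about how one could align the model's initial frequency response with the task's spectrum before training.

```python
# Hypothetical sketch of Task-Dependent Initialization (TDI) via power-spectrum
# matching for a diagonal LTI SSM. Not the paper's procedure; an illustration only.
import numpy as np
from scipy.optimize import nnls
from scipy.signal import welch


def task_power_spectrum(sequences, nperseg=512):
    """Average Welch periodogram over the training sequences.

    Assumes every sequence has at least `nperseg` samples (an assumption of
    this sketch, not a requirement stated in the paper)."""
    psds = []
    for seq in sequences:
        freqs, pxx = welch(seq, nperseg=nperseg)
        psds.append(pxx)
    return freqs, np.mean(psds, axis=0)


def tdi_init(sequences, poles):
    """Choose nonnegative per-mode gains g_j so that the SSM's power spectrum,
    approximated as sum_j g_j^2 / |e^{i w} - a_j|^2, matches the task spectrum."""
    freqs, target = task_power_spectrum(sequences)
    omega = 2.0 * np.pi * freqs  # normalized angular frequencies in [0, pi]
    # One column per mode: |1 / (e^{i w} - a_j)|^2 on the frequency grid.
    modes = np.abs(1.0 / (np.exp(1j * omega)[:, None] - poles[None, :])) ** 2
    gains_sq, _ = nnls(modes, target)  # nonnegative least-squares spectral match
    return np.sqrt(gains_sq)  # gains used to scale the SSM input/output weights


# Toy usage: low-frequency-dominated task data and a grid of stable real poles.
rng = np.random.default_rng(0)
data = [0.05 * np.cumsum(rng.standard_normal(1024)) for _ in range(8)]
poles = np.linspace(0.5, 0.99, 16)  # hypothetical fixed pole grid inside the unit disc
print(tdi_init(data, poles))
```

The design choice here is that matching squared magnitudes is linear in the squared gains when the modes are treated independently, so a nonnegative least-squares solve gives a fast, closed-form-style initialization; the paper's actual TDI procedure may differ in how poles and gains are chosen.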
