Adversarial Causal Tuning for Realistic Time-series Generation
arXiv:2506.02084v2 Announce Type: replace-cross
Abstract: We address the problem of generating simulated, yet realistic, time-series data from a causal model with the same observational and interventional distributions as a given real dataset (probabilistic causal digital twin). While non-causal models (e.g., GANs) also strive to simulate realistic data, causal models are fundamentally more powerful, able to simulate the effect of interventions (what-if scenarios), optimize decisions, perform root-cause analysis, and counterfactual causal reasoning. We introduce the Adversarial Causal Tuning (ACT) methodology, which outputs the optimal causal model that fits the data, along with a quantification of the goodness-of-fit. The returned causal model can then be employed to simulate new data or to perform other causal reasoning tasks. ACT adopts ideas from Generative Adversarial Network training and AutoML to search for optimal causal pipelines and discriminators that detect deviations between the distributions of real and simulated data. It also adapts a permutation testing procedure from established causal tuning methods to penalize models for complexity. Through extensive experiments on real, semi-synthetic, and synthetic datasets, we show that (a) employing multiple optimized discriminators is paramount for selecting the optimal causal models and quantifying goodness-of-fit, (b) ACT selects the optimal causal model in synthetic datasets while avoiding overfitting, generating data indistinguishable from the true data distribution (c) all state-of-the-art generative and causal simulation methods, exhibit room for improvement in reproducing real data distributions; generating realistic temporal data is still an open research challenge.