Efficient Preimage Approximation for Neural Network Certification

arXiv:2505.22798v3

Abstract: The growing reliance on artificial intelligence in safety- and security-critical applications is raising concerns about the robustness of neural networks to erroneous or adversarial input. Certification is a methodology for ensuring model trustworthiness by providing formal guarantees on model behaviour. While most verification methods focus on worst-case analysis by bounding the network output, an alternative approach based on approximating the preimage can complement such analysis by estimating the proportion of inputs that satisfy a given specification. However, existing preimage-based methods, such as the state-of-the-art PREMAP, are limited to fully connected neural networks of moderate dimensionality. In this paper, we introduce PREMAP2, a collection of algorithmic extensions to PREMAP that enhance its scalability and efficiency through improved branching heuristics, adaptive Monte Carlo sampling, and reverse bound propagation. We further endow PREMAP2 with additional functionality, such as support for non-uniform priors and confidence intervals. These advances enable the application of PREMAP2 to previously intractable settings, including real-world patch attacks against convolutional neural networks, where adversarial stickers or lighting conditions obscure parts of images. We showcase the effectiveness of our approach across several use cases, including certifying reliability, robustness, interpretability, and fairness, on domains ranging from computer vision to control tasks. Our implementation is available as open-source software.
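To make the core idea concrete: estimating the proportion of inputs that satisfy a specification can be sketched, in its simplest form, as Monte Carlo sampling over the input region with a confidence interval on the hit rate. This is a minimal illustration only, not the PREMAP2 algorithm (which uses symbolic bound propagation and branching); the tiny network, its weights, and the function names here are hypothetical.

```python
import math
import random

def relu(xs):
    return [max(0.0, v) for v in xs]

def tiny_net(x):
    # Hypothetical 2-2-1 fully connected ReLU network with fixed weights,
    # standing in for the model under certification.
    h = relu([0.8 * x[0] - 0.5 * x[1] + 0.1,
              -0.3 * x[0] + 0.9 * x[1]])
    return 1.2 * h[0] - 0.7 * h[1] + 0.05

def estimate_preimage_fraction(spec, lo, hi, dim=2, n=20000, z=1.96, seed=0):
    """Estimate the fraction of the input box [lo, hi]^dim whose network
    outputs satisfy `spec`, with a normal-approximation confidence interval."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = [rng.uniform(lo, hi) for _ in range(dim)]
        if spec(tiny_net(x)):
            hits += 1
    p = hits / n
    half = z * math.sqrt(p * (1.0 - p) / n)  # 95% CI half-width for z=1.96
    return p, (max(0.0, p - half), min(1.0, p + half))

# Example: what fraction of the box [-1, 1]^2 maps to a non-negative output?
p, (ci_lo, ci_hi) = estimate_preimage_fraction(lambda y: y >= 0.0, -1.0, 1.0)
```

A preimage *approximation* method replaces this brute-force sampling with a symbolic under- or over-approximation of the same set, so the estimated fraction comes with formal guarantees rather than only statistical ones.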
