SPRINT: Robust Model Attribution of Generated Images via Secret Pixel Reconstruction
arXiv:2508.05691v3 Announce Type: replace-cross
Abstract: Detecting the source model of AI-generated images is a growing accountability problem. AI fingerprinting techniques address this by detecting imperceptible patterns in images that are unique to each model, achieving high detection accuracy under ideal conditions. However, recent research has shown that image fingerprints are extremely brittle to adaptive attacks, in which knowledge of the technique is exploited to perturb the fingerprints and evade detection. We present SPRINT (Secret Pixel Reconstruction fingerprinting), a novel model attribution method specifically designed for robustness to adaptive attacks. Unlike existing fingerprinting methods, which focus on publicly discoverable patterns in the image, SPRINT relies on a secret to define hidden reconstruction targets, keeping the verification task itself private. As a result, an attacker can no longer see the task the verifier solves at verification time, protecting the information the attacks exploit. Our results show that SPRINT achieves high closed-world accuracy while remaining robust to adaptive attacks: on the FFHQ dataset, SPRINT reaches 99.17% clean accuracy on a diverse 12-model pool and 98.83% on a harder pool of 6 close checkpoints of the same model architecture, while reducing the success rates of adaptive removal and forgery attacks to 1% or below. When the same pool of close checkpoints is treated as an open world, SPRINT maintains high accuracy with an AUROC of 99.30%. These findings show that privatizing the verification task can make adaptive evasion substantially harder while maintaining performance in the clean setting.
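To make the core idea concrete, here is a minimal illustrative sketch (not the paper's implementation) of a secret-keyed verification task: a secret key deterministically selects hidden pixel positions, and attribution is scored by reconstruction error only at those positions. The helper names (`secret_positions`, `reconstruction_score`) and the choice of SHA-256 key derivation are assumptions made for illustration; the actual SPRINT reconstruction targets and scoring are defined in the paper.

```python
# Illustrative sketch of secret-keyed verification, NOT the SPRINT method itself.
# The verifier holds a secret key that determines which pixel positions the
# reconstruction task is scored on; an attacker who cannot see the key does
# not know which positions to perturb.
import hashlib
import numpy as np

def secret_positions(key: bytes, image_shape: tuple, n_targets: int) -> np.ndarray:
    """Derive hidden target pixel coordinates from a secret key (hypothetical helper)."""
    # Hash the key into a PRNG seed so the same key always yields the same targets.
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    h, w = image_shape
    flat = rng.choice(h * w, size=n_targets, replace=False)
    # Convert flat indices to (row, col) coordinate pairs.
    return np.stack(np.unravel_index(flat, (h, w)), axis=1)

def reconstruction_score(image: np.ndarray, predicted: np.ndarray,
                         key: bytes, n_targets: int = 64) -> float:
    """Mean squared reconstruction error evaluated only at the secret positions."""
    pos = secret_positions(key, image.shape, n_targets)
    diff = image[pos[:, 0], pos[:, 1]] - predicted[pos[:, 0], pos[:, 1]]
    return float(np.mean(diff ** 2))
```

In this toy setup, a low score at the secret positions would indicate that the candidate model's reconstruction matches the image there; because the positions are key-dependent, an adaptive attacker cannot directly target the scored task.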