Author name: Arsenios Scrivens

Information-Theoretic Limits of Safety Verification for Self-Improving Systems

Arsenios Scrivens / April 3, 2026

arXiv:2603.28650v2 Announce Type: replace
Abstract: Can a safety gate permit unbounded beneficial self-modification while maintaining bounded cumulative risk? We formalize this question through dual conditions, requiring $\sum_n \delta_n < \infty$ (boun…

cs.AI, cs.LG, stat.ML

Empirical Validation of the Classification-Verification Dichotomy for AI Safety Gates

Arsenios Scrivens / April 2, 2026

arXiv:2604.00072v1 Announce Type: cross
Abstract: Can classifier-based safety gates maintain reliable oversight as AI systems improve over hundreds of iterations? We provide comprehensive empirical evidence that they cannot. On a self-improving neural…

cs.AI, cs.LG, stat.ML

Information-Theoretic Limits of Safety Verification for Self-Improving Systems

Arsenios Scrivens / March 31, 2026

arXiv:2603.28650v1 Announce Type: new
Abstract: Can a safety gate permit unbounded beneficial self-modification while maintaining bounded cumulative risk? We formalize this question through dual conditions, requiring $\sum_n \delta_n < \infty$ (bounded …