THEIA: Learning Complete Kleene Three-Valued Logic in a Pure-Neural Modular Architecture
arXiv:2604.11284v3 Announce Type: replace
Abstract: We present THEIA, a 2.75M-parameter modular neural architecture that learns complete Kleene three-valued logic (K3) from task data, without external symbolic inference or hand-encoded K3 gate primitives. THEIA reaches >99% per-rule accuracy on all 39 K3 rules across 5 seeds. K3 learnability itself is not the central finding: Transformer baselines also reach >99% on all 39 rules, and flat MLPs match THEIA's Phase-1 accuracy to within 0.04pp. The central findings are two properties of the learned system.
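(For reference, the K3 semantics at issue are the standard strong Kleene connectives: with the values ordered F < U < T, AND is min, OR is max, and NOT reverses the order. The sketch below is an illustrative encoding, not the paper's rule set or data format.)

```python
# Reference semantics for strong Kleene three-valued logic (K3).
# Values ordered F < U < T; AND is min, OR is max, NOT reverses the order.
F, U, T = 0, 1, 2  # illustrative encoding, not the paper's

def k3_not(a): return 2 - a
def k3_and(a, b): return min(a, b)
def k3_or(a, b): return max(a, b)
def k3_implies(a, b): return max(2 - a, b)  # a -> b := NOT a OR b

# Spot-check the asymmetry the abstract relies on: T absorbs U under OR,
# while F lets U propagate to the output.
assert k3_or(T, U) == T   # absorbent: verdict decided despite an Unknown input
assert k3_or(F, U) == U   # non-absorbent: Unknown propagates
assert k3_and(F, U) == F  # dually, F absorbs U under AND
assert k3_and(T, U) == U
```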
(1) Uncertainty-verdict asymmetric propagation. The network preserves Has-Unknown at every upstream boundary (80.0%/91.1%/90.8%/99.7% across Arith/Order/Set/Logic vs. ~52% majority baseline), while final-verdict decodability at those same boundaries stays at or below a 73.4% U-vs-non-U oracle reference under both linear and 2-hidden-layer MLP probes. Activation patching on non-absorbent T->U configurations flips 4,898/4,898 OR pairs (4,719/4,719 AND pairs) across 5 seeds, ruling out residual-shortcut explanations.
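(The general shape of such a patching experiment, sketched below with hypothetical module names and shapes since the paper's internals are not given here: cache an intermediate activation from a clean run, splice it into a corrupted run at the same layer, and check whether the verdict flips. A non-absorbent T->U flip is one where K3 semantics require the output to change, e.g. F OR T vs. F OR U.)

```python
# Minimal activation-patching sketch (model and layer are assumptions, not
# THEIA's actual modules). Run the model on a clean input and cache one
# intermediate activation; re-run on the corrupted input (T flipped to U at
# a non-absorbent position) with the cached activation patched in.
import torch

def patch_activation(model, layer, clean_x, corrupt_x):
    cache = {}
    def save_hook(module, inputs, output):
        cache["act"] = output.detach()
    def patch_hook(module, inputs, output):
        return cache["act"]  # returning a value replaces this layer's output

    handle = layer.register_forward_hook(save_hook)
    with torch.no_grad():
        model(clean_x)              # cache the clean activation
    handle.remove()

    handle = layer.register_forward_hook(patch_hook)
    with torch.no_grad():
        patched = model(corrupt_x)  # corrupted run with clean activation spliced in
    handle.remove()
    return patched
```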
(2) Reliability spectrum under discretized end-to-end training. On a mod-3 sequential composition task, THEIA trained on 5-step sequences generalizes to 500-step evaluation at 99.96% +/- 0.04% (5 seeds). Under the identical Gumbel-softmax protocol, flat MLPs collapse to chance by 50 steps; a 2x2 ResMLP depth-by-expansion grid reaches >=99% on only 3/20 (config, seed) trials; and a Transformer reaches 99.24% +/- 0.34%. THEIA's cross-seed standard deviation is ~9x tighter than the Transformer's and 40x-520x tighter than the tested ResMLP configs.
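(A minimal sketch of what a discretized Gumbel-softmax protocol of this kind looks like, under my own assumptions about the interface since the abstract does not spell it out: each step emits logits over a small symbol set, discretized with straight-through Gumbel-softmax before being fed to the next step, so the forward pass is hard-categorical while gradients flow through the soft relaxation. Reusing one step module across positions is what lets 5-step training be evaluated at 500 steps.)

```python
# Hypothetical discretized sequential-composition interface; not THEIA's code.
import torch
import torch.nn.functional as F

class StepModule(torch.nn.Module):
    def __init__(self, n_symbols=3, hidden=64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2 * n_symbols, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, n_symbols))
    def forward(self, state, inp):
        # state, inp: one-hot (or straight-through one-hot) symbol vectors
        return self.net(torch.cat([state, inp], dim=-1))

def rollout(step, state, inputs, tau=1.0):
    # The same module is applied at every step; the message between steps is
    # discretized, so errors cannot hide in continuous residue.
    for inp in inputs:
        logits = step(state, inp)
        state = F.gumbel_softmax(logits, tau=tau, hard=True)  # hard fwd, soft bwd
    return state
```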
Auxiliary result: under matched optimizer settings, THEIA reaches 12/12 Kleene coverage 6.5x faster than a parameter-comparable Transformer; under Transformer-standard tuning this narrows to ~3.6x, and a partial control that applies the Transformer recipe to both models yields 4.93x (95% CI [4.40, 5.66]).
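(One common way to obtain a CI on such a speedup ratio is a bootstrap over per-seed steps-to-coverage; this is my assumption about the methodology, not the paper's stated procedure, and the numbers below are placeholders.)

```python
# Bootstrap CI for a speedup ratio (assumed methodology; hypothetical data).
import numpy as np

rng = np.random.default_rng(0)
theia_steps = np.array([1000, 1100, 950, 1050, 1020])         # hypothetical
transformer_steps = np.array([5100, 4900, 5300, 5000, 5200])  # hypothetical

ratios = []
for _ in range(10_000):
    t = rng.choice(theia_steps, size=len(theia_steps), replace=True)
    x = rng.choice(transformer_steps, size=len(transformer_steps), replace=True)
    ratios.append(x.mean() / t.mean())  # speedup = baseline steps / THEIA steps
lo, hi = np.percentile(ratios, [2.5, 97.5])
print(f"speedup ~ {np.mean(ratios):.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```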