I've spent the last year independently developing T³, a transformer architecture that augments standard attention with a per-head ecology grounded in Clifford algebra. Wanted to get the empirical record out in public (it's been a lot of fun work).

t3atlas.dev hosts:

- 247 inference traces across 12 architectural lineages and 3 foundation-model substrates (GPT-2, Gemma3, Qwen2.5)
- A documented, stable schema with versioning
- ~990 benchmark measurements with same-data baselines, all run through a single canonical eval harness
- Pareto frontier visualizations per task
- A tier-marked dataset distinguishing canonical results from probable / archival ones

Stand-out numbers: T³ at 124M parameters, trained on ~500M tokens, shows +6 to +10pp over a same-data vanilla GPT-2 124M at ~10× less compute on compositional reasoning benchmarks (HellaSwag, ARC-C, WinoGrande, BoolQ), and is roughly tied on knowledge benchmarks (ARC-E, PIQA). This differential pattern is consistent with the architectural prediction.

This sits at the intersection of geometric algebra transformers (GATr, Versor, CliffordNet), alternative attention architectures (Mamba, RWKV, xLSTM), and mechanistic interpretability infrastructure (SAEBench, Neuronpedia).

Built solo on consumer hardware (slowly and painstakingly 😭). A TMLR submission with co-author Nell Watson is under review. Happy to answer questions about the architecture, methodology, or the consolidation process. It might not be useful to everyone, but from a tooling perspective I think it could help spark ideas.
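For anyone unfamiliar with the geometric-algebra-transformer space this sits in, here's a toy sketch of the general idea of giving an attention head a geometric (wedge/bivector) interaction term on top of the usual dot product. This is *not* the T³ head; the Cl(3,0) chunking, the mixing weight `alpha`, and every name below are illustrative assumptions only.

```python
# Toy sketch: a generic geometric attention term, NOT the T³ mechanism.
# The Cl(3,0) bivector construction and `alpha` are illustrative assumptions.
import torch
import torch.nn.functional as F

def geometric_attention_head(q, k, v, alpha=0.1):
    """Single-head attention with an extra wedge-product interaction.

    q, k, v: (seq, d) tensors with d divisible by 3. The 3-D chunks of
    q and k are treated as vectors; the magnitude of their wedge
    (bivector) product is added to the dot-product logits.
    """
    seq, d = q.shape
    logits = (q @ k.T) * d ** -0.5               # standard scaled dot-product term

    q3 = q.view(seq, -1, 3)                      # (seq, d//3, 3)
    k3 = k.view(seq, -1, 3)
    # Pairwise cross products (the 3-D dual of the wedge product), broadcast
    # over all query/key pairs: shape (seq, seq, d//3, 3).
    wedge = torch.linalg.cross(q3[:, None, :, :], k3[None, :, :, :])
    geo_term = wedge.norm(dim=-1).mean(dim=-1)   # (seq, seq) bivector-magnitude term

    attn = F.softmax(logits + alpha * geo_term, dim=-1)
    return attn @ v

# Minimal usage on random data.
q, k, v = (torch.randn(8, 12) for _ in range(3))
print(geometric_attention_head(q, k, v).shape)   # torch.Size([8, 12])
```

The point of the sketch is just the shape of the idea: the dot-product term sees only grade-0 (scalar) alignment between q and k, while a wedge-style term is sensitive to their relative orientation, which is the kind of extra per-head structure this line of work explores.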