Do Transformers Use their Depth Adaptively? Evidence from a Relational Reasoning Task
arXiv:2604.12426v1 Announce Type: new
Abstract: We investigate whether transformers use their depth adaptively across tasks of increasing difficulty. Using a controlled multi-hop relational reasoning task based on family stories, where difficulty is d…
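To make "multi-hop relational reasoning over family stories" concrete, here is a minimal sketch of what such a query could look like. It is not the paper's dataset or method: the abstract is truncated, so the paper's own definition of difficulty is not reproduced here. The names `COMPOSE`, `resolve`, and the example people are hypothetical, and the composition table covers only the few relations used in the example.

```python
# Illustrative sketch only: a toy family story as a chain of facts.
# Each fact (a, rel, b) reads "a is the <rel> of b".
# COMPOSE is a hypothetical, deliberately tiny composition table:
# (rel of A to B, rel of B to C) -> rel of A to C.
COMPOSE = {
    ("mother", "mother"): "grandmother",
    ("mother", "father"): "grandmother",
    ("father", "mother"): "grandfather",
    ("father", "father"): "grandfather",
    ("sister", "mother"): "aunt",
    ("sister", "father"): "aunt",
    ("aunt", "mother"): "great-aunt",
    ("aunt", "father"): "great-aunt",
}


def resolve(story):
    """Fold a chain of facts into one relation between the first and last person."""
    head, relation, tail = story[0]
    for subject, rel, obj in story[1:]:
        if subject != tail:
            raise ValueError("facts must form a chain: each subject must match the previous object")
        relation = COMPOSE[(relation, rel)]  # one composition per extra hop
        tail = obj
    return head, relation, tail


two_hop = [("Alice", "mother", "Bob"), ("Bob", "father", "Carol")]
three_hop = [("Dana", "sister", "Alice")] + two_hop

print(len(two_hop), resolve(two_hop))
print(len(three_hop), resolve(three_hop))
```

In this toy setting, answering a query over a longer chain requires more composition steps, which is the kind of structure a controlled multi-hop task can vary.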