Dual-R-DETR: Resolving Query Competition with Pairwise Routing in Transformer Decoders

arXiv:2512.13876v2 Announce Type: replace

Abstract: Detection Transformers (DETR) formulate object detection as a set prediction problem and enable end-to-end training without post-processing. However, object queries in DETR interact through symmetric self-attention, which enforces uniform competition among all query pairs. This often leads to inefficient query dynamics, where multiple queries converge on the same object while others fail to explore alternative regions. We propose Dual-R-DETR, a competition-aware DETR framework that explicitly regulates query interactions via pairwise routing in transformer decoders. Dual-R-DETR distinguishes query-to-query relations as either competitive or cooperative based on appearance similarity, prediction confidence, and spatial geometry. It introduces two complementary routing behaviors: suppressor routing, which attenuates interactions among queries targeting the same object, and delegator routing, which encourages diversification across distinct regions. These behaviors are realized through lightweight, learnable low-rank biases injected into decoder self-attention, enabling asymmetric query interactions while preserving the standard attention formulation. To ensure inference efficiency, routing biases are applied only during training using a dual-branch strategy, and inference reverts to vanilla self-attention with no additional computational cost. Extensive experiments on COCO and Cityscapes demonstrate that Dual-R-DETR consistently improves multiple DETR variants, outperforming DINO by 1.7% mAP with a ResNet-50 backbone and achieving 57.6% mAP with Swin-L under comparable settings. Code is available at https://github.com/YZk67/Dual-R-DETR.
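The core mechanism, a learnable low-rank bias added to the decoder's self-attention logits during training only, can be sketched as follows. This is a minimal NumPy illustration under our own assumptions, not the authors' implementation: the function name, the single-head setup, and the factor matrices `U` and `W` are hypothetical stand-ins for the paper's suppressor/delegator routing biases.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_routing_bias(Q, K, V, U, W, training=True):
    """Single-head self-attention over N object queries.

    Q, K, V: (N, d) query/key/value matrices.
    U, W:    (N, r) low-rank factors (r << N); their product
             B = U @ W.T is a generally asymmetric (N, N) bias,
             a hypothetical stand-in for the learned routing biases.
    During training the bias reshapes pairwise query interaction;
    at inference it is dropped, reverting to vanilla attention
    with no extra cost, as the paper's dual-branch strategy does.
    """
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)      # symmetric-form attention scores
    if training:
        logits = logits + U @ W.T      # asymmetric low-rank routing bias
    return softmax(logits, axis=-1) @ V
```

Because `U @ W.T` is not symmetric in general, query i can suppress query j without j suppressing i, which is exactly the asymmetry that plain dot-product self-attention cannot express.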
