DEEP-GAP: Deep-learning Evaluation of Execution Parallelism in GPU Architectural Performance
arXiv:2604.14552v1 Announce Type: cross
Abstract: Modern datacenters increasingly rely on low-power, single-slot inference accelerators to balance performance, energy efficiency, and rack density constraints. The NVIDIA T4 GPU has become widely deploy…