Kirill Nagaitsev, Luka Grbcic, Samuel Williams, Costin Iancu

Optimizing PyTorch Inference with LLM-Based Multi-Agent Systems

Kirill Nagaitsev, Luka Grbcic, Samuel Williams, Costin Iancu / May 15, 2026

arXiv:2511.16964v2 Announce Type: replace-cross
Abstract: Maximizing performance on available GPU hardware is an ongoing challenge for modern AI inference systems. Traditional approaches include writing custom GPU kernels and using specialized model c…
