da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs
arXiv:2507.04535v2
Abstract: Neural networks with latency requirements on the order of microseconds, such as those used at the CERN Large Hadron Collider, are typically deployed on FPGAs as fully unrolled and pipelined designs. A bo…