Diffusion LLM – Inception Mercury 2 – 11,000 tokens per second on NVIDIA H100 GPUs.
https://podcasts.apple.com/au/podcast/the-race-to-production-grade-diffusion-llms-with/id1116303051?i=1000757597310
https://twimlai.com/podcast/twimlai/race-production-grade-diffusion-llms
https://www.inceptionlabs.ai/
submitted by /u/Rev…