How fast is 10 tokens per second really?
Neat little HTML app by Mike Veerman (source code here) which simulates LLM token output speeds from 5/second to 800/second. Useful if you see a model advertised as "30 tokens/second" and want to get a feel for what that actually looks like.
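For a rough feel of the arithmetic without opening the app, here is a minimal Python sketch of the same idea. It is not Mike Veerman's implementation: the function names `seconds_for` and `stream` are made up for illustration, and it treats whitespace-separated words as stand-ins for tokens, which is an assumption (real tokenizers often split text into pieces shorter than a word).

```python
import sys
import time


def seconds_for(num_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock seconds needed to emit num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def stream(text: str, tokens_per_second: float, out=sys.stdout, sleep=time.sleep) -> int:
    """Print text one pseudo-token at a time at the given rate.

    Assumption: each whitespace-separated word counts as one token.
    Returns the number of pseudo-tokens written.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    delay = 1.0 / tokens_per_second  # pause between consecutive tokens
    words = text.split()
    for word in words:
        out.write(word + " ")
        out.flush()  # flush so each token appears immediately
        sleep(delay)
    out.write("\n")
    return len(words)
```

Calling `stream("some long paragraph of text", 30)` in a terminal prints at roughly 30 words per second, and `seconds_for(300, 30)` gives the 10 seconds a 300-token reply would take at that rate.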
Via Hacker News
Tags: ai, generative-ai, llms