Stop benchmarking inference providers: a guide to easy evaluation
Hey! Nathan from Hugging Face here. I maintained the Open LLM Leaderboard, and in that time I’ve evaluated around 10k models. I think there’s a pretty big misconception in how people benchmark LLMs. Most setups I see rely on inference providers like Ope…