(Rant ;)) Make your benchmarks realistic
Everybody here is posting their optimizations for running different models – that's good, but make these benchmarks realistic, as speed is not the only factor in running an LLM effectively. Context size is key – with agentic/coding/RAG work you need to have proper ct…