(Rant ;)) Make your benchmarks realistic

Everybody here is posting their optimizations for running different models. That's good, but make these benchmarks realistic: speed is not the only factor in running an LLM effectively.

Context size is key. Agentic, coding, and RAG work all need a large context, so benchmark a round trip over a long session or with a big context already loaded. That is how you get numbers that reflect real-life use.
If you are testing multimodal models, use the multimodal features. Run the benchmark with image processing, for example, so the results mean more in real-world scenarios.
State your specific hardware config. Cards come in different variants.
Benchmark parallel processing too. With agentic work this matters as well.
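
To make the points above concrete, here is a rough sketch of a benchmark matrix that varies both context size and the number of parallel requests, instead of timing one short prompt. Everything in it is an assumption for illustration: `generate` is a hypothetical callable you would wire to your own backend, `fake_generate` is only a placeholder so the script runs, and `make_prompt` approximates tokens by counting words, which will not match a real tokenizer.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical backend signature: takes a prompt, returns the number of
# tokens it generated. Replace with a call to whatever server you run.
GenerateFn = Callable[[str], int]


@dataclass
class Result:
    context_tokens: int
    parallel: int
    wall_seconds: float
    generated_tokens: int

    @property
    def tokens_per_second(self) -> float:
        # Aggregate throughput across all parallel requests in this case.
        if self.wall_seconds <= 0:
            return 0.0
        return self.generated_tokens / self.wall_seconds


def make_prompt(context_tokens: int) -> str:
    # Crude stand-in: one short word per "token". Real tokenizers differ.
    return " ".join(["word"] * context_tokens)


def run_case(generate: GenerateFn, context_tokens: int, parallel: int) -> Result:
    # Send `parallel` identical long prompts at once and time the whole batch.
    prompt = make_prompt(context_tokens)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        counts = list(pool.map(generate, [prompt] * parallel))
    elapsed = time.perf_counter() - start
    return Result(context_tokens, parallel, elapsed, sum(counts))


def run_matrix(
    generate: GenerateFn, context_sizes: List[int], parallel_levels: List[int]
) -> List[Result]:
    # One result per (context size, parallel level) combination.
    return [
        run_case(generate, ctx, par)
        for ctx in context_sizes
        for par in parallel_levels
    ]


def fake_generate(prompt: str) -> int:
    # Placeholder backend so the sketch runs without a model.
    time.sleep(0.001)
    return 16


if __name__ == "__main__":
    for r in run_matrix(fake_generate, [2048, 16384], [1, 4]):
        print(
            f"ctx={r.context_tokens} parallel={r.parallel} "
            f"tok/s={r.tokens_per_second:.1f}"
        )
```

When you post results from something like this, put the full table next to your exact hardware config (card variant, VRAM, driver, backend version), not just the best single number.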

Make your posts more useful for the community!

submitted by /u/AdamLangePL