/u/Simple_Library_2700

Acceptable prompt processing speed for you?

/u/Simple_Library_2700 / April 19, 2026

I am currently optimising some ancient hardware (4x V100s) to run Qwen3, but the lack of flash attention means that processing really starts to slow down at longer contexts. For agentic coding work, what processing speeds and context lengths d…
