What do you consider the minimum performance (t/s) for local agent workflows?
What is the minimum number of tokens per second you would tolerate for your local agent workflows? I have been trying pi.dev connected to a llama.cpp instance running Qwen3.6-27B-Q6_K_L with a 200K context on an RTX A6000. I get ab…