What is your actual local LLM stack right now?
I keep trying new models, but the bigger difference usually comes from the setup around them, not the model itself: backend, frontend, RAG or no RAG, quant choice, GPU offload, context settings, prompt format, whatever janky glue holds it together. A lot of lo…
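For concreteness, here is one way to write down the knobs I mean as a single config. This is just an illustrative sketch, not my actual stack: the names are hypothetical, loosely modeled on common llama.cpp-style parameters, and the values are placeholders.

```python
# Illustrative config for the "setup around the model" knobs.
# All names and values are hypothetical examples, not a recommendation.
stack = {
    "backend": "llama.cpp",    # inference engine serving the model
    "frontend": "open-webui",  # chat UI talking to the backend
    "rag": False,              # retrieval layer enabled or not
    "quant": "Q4_K_M",         # quantization baked into the model file
    "n_gpu_layers": 32,        # GPU offload: layers pushed to VRAM
    "n_ctx": 8192,             # context window setting
    "chat_format": "chatml",   # prompt template the model expects
}

# Swapping any one of these can change output quality more than
# swapping the model checkpoint itself.
print(sorted(stack))
```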