Share your speculative decoding settings for llama.cpp and Gemma4
I have totally missed the boat on speculative decoding. Today, while generating some frontend code again, I found myself staring at some quite monotonous JavaScript. I decided to have a go at the speculative decoding settings of llama.cpp…