MTP on Strix Halo with llama.cpp (PR #22673)
I saw a post about upcoming MTP support in llama.cpp, so I tried it out on an AI Max 395 with 128GB DDR5 8000. I rebuilt the radv container from https://github.com/kyuz0/amd-strix-halo-toolboxes with that PR: https://github.com/ggml-org/llama.cpp/…