LocalLLaMA

Intel Arc Pro B70 32GB performance on Qwen3.5-27B@Q4

I posted on r/IntelArc when I first got the GPU. I did not have vLLM working at the time, so I had no real use-case numbers. After many nights fighting with vLLM, I finally got it to work. Here is a summary. Both llama.cpp and llm-scaler-vllm …