Intel B70 with Qwen3.5 35B

By /u/Fmstrat / April 5, 2026

Intel recently released Qwen3.5 support in llm-scaler: https://github.com/intel/llm-scaler/releases/tag/vllm-0.14.0-b8.1

Anyone with a B70 willing to run a llama-benchy benchmark with the settings below on the 35B model?

uvx llama-benchy --base-url $URL --model $MODEL --depth 0 --pp 2048 --tg 512 --concurrency 1 --runs 3 --latency-mode generation --no-cache --save-total-throughput-timeseries
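
If it helps, here is a minimal sketch of setting the two variables before running the command. It assumes the model is served through vLLM's OpenAI-compatible server on its default port 8000. The model name is a placeholder, not a real model id, so swap in whatever name your server was started with.

```shell
# Assumption: vLLM's OpenAI-compatible server is listening on its default port (8000).
# An already-exported URL or MODEL is kept; these are only fallbacks.
URL="${URL:-http://localhost:8000/v1}"

# Placeholder only: replace with the model name your server actually serves.
MODEL="${MODEL:-your-served-model-name}"

export URL MODEL
echo "Benchmarking $MODEL at $URL"
```

With both exported, the llama-benchy command above can be pasted as-is.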
