Intel B70: llama.cpp SYCL vs llama.cpp OpenVINO vs LLM-Scaler

April 26, 2026 at 9:25 PM

In case anyone is interested, I decided to test llama.cpp's new OpenVINO backend to see how it compares on Intel GPUs. At first glance, it stomps all over the previous best case, SYCL, but lags behind LLM-Scaler (Intel's vLLM fork), likely just due to the hardware optimizations for GPTQ/INT4. Interestingly, tg512 was fastest on SYCL, but in real-world use, prompt processing always seems to be the indicator on this card. As usual with Intel, model selection is... poor. It took a while to
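
To illustrate why prompt processing can matter more than tg512 in practice, here is a rough sketch that estimates request latency as prefill time plus decode time. The function name, the backend labels, and every rate below are hypothetical placeholders chosen for illustration, not measurements from these tests.

```python
def end_to_end_seconds(prompt_tokens, gen_tokens, pp_tok_per_s, tg_tok_per_s):
    """Estimate latency: time to process the prompt plus time to generate output."""
    prefill = prompt_tokens / pp_tok_per_s   # prompt processing (pp) phase
    decode = gen_tokens / tg_tok_per_s       # token generation (tg) phase
    return prefill + decode


# Hypothetical throughput in tokens/second, NOT results from this post.
backends = {
    "fast-decode": {"pp": 500.0, "tg": 40.0},    # wins a tg-only benchmark
    "fast-prefill": {"pp": 2000.0, "tg": 32.0},  # wins a pp-only benchmark
}

# Assumed workload: a long 16k-token prompt with a 512-token reply.
results = {
    name: end_to_end_seconds(16000, 512, r["pp"], r["tg"])
    for name, r in backends.items()
}

for name, seconds in results.items():
    print(f"{name}: {seconds:.1f} s")
```

With a long prompt, the prefill term dominates the total, so under these assumed rates the backend with the slower tg figure still finishes the request sooner. With a short prompt and a long reply the ranking could flip, which is why both numbers are worth reporting.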
