llama.cpp build b8338 adds OpenVINO backend + NPU support for prefill + kvcache

r/LocalLLaMA

Lots of work done by the Intel team. I'm looking forward to trying this out on the 255H with the Arc 140T iGPU.

Submitted by /u/stormy1one