LLMs on Android (Snapdragon 8 Elite) MoE Experience

r/LocalLLaMA

So I bought a phone with a Snapdragon 8 Elite (gen 4) and 24GB of RAM (Honor Magic 7 Pro). My experience has been mixed, but with solid potential. Updates for the Hexagon NPU and OpenCL GPU backends have been rolling in fast, but so far the fastest prompt processing and token generation have mostly been on CPU (I would bet that soon enough either the NPU or the GPU will be faster, or realistically both). CPU has the downside of generating more heat than NPU or GPU inference, but overall it's still the fastest currently.