Gemma 4 actually running usably on an Android phone (not llama.cpp)

r/artificial

I wanted a real local assistant on my phone, not a [...]. I first tried the usual llama.cpp in Termux: Gemma 4 ran at 2-3 tok/s and the phone was on fire. Then I switched to Google’s LiteRT setup, got Gemma 4 running smoothly, and wired it into an agent stack running in Termux.

Now one Android phone is:

- running the LLM locally
- automating its own apps via ADB
- staying offline if I want

Happy to share details + code and hear what else you’d build on top of this.

submitted by /u/GeeekyMD
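For anyone wondering what "automating its own apps via ADB" could look like in practice, here is a minimal sketch, not the code from this post. It assumes `adb` is on the PATH inside Termux and already connected to the device, and that the model emits simple action dictionaries. The action schema (`tap`, `type`, `launch`) and all function names are hypothetical; only the underlying `adb shell input tap`, `adb shell input text`, and `adb shell am start -n` commands are standard.

```python
import subprocess

ADB = "adb"  # assumption: adb is on PATH and a device is already connected


def tap_cmd(x: int, y: int) -> list[str]:
    """Build the adb command that taps the screen at (x, y)."""
    return [ADB, "shell", "input", "tap", str(x), str(y)]


def text_cmd(text: str) -> list[str]:
    """Build the adb command that types text into the focused field.

    `input text` reads %s as a space. Only spaces are handled here;
    other shell-special characters would need extra escaping.
    """
    return [ADB, "shell", "input", "text", text.replace(" ", "%s")]


def launch_cmd(component: str) -> list[str]:
    """Build the adb command that starts an activity, e.g. 'pkg/.Activity'."""
    return [ADB, "shell", "am", "start", "-n", component]


def to_command(action: dict) -> list[str]:
    """Map a (hypothetical) model-emitted action dict to an adb command."""
    kind = action.get("action")
    if kind == "tap":
        return tap_cmd(action["x"], action["y"])
    if kind == "type":
        return text_cmd(action["text"])
    if kind == "launch":
        return launch_cmd(action["component"])
    raise ValueError(f"unsupported action: {kind!r}")


def run(action: dict) -> None:
    """Execute one action on the device; raises if adb exits non-zero."""
    subprocess.run(to_command(action), check=True)


if __name__ == "__main__":
    # Dry run: print the command instead of touching a device.
    print(to_command({"action": "tap", "x": 540, "y": 1200}))
```

Keeping command construction separate from execution means the agent's output can be validated (and unknown actions rejected) before anything runs on the device.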