i put a 0.5B LLM on a Miyoo A30 handheld. it runs entirely on-device, no internet.

SpruceChat runs Qwen2.5-0.5B on handheld gaming devices using llama.cpp. no cloud, no wifi needed. the model lives in RAM after first boot and tokens stream in one by one.

runs on: Miyoo A30, Miyoo Flip, Trimui Brick, Trimui Smart Pro

performance on the A30 (Cortex-A7, quad-core):
- model load: ~60s first boot
- generation: ~1-2 tokens/sec
- prompt eval: ~3 tokens/sec

it's not fast but it streams so you watch it think. 64-bit devices are quicker.

the AI has the personality of a spruce tree. patient, unhurried, quietly amazed by everything.
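
for a rough sense of what the A30 numbers mean per message, here's a back-of-the-envelope sketch. it only uses the rates quoted above; the prompt and reply sizes (30 and 40 tokens) are made-up examples, not measurements, and `estimate_seconds` is just a helper name for illustration. it also ignores any system prompt and the one-time model load.

```python
# rough per-message latency from the A30 rates quoted in the post.
# token counts used in the example are hypothetical, not measured.

PROMPT_EVAL_TPS = 3.0        # ~3 tokens/sec prompt eval
GEN_TPS_RANGE = (1.0, 2.0)   # ~1-2 tokens/sec generation (slow, fast)


def estimate_seconds(prompt_tokens, reply_tokens):
    """Return (best_case, worst_case) seconds to finish one reply."""
    prompt_time = prompt_tokens / PROMPT_EVAL_TPS
    slow, fast = GEN_TPS_RANGE
    best = prompt_time + reply_tokens / fast
    worst = prompt_time + reply_tokens / slow
    return best, worst


if __name__ == "__main__":
    prompt_tokens, reply_tokens = 30, 40  # example sizes, assumed
    best, worst = estimate_seconds(prompt_tokens, reply_tokens)
    print(f"wait before first token: ~{prompt_tokens / PROMPT_EVAL_TPS:.0f}s")
    print(f"full reply done in: ~{best:.0f}s to ~{worst:.0f}s")
```

so under those assumptions a short exchange takes somewhere around 30 to 50 seconds end to end, with the first token showing up after about 10. that's why streaming matters here: you get something to read long before the reply is finished.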