I tried to benchmark TurboQuant on Android (Snapdragon 7s Gen 3) — here's what actually happened
r/LocalLLaMA
Building a sovereign Android dev stack from a single phone. No PC. Termux-native.

When TurboQuant dropped last week, I immediately wanted to know: does this work on ARM, CPU-only? Nobody had tested it on mobile hardware.

My setup:

- Xiaomi Redmi Note 14 Pro+ 5G
- Snapdragon 7s Gen 3 (ARMv8-A, 8GB RAM)
- Termux native, Android 16
- No GPU offload (Adreno 730 rejects Qwen3.5 Hybrid Linear Attention kernels)

What I did:

Built the Aaryan-Kapoor turboquant-tq3_0 branch via a GitHub Actions cross-compile (can't build on-device: 8GB RAM, -j2 max). Flags: -march=armv8-a+dotprod+i8mm, CPU-only, no.