Built a fully offline suitcase robot around a Jetson Orin NX SUPER 16GB. Gemma 4 E4B, ~200ms cached TTFT, 30+ sensors, no WiFi/BT/cellular. He has opinions.

r/LocalLLaMA
Generative AI Computer Vision Robotics Open Source AI

Sparky runs entirely on the Jetson. Gemma 4 E4B at Q4_K_M via llama.cpp with q8_0 KV cache and flash attention. 12K context, native system role, sampler defaults from the model card. Cached TTFT around 200ms, sustained 14-15 tok/s. SenseVoiceSmall for STT, Piper for TTS with 43Hz mouth sync, PixiJS face on the lid display. Vision and OCR are native to Gemma 4 now so the BLIP subprocess is gone. 30+ sensors fold into the prompt as natural language every turn. One of the biggest wins was prompt structure for cache stability.