7MB binary-weight Mamba LLM — zero floating-point at inference, runs in browser
r/LocalLLaMA
57M params, fully binary {-1,+1} weights, state space model (Mamba). At 1 bit per weight that works out to roughly 7MB. The C runtime doesn't include math.h: every operation is integer arithmetic (XNOR, popcount, int16 accumulator for the SSM state). Designed for hardware without an FPU: ESP32, Cortex-M, or anything with ~8MB of memory and a CPU. It also runs in the browser via WASM.

Trained on TinyStories, so it generates children's stories. The point isn't competing with 7B models, it's running AI where nothing else can.

submitted by /u/Quiet-Error-
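
For anyone wondering what "XNOR, popcount, int16 accumulator" looks like in practice, here is a minimal sketch of the general technique. It is not taken from this project's runtime. The function names (`popcount32`, `binary_dot`, `sat_add_i16`), the packing convention (bit 1 = +1, bit 0 = -1, 32 weights per word), and the choice to saturate the int16 accumulator instead of letting it wrap are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count set bits using only integer operations (no math.h, no floats). */
static uint32_t popcount32(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (x * 0x01010101u) >> 24;
}

/* Dot product of two {-1,+1} vectors packed 32 values per word.
 * Assumed encoding: bit 1 means +1, bit 0 means -1.
 * XNOR marks positions where the signs agree (product = +1), so
 * dot = matches - mismatches = 2 * matches - total_bits. */
static int32_t binary_dot(const uint32_t *a, const uint32_t *b, size_t n_words) {
    /* Keep 2 * 32 * n_words inside int32_t range. */
    assert(n_words <= (size_t)(INT32_MAX / 64));
    int32_t matches = 0;
    for (size_t i = 0; i < n_words; i++) {
        matches += (int32_t)popcount32(~(a[i] ^ b[i]));
    }
    return 2 * matches - (int32_t)(n_words * 32);
}

/* Hypothetical int16 state update: add with saturation so the state
 * clamps at the int16 limits instead of wrapping around. */
static int16_t sat_add_i16(int16_t a, int16_t b) {
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}
```

Calling `binary_dot(w, x, n_words)` on two packed vectors gives the same result as multiplying the ±1 values element by element and summing, but costs one XOR, one NOT, and one popcount per 32 weights. On GCC or Clang the hand-written `popcount32` could be swapped for `__builtin_popcount`, which compiles to a single instruction on targets that have one.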