Litespark Inference on Consumer CPUs: Custom SIMD Kernels for Ternary Neural Networks

arXiv:2605.06485v1 Announce Type: new Large language models (LLMs) have transformed artificial intelligence, but their computational requirements remain prohibitive for most users. Standard inference demands expensive datacenter GPUs or cloud API access, leaving over one billion personal computers underutilized for AI workloads. Ternary models offer a path forward: their weights are constrained to {-1, 0, +1}, theoretically eliminating the need for floating-point multiplication.
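
To illustrate why ternary weights remove the multiplication, here is a minimal scalar sketch in Python. It is not the paper's SIMD kernel. The function names `ternary_dot` and `ternary_matvec` are hypothetical, chosen only for this example. Each weight either adds the activation, subtracts it, or skips it, so the inner loop needs only additions and subtractions.

```python
from typing import List, Sequence


def ternary_dot(weights: Sequence[int], activations: Sequence[float]) -> float:
    """Dot product with weights in {-1, 0, +1}, using no multiplication.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    acc = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            acc += x          # +1 weight: add the activation
        elif w == -1:
            acc -= x          # -1 weight: subtract the activation
        elif w != 0:
            raise ValueError(f"non-ternary weight: {w}")
        # 0 weight: contributes nothing, so the element is skipped
    return acc


def ternary_matvec(rows: Sequence[Sequence[int]],
                   activations: Sequence[float]) -> List[float]:
    """Apply a ternary weight matrix (given as a list of rows) to a vector."""
    return [ternary_dot(row, activations) for row in rows]


if __name__ == "__main__":
    W = [[1, 0, -1, 1],
         [0, 0, 0, 0],
         [-1, -1, 1, 0]]
    x = [2.0, 5.0, 3.0, 0.5]
    print(ternary_matvec(W, x))
```

A production kernel would presumably pack the weights into a few bits each and process many elements per instruction, but the add, subtract, or skip structure stays the same.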