AI RESEARCH

Uncertainty Makes It Stable: Curiosity-Driven Quantized Mixture-of-Experts

arXiv cs.LG

arXiv:2511.11743v3. Deploying deep neural networks on resource-constrained devices faces two critical challenges: maintaining accuracy under aggressive quantization and ensuring predictable inference latency. We present a curiosity-driven quantized Mixture-of-Experts framework that addresses both through Bayesian epistemic uncertainty-based routing across heterogeneous experts (BitNet ternary, 1-16 bit BitLinear, post-