AI RESEARCH
Uncertainty Makes It Stable: Curiosity-Driven Quantized Mixture-of-Experts
arXiv cs.LG
arXiv:2511.11743v3 Announce Type: replace
Deploying deep neural networks on resource-constrained devices faces two critical challenges: maintaining accuracy under aggressive quantization while ensuring predictable inference latency. We present a curiosity-driven quantized Mixture-of-Experts framework that addresses both through Bayesian epistemic uncertainty-based routing across heterogeneous experts (BitNet ternary, 1-16 bit BitLinear, post-