AI RESEARCH
RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference
arXiv cs.LG
arXiv:2603.17891v1 Announce Type: new