Realtime-VLA FLASH: Speculative Inference Framework for Diffusion-based VLAs

arXiv:2605.13778v1 Announce Type: cross Diffusion-based vision-language-action models (dVLAs) are promising for embodied intelligence but are fundamentally limited in real-time deployment by the high latency of full inference. We propose Realtime-VLA FLASH, a speculative inference framework that eliminates most full inference calls during replanning by