FlashDrive: Flash Vision-Language-Action Inference For Autonomous Driving

April 2026

tl;dr: Accelerate VLA model inference in a streaming fashion.

Overall impression

The paper breaks the inference pipeline down into four stages: vision encoder, prefill, decode, and trajectory decode.
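To make the four-stage breakdown concrete, here is a minimal sketch of how per-stage latency could be measured. It is not the paper's code: the stage functions are hypothetical placeholders that only pass toy data along, and `profile_pipeline` and `latency_shares` are names invented for this illustration. Only the four stage names come from the note above.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# Stage names follow the four-part breakdown described in the note.
STAGES = ["vision_encoder", "prefill", "decode", "trajectory_decode"]


# Hypothetical placeholder stages: they stand in for real model components
# and only transform toy data so the profiler has something to run.
def vision_encoder(frame: List[int]) -> List[float]:
    return [float(p) for p in frame]  # pretend pixels become visual tokens


def prefill(tokens: List[float]) -> Dict[str, List[float]]:
    return {"kv_cache": tokens}  # pretend the prompt is processed in one pass


def decode(state: Dict[str, List[float]]) -> List[float]:
    return state["kv_cache"][:2]  # pretend a few output tokens are generated


def trajectory_decode(tokens: List[float]) -> List[Tuple[float, float]]:
    return [(t, t) for t in tokens]  # pretend tokens map to (x, y) waypoints


PLACEHOLDER_STAGES: List[Tuple[str, Callable[[Any], Any]]] = [
    ("vision_encoder", vision_encoder),
    ("prefill", prefill),
    ("decode", decode),
    ("trajectory_decode", trajectory_decode),
]


def profile_pipeline(
    stages: List[Tuple[str, Callable[[Any], Any]]], x: Any
) -> Tuple[Any, Dict[str, float]]:
    """Run stages in order, feeding each output to the next stage.

    Returns the final output and the wall-clock seconds spent per stage.
    """
    timings: Dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        x = fn(x)
        timings[name] = time.perf_counter() - start
    return x, timings


def latency_shares(timings: Dict[str, float]) -> Dict[str, float]:
    """Convert per-stage seconds into fractions of the total latency."""
    total = sum(timings.values())
    return {k: (v / total if total > 0 else 0.0) for k, v in timings.items()}


if __name__ == "__main__":
    trajectory, timings = profile_pipeline(PLACEHOLDER_STAGES, [1, 2, 3])
    print(trajectory)
    print(latency_shares(timings))
```

With real model components on a GPU, wall-clock timing like this would also need a device synchronization before each timestamp; the sketch omits that because the placeholders run on the CPU.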

Key ideas

Technical details

Notes